---
license: mit
datasets:
- KorQuAD/squad_kor_v1
language:
- ko
metrics:
- accuracy
---

# ๐ŸŠ DPR-KO

## 1. Intro

This is a **Korean DPR model (Question Encoder)**.  
It was trained with new code that is entirely separate from Facebook's DPR code.  
It can be used for semantic search based on dense vectors.  
Questions are encoded with the Question Encoder, and passages with the Context Encoder.

- Github: [https://github.com/snumin44/DPR-KO](https://github.com/snumin44/DPR-KO)
- Context Encoder: [https://huggingface.co/snumin44/biencoder-ko-bert-context](https://huggingface.co/snumin44/biencoder-ko-bert-context)
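
Once both encoders have produced vectors, retrieval reduces to scoring each passage against the question. The sketch below assumes inner-product scoring, as in the original DPR formulation; the repository may score differently. The toy vectors and the helper names `dot` and `rank_passages` are hypothetical stand-ins for real encoder outputs.

```python
from typing import List, Sequence, Tuple


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    return sum(a * b for a, b in zip(u, v))


def rank_passages(
    question_vec: Sequence[float],
    passage_vecs: Sequence[Sequence[float]],
    top_k: int = 5,
) -> List[Tuple[int, float]]:
    """Return (passage index, score) pairs, highest inner product first."""
    scored = [(i, dot(question_vec, p)) for i, p in enumerate(passage_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy 3-dimensional vectors (hypothetical values, not real model outputs).
question_vec = [0.9, 0.1, 0.0]   # would come from the Question Encoder
passage_vecs = [                 # would come from the Context Encoder
    [0.1, 0.8, 0.1],  # passage 0
    [0.8, 0.2, 0.0],  # passage 1
    [0.0, 0.1, 0.9],  # passage 2
]

top = rank_passages(question_vec, passage_vecs, top_k=2)
```

With real embeddings, the passage vectors would normally be precomputed once and stored in a vector index, so that only the question is encoded at query time.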


## 2. Experiment settings

- ๋ฒ ์ด์Šค ๋ชจ๋ธ: klue/bert-base     
- ๋ฐ์ดํ„ฐ ์…‹: KorQuad v1     
- ์œ„ํ‚ค ๋คํ”„: kowiki-latest-pages-articles.xml.bz2 (2024/07/23)     
- ์ฒญํฌ ๋‹น ๋ฌธ์žฅ: 5    
- ์ „์ฒด ์ฒญํฌ: ์•ฝ 160 ๋งŒ     
- BM25 ๊ฐ€์ค‘์น˜: 0.3    
- 1 A100 GPU     
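
The settings give a BM25 weight of 0.3 but not the formula that applies it. One plausible reading, shown here purely as an assumption, is a convex combination of min-max normalized dense and BM25 scores; the repository may combine them differently. The function names `min_max` and `hybrid_scores` are hypothetical.

```python
from typing import List


def min_max(scores: List[float]) -> List[float]:
    """Rescale scores to [0, 1]; a constant list maps to all zeros."""
    if not scores:
        return []
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def hybrid_scores(
    dense: List[float],
    bm25: List[float],
    bm25_weight: float = 0.3,
) -> List[float]:
    """Assumed fusion: (1 - w) * dense + w * bm25 on normalized scores."""
    if len(dense) != len(bm25):
        raise ValueError("score lists must cover the same candidates")
    d, b = min_max(dense), min_max(bm25)
    return [(1.0 - bm25_weight) * x + bm25_weight * y for x, y in zip(d, b)]


# Hypothetical raw scores for three candidate chunks.
dense_scores = [10.0, 12.0, 11.0]
bm25_scores = [5.0, 1.0, 3.0]

fused = hybrid_scores(dense_scores, bm25_scores, bm25_weight=0.3)
best = max(range(len(fused)), key=lambda i: fused[i])
```

Normalizing first keeps the two score scales comparable, so the weight expresses relative importance rather than compensating for different ranges.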

## 3. Performance

|(%)|BM25 (w/o DPR-KO)|DPR-KO (w/o BM25)|DPR-KO (with BM25)|
|:---:|:---:|:---:|:---:|
|Top1 Acc|36.25 |**48.98** |71.16 |
|Top5 Acc|51.61 |**71.16** |86.75 |
|Top10 Acc|57.34 |**77.05** |90.28 |
|Top20 Acc|62.40 |**82.09** |92.66 |
|Top50 Acc|68.46 |**87.03** |94.86 |
|Top100 Acc|72.48 |**90.23** |96.02 |

※ The BM25 model was built on the full text of Korean Wikipedia.  
※ See the GitHub repository for the detailed code.

## Citing
```
```