---
license: mit
datasets:
- KorQuAD/squad_kor_v1
language:
- ko
metrics:
- accuracy
---

# DPR-KO

## 1. Intro

This is a **Korean DPR model (Question Encoder)**.
It was trained with new code that is entirely separate from Facebook's DPR code.
It can be used for dense-vector-based semantic search.
Questions are encoded with the Question Encoder, and text passages with the Context Encoder.

- Github: [https://github.com/snumin44/DPR-KO](https://github.com/snumin44/DPR-KO)
- Context Encoder: [https://huggingface.co/snumin44/biencoder-ko-bert-context](https://huggingface.co/snumin44/biencoder-ko-bert-context)

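The two-encoder setup described in the intro can be sketched roughly as follows. This is a minimal illustration, not the repository's own inference code. The question encoder id `snumin44/biencoder-ko-bert-question`, the use of the `[CLS]` hidden state as the embedding, and dot-product scoring are all assumptions; only the context encoder id comes from this card. See the Github repository for the actual usage.

```python
# Assumption: the question encoder id mirrors the context encoder's naming.
QUESTION_ENCODER_ID = "snumin44/biencoder-ko-bert-question"
# From this card's Context Encoder link.
CONTEXT_ENCODER_ID = "snumin44/biencoder-ko-bert-context"


def encode(texts, model_id, max_length=256):
    """Embed a list of strings with a Hugging Face BERT encoder.

    Imports are local so the ranking helper below works without
    torch/transformers installed. Weights are downloaded on first call.
    """
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id).eval()
    batch = tokenizer(texts, padding=True, truncation=True,
                      max_length=max_length, return_tensors="pt")
    with torch.no_grad():
        output = model(**batch)
    # Assumption: the [CLS] token's hidden state is the text embedding.
    return output.last_hidden_state[:, 0].tolist()


def rank_by_dot_product(question_vec, passage_vecs, top_k=5):
    """Return (passage_index, score) pairs, highest dot product first."""
    scores = [sum(q * p for q, p in zip(question_vec, vec))
              for vec in passage_vecs]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(i, scores[i]) for i in order[:top_k]]
```

A typical call would embed one question with `encode([question], QUESTION_ENCODER_ID)[0]`, embed the candidate passages with `encode(passages, CONTEXT_ENCODER_ID)`, and pass both to `rank_by_dot_product`.
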
## 2. Experiment settings

- Base model: klue/bert-base
- Dataset: KorQuAD v1
- Wiki dump: kowiki-latest-pages-articles.xml.bz2 (2024/07/23)
- Sentences per chunk: 5
- Total chunks: about 1.6 million
- BM25 weight: 0.3
- 1 A100 GPU

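Two of the settings above, the 5-sentence chunks and the BM25 weight, can be illustrated with a short sketch. The card does not say whether chunks overlap or how the BM25 and dense scores are combined, so both helpers below are assumptions: `chunk_sentences` uses non-overlapping chunks, and `hybrid_scores` uses a plain weighted sum. The function names are hypothetical and not taken from the repository.

```python
def chunk_sentences(sentences, sentences_per_chunk=5):
    """Group consecutive sentences into non-overlapping chunks.

    Assumption: chunks do not overlap; the last chunk may be shorter.
    """
    return [
        " ".join(sentences[i:i + sentences_per_chunk])
        for i in range(0, len(sentences), sentences_per_chunk)
    ]


def hybrid_scores(dense_scores, bm25_scores, bm25_weight=0.3):
    """Combine per-passage scores as dense + bm25_weight * bm25.

    Assumption: a simple weighted sum; the actual combination rule
    used for the reported results may differ.
    """
    return [d + bm25_weight * b for d, b in zip(dense_scores, bm25_scores)]
```

With this formula, a weight of 0.3 means a passage's BM25 score contributes 30% of its raw value on top of the dense score, so the two score scales need to be comparable for the weight to be meaningful.
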
## 3. Performance

|(%)|BM25 (w/o DPR-KO)|DPR-KO (w/o BM25)|DPR-KO (with BM25)|
|:---:|:---:|:---:|:---:|
|Top1 Acc|36.25|**48.98**|71.16|
|Top5 Acc|51.61|**71.16**|86.75|
|Top10 Acc|57.34|**77.05**|90.28|
|Top20 Acc|62.40|**82.09**|92.66|
|Top50 Acc|68.46|**87.03**|94.86|
|Top100 Acc|72.48|**90.23**|96.02|

※ The BM25 model was fit on the full text of Korean Wikipedia.

※ See the Github repository for the detailed code.

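The Top-k accuracy figures in the table can be read with the following sketch in mind. The card does not define the metric, so this assumes the usual retrieval convention: a question counts as correct at k when at least one of its top-k retrieved passages contains the gold answer. The function name and input format are hypothetical.

```python
def top_k_accuracy(ranked_hits, k):
    """Percentage of questions with an answer-bearing passage in the top k.

    ranked_hits: one list per question, ordered by retrieval rank;
    element i is True when the passage at rank i contains the gold answer.
    """
    if not ranked_hits:
        return 0.0
    correct = sum(1 for hits in ranked_hits if any(hits[:k]))
    return 100.0 * correct / len(ranked_hits)
```

Under this definition the metric can only stay flat or rise as k grows, which matches the monotonic increase in each column of the table.
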
## Citing

```
```