---
license: mit
datasets:
- KorQuAD/squad_kor_v1
language:
- ko
metrics:
- accuracy
---

# DPR-KO

## 1. Intro
This is a **Korean DPR model (Question Encoder)**.
It was trained with new code that is entirely separate from Facebook's DPR code.
It can be used for semantic search based on dense vectors.
Questions are encoded with the Question Encoder, and passages with the Context Encoder.

- Github: [https://github.com/snumin44/DPR-KO](https://github.com/snumin44/DPR-KO)
- Context Encoder: [https://huggingface.co/snumin44/biencoder-ko-bert-context](https://huggingface.co/snumin44/biencoder-ko-bert-context)

## 2. Experiment settings
- Base model: klue/bert-base
- Dataset: KorQuAD v1
- Wiki dump: kowiki-latest-pages-articles.xml.bz2 (2024/07/23)
- Sentences per chunk: 5
- Total chunks: about 1.6 million
- BM25 weight: 0.3
- 1 A100 GPU

## 3. Performance

|Accuracy (%)|BM25 (w/o DPR-KO)|DPR-KO (w/o BM25)|DPR-KO (with BM25)|
|:---:|:---:|:---:|:---:|
|Top-1|36.25|**48.98**|71.16|
|Top-5|51.61|**71.16**|86.75|
|Top-10|57.34|**77.05**|90.28|
|Top-20|62.40|**82.09**|92.66|
|Top-50|68.46|**87.03**|94.86|
|Top-100|72.48|**90.23**|96.02|

※ The BM25 model was fit on the full text of Korean Wikipedia.
※ See the Github repository for the detailed code.

## Citing
```
```
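
## Example: ranking chunks

The settings above mention a BM25 weight of 0.3, and the table reports Top-k accuracy with and without BM25. The sketch below illustrates one way such a hybrid ranking and a Top-k hit check could work. It is not the repository's implementation: the combination rule (dense score plus `bm25_weight` times the BM25 score), the helper names `rank_chunks` and `top_k_hit`, and the toy vectors are all assumptions made for illustration. In practice the vectors would come from the Question Encoder and the Context Encoder.

```python
from typing import Optional, Sequence, Set


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Inner product between a question vector and a chunk vector."""
    return sum(a * b for a, b in zip(u, v))


def rank_chunks(
    question_vec: Sequence[float],
    chunk_vecs: Sequence[Sequence[float]],
    bm25_scores: Optional[Sequence[float]] = None,
    bm25_weight: float = 0.3,
) -> list:
    """Return chunk indices ordered from highest to lowest score.

    Assumption: the hybrid score is dense + bm25_weight * bm25.
    The actual combination used by DPR-KO may differ.
    """
    scores = [dot(question_vec, c) for c in chunk_vecs]
    if bm25_scores is not None:
        scores = [d + bm25_weight * b for d, b in zip(scores, bm25_scores)]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def top_k_hit(ranking: Sequence[int], gold: Set[int], k: int) -> bool:
    """True if any gold chunk index appears among the first k results."""
    return any(i in gold for i in ranking[:k])


# Toy placeholders standing in for encoder outputs and BM25 scores.
question = [1.0, 0.0]
chunks = [[0.2, 0.9], [0.9, 0.1], [0.8, 0.3]]
bm25 = [0.0, 0.0, 1.0]

dense_ranking = rank_chunks(question, chunks)
hybrid_ranking = rank_chunks(question, chunks, bm25_scores=bm25)

print(dense_ranking, top_k_hit(dense_ranking, {2}, k=1))
print(hybrid_ranking, top_k_hit(hybrid_ranking, {2}, k=1))
```

Averaging `top_k_hit` over all evaluation questions gives a Top-k accuracy of the kind reported in the table. With roughly 1.6 million chunks, an approximate nearest-neighbor index would likely replace the exhaustive loop shown here.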