mfromm committed on
Commit
428eae3
·
verified ·
1 Parent(s): 4c3ce7a

Update index.html

Files changed (1)
  1. index.html +18 -17
index.html CHANGED
@@ -36,6 +36,24 @@
  </p>
  </div>
  </section>
+
+ <section class="section">
+ <div class="container content">
+ <h2 class="title is-3">📊 Results</h2>
+ <ul>
+ <li><strong>✔️ Accuracy:</strong> Spearman’s ρ > 0.87 with human ground truth</li>
+ <li><strong>📈 Downstream LLM Training:</strong>
+ <ul>
+ <li>+7.2% benchmark performance improvement</li>
+ <li>+4.8% token retention vs. FineWeb2 heuristic filter</li>
+ <li>Effective threshold strategies: 0.6 and 0.7 quantile</li>
+ </ul>
+ </li>
+ <li><strong>⚡ Annotation Speed:</strong> ~11,000 docs/min (A100 GPU, avg. 690 tokens)</li>
+ </ul>
+ </div>
+ </section>
+
 
  <section class="section">
  <div class="container content">
@@ -72,23 +90,6 @@
  </div>
  </section>
 
- <section class="section">
- <div class="container content">
- <h2 class="title is-3">📁 Available Artifacts</h2>
- <ul>
- <li>📄 Ground truth annotations in 35 languages</li>
- <li>🧠 Synthetic LLM-annotated dataset (14M+ documents)</li>
- <li>🪶 Lightweight annotation models:
- <ul>
- <li>JQL-Gemma</li>
- <li>JQL-Mistral</li>
- <li>JQL-Llama</li>
- </ul>
- </li>
- <li>🛠️ Training & inference scripts (coming soon)</li>
- </ul>
- </div>
- </section>
 
  <section class="section">
  <div class="container content">