Committed by davidpomerenke
Commit d9553ba · verified · 1 Parent(s): a0d1624

Upload from GitHub Actions: Merge pull request #19 from datenlabor-bmz/pr-17

README.md CHANGED
@@ -43,18 +43,6 @@ For tag meaning, see https://huggingface.co/spaces/leaderboards/LeaderboardsExpl
 
  _Tracking language proficiency of AI models for every language_
 
- ## System Architecture
-
- The AI Language Monitor evaluates language models across 100+ languages using a comprehensive pipeline that combines model discovery, automated evaluation, and real-time visualization.
-
- > **Detailed Architecture**: See [system_architecture_diagram.md](system_architecture_diagram.md) for the complete system architecture diagram and component descriptions.
-
- **Key Features:**
- - **Model Discovery**: Combines curated models with real-time trending models via web scraping
- - **Multi-Task Evaluation**: 7 tasks across 100+ languages with origin tracking (human vs machine-translated)
- - **Scalable Architecture**: Dual deployment (local/GitHub vs Google Cloud)
- - **Real-time Visualization**: Interactive web interface with country-level insights
-
  ## Evaluate
 
  ### Local Development
@@ -68,3 +56,7 @@ uv run --extra dev evals/main.py
  uv run evals/backend.py
  cd frontend && npm i && npm start
  ```
+
+ ## System Architecture
+
+ See [system_architecture_diagram.md](system_architecture_diagram.md) for the complete system architecture diagram and component descriptions.
frontend/src/App.js CHANGED
@@ -142,7 +142,7 @@ function App () {
  }}
  >
  <strong>Work in Progress:</strong> This dashboard is currently under
- active development. Evaluation results are not yet final. Note that the visualised results currently stem from sampling 20 instances per combination of model, task, and language. We have evaluated 139 languages across 41 models and 7 tasks, totaling over 300,000 individual evaluations. Only the top 150 languages by speaker count are included in the current evaluation scope. More extensive evaluation runs will be released later this year.
+ active development. Evaluation results are not yet final. More extensive evaluation runs will be released later this year.
  </div>
  <div
  style={{
system_architecture_diagram.md CHANGED
@@ -1,5 +1,7 @@
  # AI Language Monitor - System Architecture
 
+ \[AI-generated, not 100% up-to-date\]
+
  This diagram shows the complete data flow from model discovery through evaluation to frontend visualization.
 
  ```mermaid