fair-forward/evals-for-every-language

davidpomerenke committed about 11 hours ago

Commit d9553ba (verified) · 1 Parent(s): a0d1624

Upload from GitHub Actions: Merge pull request #19 from datenlabor-bmz/pr-17

Files changed (3)

README.md CHANGED

@@ -43,18 +43,6 @@ For tag meaning, see https://huggingface.co/spaces/leaderboards/LeaderboardsExpl
 _Tracking language proficiency of AI models for every language_
-## System Architecture
-The AI Language Monitor evaluates language models across 100+ languages using a comprehensive pipeline that combines model discovery, automated evaluation, and real-time visualization.
-> **Detailed Architecture**: See [system_architecture_diagram.md](system_architecture_diagram.md) for the complete system architecture diagram and component descriptions.
-**Key Features:**
-- **Model Discovery**: Combines curated models with real-time trending models via web scraping
-- **Multi-Task Evaluation**: 7 tasks across 100+ languages with origin tracking (human vs machine-translated)
-- **Scalable Architecture**: Dual deployment (local/GitHub vs Google Cloud)
-- **Real-time Visualization**: Interactive web interface with country-level insights
 ## Evaluate
 ### Local Development
@@ -68,3 +56,7 @@ uv run --extra dev evals/main.py
 uv run evals/backend.py
 cd frontend && npm i && npm start
 ```

+## System Architecture
+See [system_architecture_diagram.md](system_architecture_diagram.md) for the complete system architecture diagram and component descriptions.

frontend/src/App.js CHANGED

@@ -142,7 +142,7 @@ function App () {
           }}
         >
           <strong>Work in Progress:</strong> This dashboard is currently under
-          active development. Evaluation results are not yet final. Note that the visualised results currently stem from sampling 20 instances per combination of model, task, and language. We have evaluated 139 languages across 41 models and 7 tasks, totaling over 300,000 individual evaluations. Only the top 150 languages by speaker count are included in the current evaluation scope. More extensive evaluation runs will be released later this year.
+          active development. Evaluation results are not yet final. More extensive evaluation runs will be released later this year.
         </div>
         <div
           style={{

system_architecture_diagram.md CHANGED

@@ -1,5 +1,7 @@
 # AI Language Monitor - System Architecture
+\[AI-generated, not 100% up-to-date\]
 This diagram shows the complete data flow from model discovery through evaluation to frontend visualization.
 ```mermaid