AdnanElAssadi/MTEB-Human-Eval-Demo

AdnanElAssadi committed on Apr 6

Commit 84410cb (verified)

1 Parent(s): 5ec0494

Update README.md

Files changed (1): README.md

@@ -1,27 +1,27 @@
----
-title: MTEB Human Evaluation Demo
-emoji: 📊
-colorFrom: blue
-colorTo: indigo
-sdk: gradio
-sdk_version: 3.50.2
-app_file: app.py
-pinned: false
----
-# MTEB Human Evaluation Demo
-This is a demo of the human evaluation interface for the MTEB (Massive Text Embedding Benchmark) project. It allows annotators to evaluate the relevance of documents for reranking tasks.
-## How to use
-1. Navigate to the "Demo" tab to try the interface with an example dataset (AskUbuntuDupQuestions)
-2. Read the query at the top
-3. For each document, assign a rank using the dropdown (1 = most relevant)
-4. Submit your rankings
-5. Navigate between samples using the Previous/Next buttons
-6. Your annotations are saved automatically
-## About MTEB Human Evaluation
-This project aims to establish human performance benchmarks for MTEB tasks, helping to understand the realistic "ceiling" for embedding model performance.

+---
+title: MTEB Human Evaluation Demo
+emoji: 📊
+colorFrom: blue
+colorTo: indigo
+sdk: gradio
+sdk_version: 5.23.3
+app_file: app.py
+pinned: false
+---
+# MTEB Human Evaluation Demo
+This is a demo of the human evaluation interface for the MTEB (Massive Text Embedding Benchmark) project. It allows annotators to evaluate the relevance of documents for reranking tasks.
+## How to use
+1. Navigate to the "Demo" tab to try the interface with an example dataset (AskUbuntuDupQuestions)
+2. Read the query at the top
+3. For each document, assign a rank using the dropdown (1 = most relevant)
+4. Submit your rankings
+5. Navigate between samples using the Previous/Next buttons
+6. Your annotations are saved automatically
+## About MTEB Human Evaluation
+This project aims to establish human performance benchmarks for MTEB tasks, helping to understand the realistic "ceiling" for embedding model performance.
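
Reviewer note: the "How to use" steps in the README describe assigning each document a rank where 1 is the most relevant. As a minimal sketch of what that step implies, the helpers below check a set of ranks and reorder documents accordingly. This is not taken from the Space's `app.py`; the function names `validate_ranks` and `order_by_rank` are hypothetical, and the requirement that each rank from 1 to N is used exactly once is an assumption about how the interface behaves.

```python
def validate_ranks(ranks):
    """Return True if ranks uses each value from 1 to len(ranks) exactly once.

    Assumption: the interface expects a strict ranking with no ties.
    """
    return sorted(ranks) == list(range(1, len(ranks) + 1))


def order_by_rank(documents, ranks):
    """Return documents ordered from most relevant (rank 1) to least relevant."""
    if len(documents) != len(ranks):
        raise ValueError("each document needs exactly one rank")
    if not validate_ranks(ranks):
        raise ValueError("ranks must use each value from 1 to N exactly once")
    # Ranks are unique at this point, so sorting the pairs never compares documents.
    return [doc for _, doc in sorted(zip(ranks, documents))]
```

For example, `order_by_rank(["a", "b", "c"], [2, 3, 1])` places `"c"` first because it was given rank 1.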