---
title: MTEB Human Evaluation Demo
emoji: πŸ“Š
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 5.23.3
app_file: app.py
pinned: false
---

# MTEB Human Evaluation Demo

This is a demo of the human evaluation interface for the MTEB (Massive Text Embedding Benchmark) project. It allows annotators to evaluate the relevance of documents for reranking tasks.

## How to use

1. Navigate to the "Demo" tab to try the interface with an example dataset (AskUbuntuDupQuestions)
2. Read the query at the top
3. For each document, assign a rank using the dropdown (1 = most relevant)
4. Submit your rankings
5. Navigate between samples using the Previous/Next buttons
6. Your annotations are saved automatically
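The workflow above collects one ranking per document and persists it for each sample. As a rough illustration (not the Space's actual implementation; the `save_annotation` helper, file name, and record fields are all hypothetical), an annotation could be serialized like this:

```python
import json
from pathlib import Path


def save_annotation(path, sample_id, query, rankings):
    """Append one annotation record to a JSON file.

    `rankings` maps each document id to its assigned rank (1 = most relevant),
    mirroring the dropdown values chosen in the interface.
    """
    p = Path(path)
    records = json.loads(p.read_text()) if p.exists() else []
    records.append({"sample_id": sample_id, "query": query, "rankings": rankings})
    p.write_text(json.dumps(records, indent=2))


# Example: an annotator ranks three candidate documents for one query
save_annotation(
    "annotations.json",
    sample_id=0,
    query="How do I upgrade Ubuntu from the command line?",
    rankings={"doc_0": 1, "doc_1": 3, "doc_2": 2},  # doc_0 judged most relevant
)
```

Appending to a single JSON file per annotator keeps each sample's rankings recoverable when navigating back with the Previous/Next buttons.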

## About MTEB Human Evaluation

This project aims to establish human performance benchmarks for MTEB tasks, helping to understand the realistic "ceiling" for embedding model performance.