Spaces: TIGER-Lab / MMEB-Leaderboard

In utils_v2.py, I noticed that ActivityNetQA is categorized under V-MRET. Shouldn't it be categorized under V-QA instead?
In the official Vidore-v2 benchmark, the MIT Biomedical, Economics Macro, and ESG Restaurant Synthetic datasets use the Multilingual versions rather than the English-only ones. Should we align the Vidore-v2 subtasks accordingly?

Thanks for your help!

MINGYISU

TIGER-Lab org 28 days ago

Hi, thank you for pointing out the issue! It was a typo, and we have fixed it.
We have agreed to use the Multilingual version to align with the Vidore-v2 benchmark.
Thank you for your support!

MINGYISU changed pull request status to merged 28 days ago
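
For readers following the thread, here is a minimal sketch of the kind of category fix discussed above. It assumes a plain dictionary from category name to dataset names; `TASK_CATEGORIES` and `category_of` are hypothetical names for illustration only, and the real utils_v2.py may be organized differently. Only ActivityNetQA, V-QA, and V-MRET come from the discussion.

```python
# Hypothetical layout, NOT the actual contents of utils_v2.py.
# Other datasets are omitted; only the entry discussed in this thread is shown.
TASK_CATEGORIES = {
    "V-QA": ["ActivityNetQA"],  # after the fix: listed under V-QA
    "V-MRET": [],               # before the fix, ActivityNetQA was listed here
}


def category_of(dataset, categories=TASK_CATEGORIES):
    """Return the one category a dataset is listed under.

    Raises ValueError if the dataset is missing or listed more than once,
    which would catch a mis-filed or duplicated entry.
    """
    matches = [cat for cat, names in categories.items() if dataset in names]
    if len(matches) != 1:
        raise ValueError(
            f"{dataset!r} is listed under {len(matches)} categories: {matches}"
        )
    return matches[0]


if __name__ == "__main__":
    print(category_of("ActivityNetQA"))
```

A lookup like this could be run as a quick sanity check whenever the category lists change, so a dataset filed under the wrong or a duplicate category is caught early.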
