Create Ops-MM-embedding-v1-7B.json
#58 by frozenc - opened
No description provided.
Hello, I have two questions regarding the benchmark:
- In utils_v2.py, I noticed that ActivityNetQA is categorized under V-MRET. Shouldn't it be categorized under V-QA instead?
- In the official Vidore-v2 benchmark, the MIT Biomedical, Economics Macro, and ESG Restaurant Synthetic datasets use the Multilingual version rather than the English-only version. Should we consider aligning the Vidore-v2 subtasks?
Thanks for your help!
Hi, thank you for pointing out the issue! It was a typo and we have fixed it.
We agreed to use the Multilingual version to align with the Vidore-v2 benchmark.
Thank you for your support!
MINGYISU changed pull request status to merged