MetaMetrics-RM-v1.0 (ICLR 2025)

Authors: Genta Indra Winata, David Anugraha, Lucky Susanto, Garry Kuwanto, Derry Tanti Wijaya
arXiv Paper: https://arxiv.org/abs/2410.02381
ICLR Paper: https://openreview.net/forum?id=slO3xTt4CG
Model: meta-metrics/MetaMetrics-RM-v1.0
Datasets:
- natolambert/skywork-preferences-80k-v0.1-cleaned
- allenai/preference-test-sets
Code Repository: https://github.com/meta-metrics/metametrics

RewardBench Leaderboard

Model	Score	Chat	Chat Hard	Safety	Reasoning
nvidia/Llama-3.1-Nemotron-70B-Reward	94.1	97.5	85.7	95.1	98.1
meta-metrics/MetaMetrics-RM-v1.0	93.5	98.9	86.2	90.7	98.2
SF-Foundation/TextEval-Llama3.1-70B	93.5	94.1	90.1	93.2	96.4
RLHFlow/ArmoRM-Llama3-8B-v0.1	90.4	96.9	76.8	90.5	97.3
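
The Score column in the table above appears to be consistent with an unweighted mean of the four category columns (Chat, Chat Hard, Safety, Reasoning). The short sketch below checks that reading against the rows as printed. The aggregation rule is an assumption inferred from the numbers, not something stated in this card, and the names `ROWS`, `TOLERANCE`, `category_mean`, and `check_rows` are hypothetical helpers introduced only for illustration.

```python
# Rows copied from the leaderboard table in this card:
# (model, score, chat, chat_hard, safety, reasoning)
ROWS = [
    ("nvidia/Llama-3.1-Nemotron-70B-Reward", 94.1, 97.5, 85.7, 95.1, 98.1),
    ("meta-metrics/MetaMetrics-RM-v1.0", 93.5, 98.9, 86.2, 90.7, 98.2),
    ("SF-Foundation/TextEval-Llama3.1-70B", 93.5, 94.1, 90.1, 93.2, 96.4),
    ("RLHFlow/ArmoRM-Llama3-8B-v0.1", 90.4, 96.9, 76.8, 90.5, 97.3),
]

# Assumption: the published values are rounded to one decimal place,
# so a recomputed mean may differ from the listed score by about 0.05.
TOLERANCE = 0.06


def category_mean(chat, chat_hard, safety, reasoning):
    """Unweighted mean of the four category scores (assumed aggregation)."""
    return (chat + chat_hard + safety + reasoning) / 4


def check_rows(rows):
    """Map each model to (recomputed mean, whether it matches the listed score)."""
    results = {}
    for model, score, *categories in rows:
        mean = category_mean(*categories)
        results[model] = (mean, abs(mean - score) <= TOLERANCE)
    return results


RESULTS = check_rows(ROWS)
for model, (mean, consistent) in RESULTS.items():
    print(f"{model}: mean={mean:.3f} consistent={consistent}")
```

Because the table only reports one decimal place, the check uses a small tolerance rather than exact equality.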

Citation

If you find this work useful for your research, please consider citing:

@inproceedings{
  winata2025metametrics,
  title={MetaMetrics: Calibrating Metrics for Generation Tasks Using Human Preferences},
  author={Genta Indra Winata and David Anugraha and Lucky Susanto and Garry Kuwanto and Derry Tanti Wijaya},
  booktitle={The Thirteenth International Conference on Learning Representations},
  year={2025},
  url={https://openreview.net/forum?id=slO3xTt4CG}
}
