princeton-nlp's Collections

SWE-bench

updated Mar 8

SWE-bench is a benchmark for evaluating language models and AI systems on their ability to resolve real-world GitHub issues.


  • princeton-nlp/SWE-bench

    Updated Mar 3 • 21.5k rows • 24.8k downloads • 112 likes

  • princeton-nlp/SWE-bench_Lite

    Updated Mar 3 • 323 rows • 35.9k downloads • 37 likes

  • princeton-nlp/SWE-bench_Multimodal

    Updated Jan 13 • 612 rows • 315 downloads • 21 likes

  • princeton-nlp/SWE-bench_Verified

    Updated Feb 18 • 500 rows • 277k downloads • 178 likes
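
As a rough illustration of how the datasets in this collection might be pulled down for evaluation, here is a minimal sketch using the Hugging Face `datasets` library. The repo IDs are copied from the list above; the helper name `load_swe_bench` and the default `"test"` split are assumptions made for this example, not something the collection page states, so check each dataset card for the splits it actually provides.

```python
# Dataset repo IDs exactly as listed in this collection.
SWE_BENCH_DATASETS = [
    "princeton-nlp/SWE-bench",
    "princeton-nlp/SWE-bench_Lite",
    "princeton-nlp/SWE-bench_Multimodal",
    "princeton-nlp/SWE-bench_Verified",
]


def load_swe_bench(repo_id: str, split: str = "test"):
    """Download one dataset from the collection (hypothetical helper).

    Requires network access and the `datasets` package. The default
    split name "test" is an assumption; verify it on the dataset card.
    """
    if repo_id not in SWE_BENCH_DATASETS:
        raise ValueError(f"{repo_id!r} is not part of this collection")
    # Imported lazily so the ID list can be used without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(repo_id, split=split)
```

Calling `load_swe_bench("princeton-nlp/SWE-bench_Lite")` would then fetch the smaller Lite variant, which is a reasonable starting point before running against the full benchmark.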