Jerrycool committed (verified)
Commit 047335f · 1 Parent(s): 2fd5333

Update src/about.py

Files changed (1): src/about.py (+7 -5)
src/about.py CHANGED
@@ -31,11 +31,13 @@ INTRODUCTION_TEXT = """
 """
 # ... (rest of your about.py content) ...
 LLM_BENCHMARKS_TEXT = """
-## How Benchmarks Work
-Detailed information about the benchmarks...
+## MLE-Dojo
+MLE-Dojo is a Gym-style framework for systematically training, evaluating, and improving autonomous large language model (LLM) agents in iterative machine learning engineering (MLE) workflows.
+Unlike existing benchmarks that rely primarily on static datasets or single-attempt evaluations, MLE-Dojo provides an interactive environment in which agents iteratively experiment, debug, and refine solutions through structured feedback loops. Built upon 200+ real-world Kaggle challenges (e.g., tabular data analysis, computer vision, natural language processing, and time series forecasting), MLE-Dojo covers diverse, open-ended MLE tasks carefully curated to reflect realistic engineering scenarios such as data processing, architecture search, hyperparameter tuning, and code debugging.
+Its fully executable environment supports comprehensive agent training via both supervised fine-tuning and reinforcement learning, facilitating iterative experimentation, realistic data sampling, and real-time outcome verification.
 
-## Reproducibility
-Commands to reproduce results...
+## New Models
+We actively maintain this as a long-term, real-time leaderboard with updated models and evaluation tasks to foster community-driven innovation.
 """
 
 EVALUATION_QUEUE_TEXT = """
@@ -68,6 +70,6 @@ Make sure you have followed the above steps first.
 If everything is done, check you can launch the EleutherAIHarness on your model locally, using the above command without modifications (you can add `--limit` to limit the number of examples per task).
 """
 
-CITATION_BUTTON_LABEL = "Copy the following snippet to cite these results"
+CITATION_BUTTON_LABEL = "Copy the following snippet to cite the paper."
 CITATION_BUTTON_TEXT = r"""
 """