Update src/about.py
src/about.py  CHANGED  (+7 -5)
@@ -31,11 +31,13 @@ INTRODUCTION_TEXT = """
 """
 # ... (rest of your about.py content) ...
 LLM_BENCHMARKS_TEXT = """
-## 
-
+## MLE-Dojo
+MLE-Dojo is a Gym-style framework for systematically training, evaluating, and improving autonomous large language model (LLM) agents in iterative machine learning engineering (MLE) workflows.
+Unlike existing benchmarks that primarily rely on static datasets or single-attempt evaluations, MLE-Dojo provides an interactive environment enabling agents to iteratively experiment, debug, and refine solutions through structured feedback loops. Built upon 200+ real-world Kaggle challenges (e.g., tabular data analysis, computer vision, natural language processing, and time series forecasting), MLE-Dojo covers diverse, open-ended MLE tasks carefully curated to reflect realistic engineering scenarios such as data processing, architecture search, hyperparameter tuning, and code debugging.
+Its fully executable environment supports comprehensive agent training via both supervised fine-tuning and reinforcement learning, facilitating iterative experimentation, realistic data sampling, and real-time outcome verification.
 
-## 
-
+## New Models
+We actively maintain this as a long-term, real-time leaderboard with updated models and evaluation tasks to foster community-driven innovation.
 """
 
 EVALUATION_QUEUE_TEXT = """
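The copy added above describes a Gym-style, feedback-driven evaluation loop. As a reading aid, here is a minimal Python sketch of that kind of loop; the environment class, observation fields, and reward semantics are illustrative assumptions for the sketch, not MLE-Dojo's actual API.

```python
# Illustrative only: a toy Gym-style experiment/debug/refine loop of the kind
# the leaderboard text describes. Class and field names are assumptions.
import random


class ToyMLEEnv:
    """Stand-in for an interactive MLE environment that returns structured feedback."""

    def reset(self, seed=None):
        self._rng = random.Random(seed)
        # Observation: the task plus the latest execution feedback (none yet).
        return {"task": "tabular-regression", "feedback": None}

    def step(self, submission: str):
        # A real environment would execute `submission` and score it; the score
        # is faked here so the sketch runs on its own.
        score = self._rng.random()
        obs = {"task": "tabular-regression", "feedback": f"validation_score={score:.3f}"}
        done = score > 0.9
        return obs, score, done, {}


def run_episode(env, agent, max_steps=10):
    """Iterative loop: the agent resubmits solutions guided by environment feedback."""
    obs = env.reset(seed=0)
    reward = 0.0
    for _ in range(max_steps):
        submission = agent(obs)                      # LLM proposes a solution attempt
        obs, reward, done, _ = env.step(submission)  # environment runs and scores it
        if done:
            break
    return reward


# Trivial stand-in agent: ignores feedback and returns a stub training script.
print(run_episode(ToyMLEEnv(), agent=lambda obs: "train.py"))
```

In the framework itself, the agent would be an LLM call and the feedback would come from actually executing the submission against the Kaggle-style task.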
@@ -68,6 +70,6 @@ Make sure you have followed the above steps first.
 If everything is done, check you can launch the EleutherAIHarness on your model locally, using the above command without modifications (you can add `--limit` to limit the number of examples per task).
 """
 
-CITATION_BUTTON_LABEL = "Copy the following snippet to cite
+CITATION_BUTTON_LABEL = "Copy the following snippet to cite the paper."
 CITATION_BUTTON_TEXT = r"""
 """
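For context, constants like these are typically rendered by the Space's Gradio app. The sketch below shows one plausible wiring; the tab layout and component choices are assumptions, not this repository's actual app.py.

```python
# Assumed wiring of the about.py constants into a Gradio leaderboard UI.
import gradio as gr

from src.about import (
    LLM_BENCHMARKS_TEXT,
    EVALUATION_QUEUE_TEXT,
    CITATION_BUTTON_LABEL,
    CITATION_BUTTON_TEXT,
)

with gr.Blocks() as demo:
    with gr.Tab("About"):
        gr.Markdown(LLM_BENCHMARKS_TEXT)       # renders the MLE-Dojo description added above
    with gr.Tab("Submit"):
        gr.Markdown(EVALUATION_QUEUE_TEXT)
    with gr.Accordion(CITATION_BUTTON_LABEL, open=False):
        gr.Textbox(value=CITATION_BUTTON_TEXT, lines=8, show_copy_button=True)

if __name__ == "__main__":
    demo.launch()
```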