zwhe99 committed on
Commit
e234ce1
·
verified ·
1 Parent(s): 27cf49b

Update README.md

  1. README.md +76 -18
README.md CHANGED
@@ -119,23 +119,28 @@ DeepMath-1.5B is created by finetuning deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
119
 
120
  <sub>Difficulty distribution comparison.</sub> </div>
121
 
122
- **2. Broad Topical Diversity**: The dataset spans a wide spectrum of mathematical subjects, including Algebra, Calculus, Number Theory, Geometry, Probability, and Discrete Mathematics.
123
 
124
  <div align="center"> <img src="./assets/github-domain.png" width="50%"/>
125
 
126
  <sub>Hierarchical breakdown of mathematical topics covered in DeepMath-103K.</sub></div>
127
 
128
- **4. Rigorous Decontamination**: Built from diverse sources, the dataset underwent meticulous decontamination against common benchmarks using semantic matching. This minimizes test set leakage and promotes fair model evaluation.
129
 
130
  <div align="center"> <img src="./assets/github-contamination-case.png" width="80%"/>
131
 
132
  <sub>Detected contamination examples. Subtle conceptual overlaps can also be identified.</sub> </div>
133
 
134
- **5. Rich Data Format**: Each sample in `DeepMath-103K` is structured with rich information to support various research applications:
135
 
136
  <div align="center"> <img src="./assets/github-data-sample.png" width="90%"/>
137
 
138
- <sub>A data sample from DeepMath-103K.</sub> </div>
139
 
140
  - **Question**: The mathematical problem statement.
141
  - **Final Answer**: A reliably verifiable final answer, enabling robust rule-based reward functions for RL.
@@ -145,22 +150,73 @@ DeepMath-1.5B is created by finetuning deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
145
 
146
  ## 📊Main Results
147
 
148
- `DeepMath-Zero-7B` and `DeepMath-1.5B` are trained on the `DeepMath-103K` dataset via RL. These models are initialized from `Qwen2.5-7B-Base` and `R1-Distill-Qwen-1.5B`, respectively.
149
 
150
 
151
- | Model | MATH 500 | AMC23 | Olympiad Bench | Minerva Math | AIME24 | AIME25 |
152
- | :----------------------: | :------: | :------: | :------------: | :----------: | :------: | :------: |
153
- | Qwen2.5-7B-Base | 54.8 | 35.3 | 27.8 | 16.2 | 7.7 | 5.4 |
154
- | Open-Reasoner-Zero-7B | 81.8 | 58.9 | 47.9 | 38.4 | 15.6 | 14.4 |
155
- | Qwen-2.5-7B-SimpleRL-Zoo | 77.0 | 55.8 | 41.0 | 41.2 | 15.6 | 8.7 |
156
- | [DeepMath-Zero-7B](https://huggingface.co/zwhe99/DeepMath-Zero-7B) | **85.5** | **64.7** | **51.0** | **45.3** | **20.4** | **17.5** |
157
 
158
- | Model | MATH 500 | AMC23 | Olympiad Bench | Minerva Math | AIME24 | AIME25 |
159
- | :---------------------: | :------: | :------: | :------------: | :----------: | :------: | :------: |
160
- | R1-Distill-Qwen-1.5B | 84.7 | 72.0 | 53.1 | 36.6 | 29.4 | 24.8 |
161
- | DeepScaleR-1.5B-Preview | 89.4 | 80.3 | 60.9 | 42.2 | **42.3** | 29.6 |
162
- | Still-3-1.5B-Preview | 86.6 | 75.8 | 55.7 | 38.7 | 30.8 | 24.6 |
163
- | [DeepMath-1.5B](https://huggingface.co/zwhe99/DeepMath-1.5B) | **89.9** | **82.3** | **61.8** | **42.5** | 37.3 | **30.8** |
164
 
165
  ## 🙏 Acknowledgements
166
 
@@ -171,6 +227,8 @@ This work can not be done without the help of the following works:
171
  - **[TIGER-Lab/WebInstructSub](https://huggingface.co/datasets/TIGER-Lab/WebInstructSub)**: Instruction data from MathStackExchange and ScienceStackExchange.
172
  - **[AI-MO/NuminaMath-CoT](https://huggingface.co/datasets/AI-MO/NuminaMath-CoT)**: Approximately 860k math problems.
173
 
174
  ## 📚 Citation
175
  ```bibtex
176
  @article{deepmath,
@@ -182,4 +240,4 @@ This work can not be done without the help of the following works:
182
  primaryClass={cs.CL},
183
  url={https://arxiv.org/abs/2504.11456},
184
  }
185
- ```
 
119
 
120
  <sub>Difficulty distribution comparison.</sub> </div>
121
 
122
+ **2. Data Diversity and Novelty**: DeepMath-103K spans a wide spectrum of mathematical subjects, including Algebra, Calculus, Number Theory, Geometry, Probability, and Discrete Mathematics.
123
 
124
  <div align="center"> <img src="./assets/github-domain.png" width="50%"/>
125
 
126
  <sub>Hierarchical breakdown of mathematical topics covered in DeepMath-103K.</sub></div>
127
 
128
+ The problems in DeepMath-103K are novel and unique, whereas many existing datasets closely resemble and overlap with one another.
129
+ <div align="center"> <img src="./assets/github-tsne.png" width="70%"/>
130
+
131
+ <sub>Embedding distributions of different datasets.</sub></div>
132
+
133
+ **3. Rigorous Decontamination**: Built from diverse sources, DeepMath-103K underwent meticulous decontamination against common benchmarks using semantic matching. This minimizes test set leakage and promotes fair model evaluation.
134
 
135
  <div align="center"> <img src="./assets/github-contamination-case.png" width="80%"/>
136
 
137
  <sub>Detected contamination examples. Subtle conceptual overlaps can also be identified.</sub> </div>
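
The decontamination step described above filters out problems that match benchmark items. The following is only a rough sketch of that filtering shape: it swaps in lexical token overlap (Jaccard similarity) for the semantic matching the README mentions, and the helper names `tokens`, `jaccard`, and `decontaminate`, as well as the `0.8` threshold, are hypothetical choices made for illustration.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Token-set Jaccard similarity (a lexical stand-in for semantic matching)."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def decontaminate(candidates: list[str], benchmark: list[str],
                  threshold: float = 0.8) -> list[str]:
    """Keep only candidates whose similarity to every benchmark item is below threshold."""
    return [q for q in candidates
            if all(jaccard(q, b) < threshold for b in benchmark)]


if __name__ == "__main__":
    benchmark = ["Find the sum of all positive integers n such that n^2 + 1 is prime and n < 10."]
    candidates = [
        "Find the sum of all positive integers n such that n^2+1 is prime and n < 10.",
        "Evaluate the integral of x^2 from 0 to 1.",
    ]
    # The near-duplicate of the benchmark item is dropped; the unrelated problem is kept.
    print(decontaminate(candidates, benchmark))
```

A lexical measure like this misses paraphrases, which is why an embedding-based or model-based comparison is the more likely choice for detecting the subtle conceptual overlaps shown in the figure.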
138
 
139
+ **4. Rich Data Format**: Each sample in DeepMath-103K is structured with rich information to support various research applications:
140
 
141
  <div align="center"> <img src="./assets/github-data-sample.png" width="90%"/>
142
 
143
+ <sub>An example data sample from DeepMath-103K.</sub> </div>
144
 
145
  - **Question**: The mathematical problem statement.
146
  - **Final Answer**: A reliably verifiable final answer, enabling robust rule-based reward functions for RL.
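
To illustrate how a verifiable final answer enables a rule-based reward, here is a minimal sketch. It is not the project's actual reward function: it assumes the model wraps its answer in `\boxed{...}` and uses plain string comparison after light normalization, whereas a real verifier would also handle mathematically equivalent forms. The names `extract_boxed`, `normalize`, and `rule_based_reward` are hypothetical.

```python
from typing import Optional


def extract_boxed(text: str) -> Optional[str]:
    """Return the contents of the last \\boxed{...} in text, or None if absent."""
    marker = "\\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None
    depth = 1
    out = []
    for ch in text[start + len(marker):]:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return "".join(out)
        out.append(ch)
    return None  # braces never balanced


def normalize(answer: str) -> str:
    """Remove spaces and surrounding dollar signs (an assumed, minimal normalization)."""
    return answer.replace(" ", "").strip("$")


def rule_based_reward(response: str, final_answer: str) -> float:
    """Return 1.0 if the boxed answer matches the reference string, else 0.0."""
    predicted = extract_boxed(response)
    if predicted is None:
        return 0.0
    return 1.0 if normalize(predicted) == normalize(final_answer) else 0.0


if __name__ == "__main__":
    print(rule_based_reward("Thus the result is \\boxed{\\frac{1}{2}}.", "\\frac{1}{2}"))
    print(rule_based_reward("I am not sure.", "4"))
```

The binary 0/1 signal is the simplest design; it avoids the need for a learned reward model but gives no partial credit for nearly correct answers.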
 
150
 
151
  ## 📊Main Results
152
 
153
+ DeepMath series models achieve many **SOTA** results on challenging math benchmarks:
154
+
155
+ <div align="center"> <img src="./assets/github-main.png" width="90%"/>
156
+
157
+ <sub>Math reasoning performance.</sub> </div>
158
+
159
+
160
+ ## 🎯Quick Start
161
+
162
+ #### Environment Preparation
163
+
164
+ ```shell
165
+ git clone --recurse-submodules https://github.com/zwhe99/DeepMath.git && cd DeepMath
166
+
167
+ conda create -y -n deepmath python=3.12.2 && conda activate deepmath
168
+ pip3 install "ray[default]"
169
+ pip3 install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu124
170
+ pip3 install flash-attn==2.7.4.post1 --no-build-isolation
171
+ pip3 install omegaconf==2.4.0.dev3 hydra-core==1.4.0.dev1 antlr4-python3-runtime==4.11.0 vllm==0.7.3
172
+ pip3 install "math-verify[antlr4_11_0]==0.7.0" fire deepspeed tensorboardX prettytable datasets transformers==4.49.0
173
+ pip3 install -e verl
174
+ ```
175
+
176
+
177
 
178
+ #### Evaluation
179
+
180
+ ```shell
181
+ VLLM_ALLOW_LONG_MAX_MODEL_LEN=1 VLLM_ATTENTION_BACKEND=XFORMERS VLLM_USE_V1=1 VLLM_WORKER_MULTIPROC_METHOD=spawn python3 uni_eval.py \
182
+ --base_model zwhe99/DeepMath-Zero-7B \
183
+ --chat_template_name orz \
184
+ --system_prompt_name simplerl \
185
+ --output_dir \
186
+ --bf16 True \
187
+ --tensor_parallel_size 8 \
188
+ --data_id zwhe99/MATH \
189
+ --split math500 \
190
+ --max_model_len 32768 \
191
+ --temperature 0.6 \
192
+ --top_p 0.95 \
193
+ --n 16
194
+ ```
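
The command above draws `--n 16` samples per problem at temperature 0.6. One common way to summarize such a run is to average correctness over the samples of each problem and then over problems. Whether `uni_eval.py` reports exactly this metric is an assumption; `mean_accuracy` is a hypothetical helper, and the correctness matrix below is made-up toy data.

```python
from statistics import mean


def mean_accuracy(correct: list[list[bool]]) -> float:
    """Average per-problem accuracy.

    correct[i][j] says whether sample j of problem i was judged correct.
    """
    if not correct:
        raise ValueError("no problems given")
    return mean(
        mean(1.0 if c else 0.0 for c in samples)
        for samples in correct
    )


if __name__ == "__main__":
    # Toy data: 2 problems with 4 samples each (a real run would have 16 per problem).
    toy = [
        [True, True, False, False],
        [True, False, False, False],
    ]
    print(f"{100 * mean_accuracy(toy):.1f}")
```

Averaging over many samples reduces the variance that a single sampled response would have on small benchmarks.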
195
+
196
+
197
+
198
+ #### Training
199
+
200
+ * Data Preparation
201
+
202
+ ```shell
203
+ DATA_DIR=/path/to/your/data
204
+ python3 verl/examples/data_preprocess/deepmath_103k.py --local_dir $DATA_DIR
205
+ ```
206
+
207
+ * Start Ray
208
+
209
+ ```shell
210
+ # Head node (×1)
211
+ ray start --head --port=6379 --node-ip-address=$HEAD_ADDR --num-gpus=8
212
+
213
+ # Worker nodes (×7 or ×11)
214
+ ray start --address=$HEAD_ADDR:6379 --node-ip-address=$WORKER_ADDR --num-gpus=8
215
+ ```
216
+
217
+ * Launch training on the head node. See `scripts/train` for the training scripts.
218
 
219
 
220
 
221
  ## 🙏 Acknowledgements
222
 
 
227
  - **[TIGER-Lab/WebInstructSub](https://huggingface.co/datasets/TIGER-Lab/WebInstructSub)**: Instruction data from MathStackExchange and ScienceStackExchange.
228
  - **[AI-MO/NuminaMath-CoT](https://huggingface.co/datasets/AI-MO/NuminaMath-CoT)**: Approximately 860k math problems.
229
 
230
+
231
+
232
  ## 📚 Citation
233
  ```bibtex
234
  @article{deepmath,
 
240
  primaryClass={cs.CL},
241
  url={https://arxiv.org/abs/2504.11456},
242
  }
243
+ ```