Add link to the paper on 🤗

#1
by nielsr (HF Staff) · opened
Files changed (1)
  1. README.md +58 -4
README.md CHANGED
@@ -1,12 +1,13 @@
 ---
-license: mit
 base_model:
 - deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
-pipeline_tag: text-generation
-library_name: transformers
 datasets:
 - Zigeng/CoT-Veirification-340k
+library_name: transformers
+license: mit
+pipeline_tag: text-generation
 ---
+
 <div align="center">
 <h1>🔍 VeriThinker: Learning to Verify Makes Reasoning Model Efficient</h1>
 </div>
@@ -46,6 +47,10 @@ datasets:
 <td>📊 <strong>Data</strong></td>
 <td><a href="https://huggingface.co/datasets/Zigeng/CoT-Veirification-340k">
 CoT-Veirification-340k</a></td>
+</tr>
+<tr>
+<td>📄 <strong>Paper (🤗)</strong></td>
+<td><a href="https://huggingface.co/papers/2505.17941">Hugging Face Paper</a></td>
 </tr>
 </tbody>
 </table>
@@ -115,7 +120,10 @@ model = AutoModelForCausalLM.from_pretrained(
 )
 
 # prepare the model input
-prompt_part_1 = "## Instruction:\nYou will be provided with a question along with a proposed solution. Please carefully verify each step of the solution, tell me if every step is absolutely correct.\n\n"
+prompt_part_1 = """## Instruction:
+You will be provided with a question along with a proposed solution. Please carefully verify each step of the solution, tell me if every step is absolutely correct.
+
+"""
 
 
 prompt_part_2 = """## Question:
@@ -165,4 +173,50 @@ generated_ids = [
 response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
 
 print(response)
+```
+
+## 🔥 Training
+### 1. Training with LoRA:
+We provide training scripts for our proposed supervised verification fine-tuning approach. The implementation uses LoRA during training, with the configuration details specified in [config_lora_r1_7b.yaml](https://github.com/czg1225/VeriThinker/blob/main/config/config_lora_r1_7b.yaml).
+```bash
+deepspeed --include localhost:0,1,2,3,4,5,6,7 train_svft.py
+```
+
+### 2. LoRA Merge:
+After training, merge the LoRA weights to obtain the reasoning model.
+```bash
+python merge_lora.py
+```
+
+## ⚡ Evaluation:
+We provide evaluation scripts for three mathematical datasets: MATH500, AIME 2024, and AIME 2025. Our implementation leverages the [vLLM](https://docs.vllm.ai/en/latest/) framework for efficient inference during evaluation.
+
+### 1. Evaluation on MATH500 Dataset
+```bash
+CUDA_VISIBLE_DEVICES=0,1,2,3 python eval_math500.py
+```
+### 2. Evaluation on AIME 2024 Dataset
+```bash
+CUDA_VISIBLE_DEVICES=0,1,2,3 python eval_aime24.py
+```
+### 3. Evaluation on AIME 2025 Dataset
+```bash
+CUDA_VISIBLE_DEVICES=0,1,2,3 python eval_aime25.py
+```
+
+## 📖 Experimental Results
+### CoT Compression Results:
+![CoT Compression](assets/cot-compression.png)
+
+### CoT Correctness Verification Results:
+![CoT Correctness](assets/cot-correctness.png)
+
+### Speculative Reasoning Results:
+Speculative reasoning results on three reasoning models. With Qwen-2.5-Math-Instruct-7B as the draft model, most problems in MATH500 and GSM8K can be solved by the short-CoT model, and only a few (around 10%) require activating the long-CoT model for more complex solutions.
+![CoT Speculative1](assets/cot-spec1.png)
+![CoT Speculative2](assets/cot-spec2.png)
+
+## Citation
+If our research assists your work, please give us a star ⭐ or cite us using:
+```
 ```
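
One note on the prompt change in the last README hunk: the original card encoded the instruction prompt as a single Python line with `\n` escapes, and the PR spreads it across several physical lines. In Python the two forms denote the same string only if the multi-line form is a triple-quoted literal. A quick standalone check (variable names here are illustrative, not part of the PR):

```python
# Original form: one physical line, newlines written as \n escapes
# (split across adjacent literals only for readability).
one_line = (
    "## Instruction:\nYou will be provided with a question along with a "
    "proposed solution. Please carefully verify each step of the solution, "
    "tell me if every step is absolutely correct.\n\n"
)

# Multi-line form: a triple-quoted literal spanning several physical lines,
# matching the style of prompt_part_2 in the README snippet.
multi_line = """## Instruction:
You will be provided with a question along with a proposed solution. Please carefully verify each step of the solution, tell me if every step is absolutely correct.

"""

# Both literals denote exactly the same string.
assert one_line == multi_line
print("prompts match")
```

A plain-quoted (`"..."`) string cannot span physical lines, so the triple quotes are what keeps the rewritten prompt valid Python.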