xiao23451 committed on
Commit 341f780 · verified · 1 Parent(s): 458bf05

Update README.md

Files changed (1)
  1. README.md +20 -3
README.md CHANGED
@@ -1,3 +1,20 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ base_model:
+ - Qwen/Qwen2.5-Math-7B
+ ---
+
+ ## Model ID
+
+ GPG: A Simple and Strong Reinforcement Learning Baseline for Model Reasoning
+
+ https://arxiv.org/abs/2504.02546
+
+ ## Model Details
+
+ The RL model (GPG-7B in the paper), trained with GPG on the simple1r_qwen_level3to5 dataset, using Qwen2.5-Math-7B as the base model.
+
+ ## Attention!
+
+ Test results may fluctuate with changes in environment and device. When tested on an NPU, the average accuracy over five datasets (AIME24, AMC23, MATH-500, Minerva, and OlympiadBench) is 57.7. When tested on an H20 GPU, the average accuracy is 55.3. This 2.4-point difference is within an acceptable range.
+