shizhediao2 committed
Commit 27aa38b · 1 Parent(s): 4ee8bb4

fixed typos

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -14,9 +14,9 @@ base_model:
 </div>
 
 ## Introduction
-Nemotron-Research-Reasoning-Qwen-1.5B is the world’s leading 1.5B open-weight model for complex reasoning tasks such as mathematical problems, coding challenges, and scientific questions.
+Nemotron-Research-Reasoning-Qwen-1.5B is the world’s leading 1.5B open-weight model for complex reasoning tasks such as mathematical problems, coding challenges, scientific questions, and logic puzzles.
 It is trained using the ProRL algorithm on a diverse and comprehensive set of datasets.
-Our model has achieved impressive results, outperforming Deepseek’s model by a large margin on a broad range of tasks including math, coding, and GPQA.
+Our model has achieved impressive results, outperforming Deepseek’s 1.5B model by a large margin on a broad range of tasks, including math, coding, and GPQA.
 
 This model is for research and development only.
 
@@ -53,10 +53,10 @@ Labeling Method by dataset: <br>
 ## Evaluation Results
 
 Table 1: Performance (pass@1) comparison for benchmarks across Math domain.
-| Model | AIME24 | AIME25 | AMC | Math | Minverva | Olympiad | Avg |
+| Model | AIME24 | AIME25 | AMC | Math | Minerva | Olympiad | Avg |
 |-------------------------------|--------|--------|-------|-------|----------|----------|--------|
 | DeepSeek-R1-Distill-Qwen-1.5B | 28.54 | 22.71 | 62.58 | 82.90 | 26.38 | 43.58 | 44.45 |
-| DeepScaler-1.5B | 40.21 | 31.46 | 73.04 | 89.36 | 41.57 | 51.63 | 54.54 |
+| DeepScaleR-1.5B | 40.21 | 31.46 | 73.04 | 89.36 | 41.57 | 51.63 | 54.54 |
 | *DeepSeek-R1-Distill-Qwen-7B* | 53.54 | 40.83 | 82.83 | 93.68 | 50.60 | 57.66 | 63.19 |
 | **Nemotron-Research-Reasoning-Qwen-1.5B** | **48.13** | **33.33** | **79.29** | **91.89** | **47.98** | **60.22** | **60.14** |
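The edited table reports pass@1 scores. The commit does not show how these were computed, but pass@1 is conventionally the k=1 case of the unbiased pass@k estimator (sample n generations per problem, count the c correct ones); the sketch below is an assumption about that standard estimator, not the model card's actual evaluation code.

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: probability that at least one of k samples,
    drawn without replacement from n generations of which c are
    correct, passes. For k=1 this reduces to c / n."""
    if n - c < k:
        # Fewer incorrect samples than k: some draw must include a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example: 16 generations, 4 correct -> pass@1 = 4/16 = 0.25
print(pass_at_k(16, 4, 1))
```

Averaging this per-problem value over a benchmark's problems gives table entries like those above.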