Update README.md

README.md CHANGED
@@ -9,16 +9,7 @@ base_model:
 - Qwen/Qwen2.5-Math-7B
 ---
 
-
-license: apache-2.0
-library_name: transformers
-pipeline_tag: text-generation
-datasets:
-- Satori-reasoning/Satori_FT_data
-- Satori-reasoning/Satori_RL_data
-base_model:
-- Qwen/Qwen2.5-Math-7B
----
+
 **Satori-7B-Round2** is a 7B LLM built on the open-source model Qwen-2.5-Math-7B and trained on open-source data (OpenMathInstruct-2 and NuminaMath). **Satori-7B-Round2** is capable of autoregressive search, i.e., self-reflection and self-exploration without external guidance.
 This is achieved through our proposed Chain-of-Action-Thought (COAT) reasoning and a two-stage post-training paradigm.
 