Satori
About Us
Satori (悟り) is a Japanese term meaning "sudden enlightenment" or "awakening." The Satori team is dedicated to the pursuit of Artificial General Intelligence (AGI), with a particular focus on enhancing the reasoning capabilities of large language models (LLMs)—a crucial step toward this ultimate goal.
Along this journey, the Satori team has released two major research contributions:
- Satori (ICML 2025): Released concurrently with DeepSeek-R1, this work proposes a novel post-training paradigm that enables LLMs to perform an extended reasoning process with self-reflection: 1) a small-scale format tuning (FT) stage that internalizes the desired reasoning format, and 2) a large-scale self-improvement stage leveraging reinforcement learning (RL). This approach results in Satori, a 7B LLM that achieves state-of-the-art reasoning performance.
- Satori-SWE: This work addresses a particularly challenging domain for LLMs: real-world software engineering (SWE) tasks. We propose Evolutionary Test-Time Scaling (EvoScale), which treats LLM generation as an evolutionary process (see the sketch after this list). By combining reinforcement learning (RL) training with EvoScale test-time scaling, our 32B model, Satori-SWE-32B, achieves performance comparable to models exceeding 100B parameters while requiring only a small number of samples.
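
To make the EvoScale idea above concrete, here is a minimal, hypothetical sketch of an evolutionary sampling loop. It is not the Satori-SWE implementation; the `generate_candidates`, `score`, and `mutate_with_model` callables are assumed placeholders for model sampling, candidate ranking, and model-driven refinement.

```python
# Minimal, hypothetical sketch of an evolutionary test-time scaling loop,
# NOT the actual Satori-SWE implementation. The three callables are assumed
# placeholders: `generate_candidates` samples initial outputs from the model,
# `score` ranks a candidate for the given problem, and `mutate_with_model`
# asks the model to refine surviving candidates into new ones.
from typing import Callable, List


def evo_scale(
    problem: str,
    generate_candidates: Callable[[str, int], List[str]],
    score: Callable[[str, str], float],
    mutate_with_model: Callable[[str, List[str], int], List[str]],
    population_size: int = 8,
    num_generations: int = 3,
    num_survivors: int = 2,
) -> str:
    """Evolve a small population of candidate solutions over a few generations."""
    # Initial population: a small number of samples from the model.
    population = generate_candidates(problem, population_size)
    for _ in range(num_generations):
        # Selection: keep the highest-scoring candidates as parents.
        parents = sorted(population, key=lambda c: score(problem, c), reverse=True)
        parents = parents[:num_survivors]
        # Mutation: let the model refine the parents into new candidates.
        children = mutate_with_model(problem, parents, population_size - num_survivors)
        population = parents + children
    # Return the best candidate remaining in the population.
    return max(population, key=lambda c: score(problem, c))
```

Roughly speaking, the RL training mentioned above is what allows the model itself to perform the refinement step, which is why only a small number of samples per generation is needed.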
Resources
If you are interested in our work, please refer to our blog and research papers for more technical details!
Citation
If you find our model and data helpful, please cite our paper:
Satori
@misc{shen2025satorireinforcementlearningchainofactionthought,
      title={Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search},
      author={Maohao Shen and Guangtao Zeng and Zhenting Qi and Zhang-Wei Hong and Zhenfang Chen and Wei Lu and Gregory Wornell and Subhro Das and David Cox and Chuang Gan},
      year={2025},
      eprint={2502.02508},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2502.02508},
}
Satori-SWE
@misc{zeng2025satorisweevolutionarytesttimescaling,
      title={Satori-SWE: Evolutionary Test-Time Scaling for Sample-Efficient Software Engineering},
      author={Guangtao Zeng and Maohao Shen and Delin Chen and Zhenting Qi and Subhro Das and Dan Gutfreund and David Cox and Gregory Wornell and Wei Lu and Zhang-Wei Hong and Chuang Gan},
      year={2025},
      eprint={2505.23604},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2505.23604},
}
Collections: 2
Models: 5
Datasets: 5
- Satori-reasoning/Satori-SWE-RL-data
- Satori-reasoning/Satori-SWE-two-stage-SFT-data
- Satori-reasoning/Satori_RL_data_with_RAE
- Satori-reasoning/Satori_FT_data
- Satori-reasoning/Satori_RL_data
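
The datasets listed above can be loaded with the Hugging Face `datasets` library. A minimal sketch follows; the split name "train" is an assumption and may differ per dataset.

```python
# Minimal sketch of loading one of the datasets listed above with the
# Hugging Face `datasets` library. The split name "train" is an assumption
# and may differ per dataset.
from datasets import load_dataset

ds = load_dataset("Satori-reasoning/Satori_FT_data", split="train")
print(ds)       # dataset schema and size
print(ds[0])    # a single example
```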