About Us

Satori (悟り) is a Japanese term meaning "sudden enlightenment" or "awakening." The Satori team is dedicated to the pursuit of Artificial General Intelligence (AGI), with a particular focus on enhancing the reasoning capabilities of large language models (LLMs)—a crucial step toward this ultimate goal.

Along this journey, the Satori team has released two major research contributions:

  • Satori (ICML 2025): Released concurrently with DeepSeek-R1, this work proposes a novel post-training paradigm that enables LLMs to perform an extended reasoning process with self-reflection: 1) a small-scale format tuning (FT) stage that internalizes a specific reasoning format, and 2) a large-scale self-improvement stage based on reinforcement learning (RL). The approach yields Satori, a 7B LLM that achieves state-of-the-art reasoning performance.
  • Satori-SWE: This work addresses a particularly challenging domain for LLMs: real-world software engineering (SWE) tasks. We propose Evolutionary Test-Time Scaling (EvoScale), which treats LLM generation as an evolutionary process (see the sketch after this list). By combining RL training with EvoScale at test time, our 32B model, Satori-SWE-32B, matches the performance of models with over 100B parameters while requiring only a small number of samples.
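
To make the EvoScale idea concrete, the sketch below illustrates a generic sample-score-select-refine loop at test time. It is a rough illustration under our own assumptions, not the authors' implementation: the generate and score callables stand in for the trained policy model and reward signal, and all hyperparameters are placeholders.

def evoscale(problem, generate, score, population_size=8, survivors=2, generations=3):
    """Illustrative evolutionary test-time scaling loop (placeholder components).

    generate(problem, parents): samples a candidate solution, optionally
        conditioning on the surviving candidates from the previous generation
        (the mutation/refinement step).
    score(problem, candidate): returns a scalar quality estimate, e.g. from a
        learned reward model.
    """
    parents = []
    for _ in range(generations):
        # Sample a new population, conditioning on the best candidates so far.
        population = [generate(problem, parents) for _ in range(population_size)]
        # Selection: keep only the highest-scoring candidates as parents.
        ranked = sorted(population, key=lambda c: score(problem, c), reverse=True)
        parents = ranked[:survivors]
    return parents[0]

In the actual system the population would consist of candidate code patches for an SWE task, but the same loop structure applies to any generation task with a scoring signal.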

Resources

If you are interested in our work, please refer to our blog and research papers for more technical details!
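
The models in this organization can typically be loaded with the Hugging Face transformers library. The snippet below is a minimal, illustrative example; the repository id, prompt, and generation settings are placeholders, so please check the organization's model list for the exact checkpoint names.

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Placeholder repository id -- substitute the exact checkpoint name from our model list.
model_id = "Satori-reasoning/Satori-7B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

prompt = "Find the sum of the first 100 positive integers. Reason step by step."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)

# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))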

Citation

If you find our models and data helpful, please cite our papers:

Satori

@misc{shen2025satorireinforcementlearningchainofactionthought,
      title={Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search}, 
      author={Maohao Shen and Guangtao Zeng and Zhenting Qi and Zhang-Wei Hong and Zhenfang Chen and Wei Lu and Gregory Wornell and Subhro Das and David Cox and Chuang Gan},
      year={2025},
      eprint={2502.02508},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2502.02508}, 
}

Satori-SWE

@misc{zeng2025satorisweevolutionarytesttimescaling,
      title={Satori-SWE: Evolutionary Test-Time Scaling for Sample-Efficient Software Engineering}, 
      author={Guangtao Zeng and Maohao Shen and Delin Chen and Zhenting Qi and Subhro Das and Dan Gutfreund and David Cox and Gregory Wornell and Wei Lu and Zhang-Wei Hong and Chuang Gan},
      year={2025},
      eprint={2505.23604},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2505.23604}, 
}