Linfeng Song

freesunshine0316

https://freesunshine0316.github.io/

AI & ML interests

Researcher @Tencent AI Lab working on reasoning and RLAIF with LLMs, especially search + RL. Working on NLP since 2010.

Recent Activity

upvoted a paper 16 days ago

Cognitive Kernel-Pro: A Framework for Deep Research Agents and Agent Foundation Models Training

Paper • 2508.00414 • Published 19 days ago • 86

upvoted a paper 23 days ago

UloRL: An Ultra-Long Output Reinforcement Learning Approach for Advancing Large Language Models' Reasoning Abilities

Paper • 2507.19766 • Published 25 days ago • 14

authored 8 papers about 1 month ago

The Trickle-down Impact of Reward (In-)consistency on RLHF

Paper • 2309.16155 • Published Sep 28, 2023

Iterative Nash Policy Optimization: Aligning LLMs with General Preferences via No-Regret Learning

Paper • 2407.00617 • Published Jun 30, 2024 • 7

Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs

Paper • 2412.21187 • Published Dec 30, 2024 • 42

HUNYUANPROVER: A Scalable Data Synthesis Framework and Guided Tree Search for Automated Theorem Proving

Paper • 2412.20735 • Published Dec 30, 2024 • 12

Don't Get Lost in the Trees: Streamlining LLM Reasoning by Overcoming Tree Search Exploration Pitfalls

Paper • 2502.11183 • Published Feb 16

DeepMath-103K: A Large-Scale, Challenging, Decontaminated, and Verifiable Mathematical Dataset for Advancing Reasoning

Paper • 2504.11456 • Published Apr 15 • 13

DeepTheorem: Advancing LLM Reasoning for Theorem Proving Through Natural Language and Reinforcement Learning

Paper • 2505.23754 • Published May 29 • 16

Towards Solving More Challenging IMO Problems via Decoupled Reasoning and Proving

Paper • 2507.06804 • Published Jul 7 • 15

upvoted a paper about 1 month ago

Towards Solving More Challenging IMO Problems via Decoupled Reasoning and Proving

Paper • 2507.06804 • Published Jul 7 • 15

authored a paper 3 months ago

MPS-Prover: Advancing Stepwise Theorem Proving by Multi-Perspective Search and Data Curation

Paper • 2505.10962 • Published May 16 • 8

upvoted a paper 3 months ago

MPS-Prover: Advancing Stepwise Theorem Proving by Multi-Perspective Search and Data Curation

Paper • 2505.10962 • Published May 16 • 8

commented on a paper 3 months ago

MPS-Prover: Advancing Stepwise Theorem Proving by Multi-Perspective Search and Data Curation

Paper • 2505.10962 • Published May 16 • 8

upvoted a paper 5 months ago

Expanding RL with Verifiable Rewards Across Diverse Domains

Paper • 2503.23829 • Published Mar 31 • 24

authored a paper 5 months ago

Expanding RL with Verifiable Rewards Across Diverse Domains

Paper • 2503.23829 • Published Mar 31 • 24

authored a paper 7 months ago

Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs

Paper • 2501.18585 • Published Jan 30 • 61

commented on a paper 8 months ago

HUNYUANPROVER: A Scalable Data Synthesis Framework and Guided Tree Search for Automated Theorem Proving

Paper • 2412.20735 • Published Dec 30, 2024 • 12

upvoted a paper 8 months ago

Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs

Paper • 2412.21187 • Published Dec 30, 2024 • 42

authored a paper 10 months ago

Towards Self-Improvement of LLMs via MCTS: Leveraging Stepwise Knowledge with Curriculum Preference Learning

Paper • 2410.06508 • Published Oct 9, 2024 • 11

