XuQixin

Racktic

AI & ML interests

NLP, multimodal

Recent Activity

upvoted a paper 9 days ago

The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models

authored a paper 4 months ago

Process Reinforcement through Implicit Rewards

upvoted a paper 4 months ago

Process Reinforcement through Implicit Rewards

Organizations

None yet

Papers 1

arxiv:2502.01456

Models 1

Racktic/qwen_dedup_top8_dpo_old_math_new_math_syn_olympiads

Updated Dec 16, 2024

Datasets 0

None public yet