Andrew Zhao

andrewzh
https://andrewzh112.github.io/
  • _AndrewZhao
  • Andrewzh112
  • andrewqzhao

AI & ML interests

Reinforcement Learning, Agents

Recent Activity

upvoted a paper 4 days ago
Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning
upvoted a paper 17 days ago
Seek in the Dark: Reasoning via Test-Time Instance-Level Policy Gradient in Latent Space
authored a paper 23 days ago
Absolute Zero: Reinforced Self-play Reasoning with Zero Data

Organizations

None yet

Collections 1

Absolute Zero Reasoner
  • andrewzh/Absolute_Zero_Reasoner-Coder-7b

    Updated May 5 • 2.25k • 16
  • andrewzh/Absolute_Zero_Reasoner-Coder-14b

    Updated May 6 • 756 • 25
  • andrewzh/Absolute_Zero_Reasoner-Coder-3b

    Updated May 6 • 2.18k • 9
  • andrewzh2/Absolute_Zero_Reasoner-Base-14b

    Updated May 6 • 71 • 9

Papers 11

arxiv:2505.03335
arxiv:2504.13837
arxiv:2410.16392
arxiv:2407.08770

Models 3

andrewzh/Absolute_Zero_Reasoner-Coder-14b

Updated May 6 • 756 • 25

andrewzh/Absolute_Zero_Reasoner-Coder-3b

Updated May 6 • 2.18k • 9

andrewzh/Absolute_Zero_Reasoner-Coder-7b

Updated May 5 • 2.25k • 16

Datasets 0

None public yet