
Hao Sun

Holarissun
https://holarissun.github.io/
  • HolarisSun
  • holarissun

AI & ML interests

PhD at the University of Cambridge. Deep RL, RL x LLM, RLHF.

Organizations

None yet

Papers 3

arxiv:2310.07747
arxiv:2310.06147
arxiv:2207.05161

Models 356

Holarissun/SFT_gemma2b_hh-rlhf-helpful-gpt4_lr5e-06_epoch2-subset-1

Updated Jun 17, 2024 • 2

Holarissun/SFT_gemma2b_hh-rlhf-helpful_lr5e-06_epoch2-subset-1

Updated Jun 17, 2024 • 2

Holarissun/REPROD_dpo_helpfulhelpful_gpt4_subset-1_modelgemma7b_maxsteps10000_bz8_lr5e-06

Updated May 29, 2024 • 2

Holarissun/REPROD_dpo_harmlessharmless_gpt4_subset-1_modelgemma7b_maxsteps10000_bz8_lr5e-06

Updated May 29, 2024 • 1

Holarissun/REPROD_dpo_helpfulhelpful_human_subset-1_modelgemma7b_maxsteps10000_bz8_lr5e-06

Updated May 29, 2024 • 3

Holarissun/REPROD_dpo_harmlessharmless_human_subset-1_modelgemma7b_maxsteps10000_bz8_lr5e-06

Updated May 29, 2024 • 3

Holarissun/REPROD_dpo_helpfulhelpful_human_subset-1_modelgemma2b_maxsteps10000_bz8_lr5e-06

Updated May 28, 2024 • 1

Holarissun/REPROD_dpo_harmlessharmless_human_subset-1_modelgemma2b_maxsteps10000_bz8_lr5e-06

Updated May 28, 2024

Holarissun/REPROD_dpo_helpfulhelpful_human_subset-1_modelgemma2b_maxsteps10000_bz8_lr5e-05

Updated May 25, 2024 • 2

Holarissun/REPROD_dpo_harmlessharmless_human_subset-1_modelgemma2b_maxsteps6000_bz8_lr5e-05

Updated May 24, 2024

Datasets 0

None public yet