jiminmun/llama-3.2-3b_ppo_lr5e-07_rm_data-mix_no_sys_msg_filtered • Text Generation • Updated Feb 13 • 7
jiminmun/llama-3.2-3b_reward_model_data_mix_lr9e-6_no_sys_msg_filtered • Text Classification • Updated Feb 12 • 2
jiminmun/llama-3.2-3b_ppo_lr5e-07_rm_avg_w_sys_msg_unfiltered • Text Classification • Updated Feb 10 • 1
jiminmun/llama-3.2-3b_ppo_lr5e-07_rm_data-mix_no_sys_msg_unfiltered • Text Classification • Updated Feb 10 • 2
jiminmun/llama-3.2-3b_ppo_lr5e-07_rm_data-mix_w_sys_msg_unfiltered • Text Classification • Updated Feb 10 • 4
jiminmun/llama-3.2-3b_reward_model_clarity_lr9e-6_no_sys_msg_filtered • Text Classification • Updated Feb 9 • 1
jiminmun/llama-3.2-3b_reward_model_focus_lr9e-6_no_sys_msg_filtered • Text Classification • Updated Feb 9 • 1
jiminmun/llama-3.2-3b_reward_model_relevance_lr9e-6_no_sys_msg_filtered • Text Classification • Updated Feb 9 • 2
jiminmun/llama-3.2-3b_reward_model_avoidbias_lr9e-6_no_sys_msg_filtered • Text Classification • Updated Feb 9 • 1