thehandsomefrog4825's Collections

Reinforce learning 🔃

updated Feb 9

  • REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models

    Paper • 2501.03262 • Published Jan 4 • 99

  • Unraveling the Complexity of Memory in RL Agents: an Approach for Classification and Evaluation

    Paper • 2412.06531 • Published Dec 9, 2024 • 73

  • The Differences Between Direct Alignment Algorithms are a Blur

    Paper • 2502.01237 • Published Feb 3 • 115

  • Process Reinforcement through Implicit Rewards

    Paper • 2502.01456 • Published Feb 3 • 62