- J1: Incentivizing Thinking in LLM-as-a-Judge via Reinforcement Learning (arXiv:2505.10320, published May 2025)
- SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks (arXiv:2503.15478, published Mar 19, 2025)
- Step-KTO: Optimizing Mathematical Reasoning through Stepwise Binary Feedback (arXiv:2501.10799, published Jan 18, 2025)
- Byte Latent Transformer: Patches Scale Better Than Tokens (arXiv:2412.09871, published Dec 13, 2024)
- Training Large Language Models to Reason in a Continuous Latent Space (arXiv:2412.06769, published Dec 9, 2024)
- Adaptive Decoding via Latent Preference Optimization (arXiv:2411.09661, published Nov 14, 2024)
- Thinking LLMs: General Instruction Following with Thought Generation (arXiv:2410.10630, published Oct 14, 2024)
- Source2Synth: Synthetic Data Generation and Curation Grounded in Real Data Sources (arXiv:2409.08239, published Sep 12, 2024)
- Better Alignment with Instruction Back-and-Forth Translation (arXiv:2408.04614, published Aug 8, 2024)
- Meta-Rewarding Language Models: Self-Improving Alignment with LLM-as-a-Meta-Judge (arXiv:2407.19594, published Jul 28, 2024)
- Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM (arXiv:2403.07816, published Mar 12, 2024)