BLEUBERI: BLEU is a surprisingly effective reward for instruction following Paper • 2505.11080 • Published 22 days ago
Soft Thinking: Unlocking the Reasoning Potential of LLMs in Continuous Concept Space Paper • 2505.15778 • Published 17 days ago
HoPE: Hybrid of Position Embedding for Length Generalization in Vision-Language Models Paper • 2505.20444 • Published 12 days ago
ATLAS: Learning to Optimally Memorize the Context at Test Time Paper • 2505.23735 • Published 9 days ago
Exploring the Latent Capacity of LLMs for One-Step Text Generation Paper • 2505.21189 • Published 11 days ago
SynLogic: Synthesizing Verifiable Reasoning Data at Scale for Learning Logical Reasoning and Beyond Paper • 2505.19641 • Published 12 days ago
RADLADS: Rapid Attention Distillation to Linear Attention Decoders at Scale Paper • 2505.03005 • Published May 5
AdaR1: From Long-CoT to Hybrid-CoT via Bi-Level Adaptive Reasoning Optimization Paper • 2504.21659 • Published Apr 30
TF1-EN-3M: Three Million Synthetic Moral Fables for Training Small, Open Language Models Paper • 2504.20605 • Published Apr 29