Collections
Collections including paper arxiv:2411.12580
- Unleashing the Power of Pre-trained Language Models for Offline Reinforcement Learning
  Paper • 2310.20587 • Published • 18
- Chain-of-Thought Reasoning Without Prompting
  Paper • 2402.10200 • Published • 110
- LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement
  Paper • 2403.15042 • Published • 28
- LIMA: Less Is More for Alignment
  Paper • 2305.11206 • Published • 26

- TuCo: Measuring the Contribution of Fine-Tuning to Individual Responses of LLMs
  Paper • 2506.23423 • Published • 1
- Stochastic Parameter Decomposition
  Paper • 2506.20790 • Published • 1
- Decomposing MLP Activations into Interpretable Features via Semi-Nonnegative Matrix Factorization
  Paper • 2506.10920 • Published • 6
- From Flat to Hierarchical: Extracting Sparse Representations with Matching Pursuit
  Paper • 2506.03093 • Published • 2

- SELF: Language-Driven Self-Evolution for Large Language Model
  Paper • 2310.00533 • Published • 2
- GrowLength: Accelerating LLMs Pretraining by Progressively Growing Training Length
  Paper • 2310.00576 • Published • 2
- A Pretrainer's Guide to Training Data: Measuring the Effects of Data Age, Domain Coverage, Quality, & Toxicity
  Paper • 2305.13169 • Published • 3
- Transformers Can Achieve Length Generalization But Not Robustly
  Paper • 2402.09371 • Published • 15

- Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language Models
  Paper • 2402.14848 • Published • 20
- Teaching Large Language Models to Reason with Reinforcement Learning
  Paper • 2403.04642 • Published • 51
- How Far Are We from Intelligent Visual Deductive Reasoning?
  Paper • 2403.04732 • Published • 24
- Learning to Reason and Memorize with Self-Notes
  Paper • 2305.00833 • Published • 5