- Unlocking Continual Learning Abilities in Language Models
  Paper • 2406.17245 • Published • 31
- A Closer Look into Mixture-of-Experts in Large Language Models
  Paper • 2406.18219 • Published • 16
- Symbolic Learning Enables Self-Evolving Agents
  Paper • 2406.18532 • Published • 12
- Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs
  Paper • 2406.18629 • Published • 43

Collections
Collections including paper arxiv:2407.00320
- Octo-planner: On-device Language Model for Planner-Action Agents
  Paper • 2406.18082 • Published • 49
- Adaptable Logical Control for Large Language Models
  Paper • 2406.13892 • Published • 1
- SeaKR: Self-aware Knowledge Retrieval for Adaptive Retrieval Augmented Generation
  Paper • 2406.19215 • Published • 32
- HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models
  Paper • 2405.14831 • Published • 4

- Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language Models
  Paper • 2402.14848 • Published • 20
- The Prompt Report: A Systematic Survey of Prompting Techniques
  Paper • 2406.06608 • Published • 66
- CRAG -- Comprehensive RAG Benchmark
  Paper • 2406.04744 • Published • 49
- Transformers meet Neural Algorithmic Reasoners
  Paper • 2406.09308 • Published • 45

- Language Models as Compilers: Simulating Pseudocode Execution Improves Algorithmic Reasoning in Language Models
  Paper • 2404.02575 • Published • 51
- Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing
  Paper • 2404.12253 • Published • 56
- SnapKV: LLM Knows What You are Looking for Before Generation
  Paper • 2404.14469 • Published • 28
- FlowMind: Automatic Workflow Generation with LLMs
  Paper • 2404.13050 • Published • 35

- Eliminating Position Bias of Language Models: A Mechanistic Approach
  Paper • 2407.01100 • Published • 9
- To Forget or Not? Towards Practical Knowledge Unlearning for Large Language Models
  Paper • 2407.01920 • Published • 17
- LiteSearch: Efficacious Tree Search for LLM
  Paper • 2407.00320 • Published • 40

- How Do Large Language Models Acquire Factual Knowledge During Pretraining?
  Paper • 2406.11813 • Published • 32
- From RAGs to rich parameters: Probing how language models utilize external knowledge over parametric information for factual queries
  Paper • 2406.12824 • Published • 21
- Tokenization Falling Short: The Curse of Tokenization
  Paper • 2406.11687 • Published • 16
- Iterative Length-Regularized Direct Preference Optimization: A Case Study on Improving 7B Language Models to GPT-4 Level
  Paper • 2406.11817 • Published • 13

- How Good Are Low-bit Quantized LLaMA3 Models? An Empirical Study
  Paper • 2404.14047 • Published • 46
- LiteSearch: Efficacious Tree Search for LLM
  Paper • 2407.00320 • Published • 40
- Cut Your Losses in Large-Vocabulary Language Models
  Paper • 2411.09009 • Published • 50
- LLaMA-Mesh: Unifying 3D Mesh Generation with Language Models
  Paper • 2411.09595 • Published • 78

- Training Verifiers to Solve Math Word Problems
  Paper • 2110.14168 • Published • 4
- MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models
  Paper • 2309.12284 • Published • 18
- LiteSearch: Efficacious Tree Search for LLM
  Paper • 2407.00320 • Published • 40
- DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models
  Paper • 2309.03883 • Published • 35