Collections including paper arxiv:2404.02078

- Meta-Learning a Dynamical Language Model
  Paper • 1803.10631 • Published • 1
- TLDR: Token Loss Dynamic Reweighting for Reducing Repetitive Utterance Generation
  Paper • 2003.11963 • Published
- BigScience: A Case Study in the Social Construction of a Multilingual Large Language Model
  Paper • 2212.04960 • Published • 1
- Continuous Learning in a Hierarchical Multiscale Neural Network
  Paper • 1805.05758 • Published • 2

- LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression
  Paper • 2403.12968 • Published • 26
- PERL: Parameter Efficient Reinforcement Learning from Human Feedback
  Paper • 2403.10704 • Published • 60
- Alignment Studio: Aligning Large Language Models to Particular Contextual Regulations
  Paper • 2403.09704 • Published • 34
- RAFT: Adapting Language Model to Domain Specific RAG
  Paper • 2403.10131 • Published • 73

- OLMo: Accelerating the Science of Language Models
  Paper • 2402.00838 • Published • 84
- Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context
  Paper • 2403.05530 • Published • 66
- StarCoder: may the source be with you!
  Paper • 2305.06161 • Published • 31
- SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling
  Paper • 2312.15166 • Published • 59

- Communicative Agents for Software Development
  Paper • 2307.07924 • Published • 6
- Self-Refine: Iterative Refinement with Self-Feedback
  Paper • 2303.17651 • Published • 2
- ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent
  Paper • 2312.10003 • Published • 44
- ReAct: Synergizing Reasoning and Acting in Language Models
  Paper • 2210.03629 • Published • 27

- KTO: Model Alignment as Prospect Theoretic Optimization
  Paper • 2402.01306 • Published • 17
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model
  Paper • 2305.18290 • Published • 63
- SimPO: Simple Preference Optimization with a Reference-Free Reward
  Paper • 2405.14734 • Published • 11
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment
  Paper • 2408.06266 • Published • 10

- Rho-1: Not All Tokens Are What You Need
  Paper • 2404.07965 • Published • 94
- VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time
  Paper • 2404.10667 • Published • 20
- Instruction-tuned Language Models are Better Knowledge Learners
  Paper • 2402.12847 • Published • 27
- DoRA: Weight-Decomposed Low-Rank Adaptation
  Paper • 2402.09353 • Published • 28

- Advancing LLM Reasoning Generalists with Preference Trees
  Paper • 2404.02078 • Published • 47
- Locating and Editing Factual Associations in Mamba
  Paper • 2404.03646 • Published • 3
- Locating and Editing Factual Associations in GPT
  Paper • 2202.05262 • Published • 1
- KAN: Kolmogorov-Arnold Networks
  Paper • 2404.19756 • Published • 114