Collections
Discover the best community collections!
Collections including paper arxiv:2403.03853
- MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training
  Paper • 2403.09611 • Published • 129
- LoRA: Low-Rank Adaptation of Large Language Models
  Paper • 2106.09685 • Published • 46
- ShortGPT: Layers in Large Language Models are More Redundant Than You Expect
  Paper • 2403.03853 • Published • 66
- LLM-ABR: Designing Adaptive Bitrate Algorithms via Large Language Models
  Paper • 2404.01617 • Published • 8

- ShortGPT: Layers in Large Language Models are More Redundant Than You Expect
  Paper • 2403.03853 • Published • 66
- Revisiting In-Context Learning with Long Context Language Models
  Paper • 2412.16926 • Published • 33
- Rethinking Addressing in Language Models via Contexualized Equivariant Positional Encoding
  Paper • 2501.00712 • Published • 6

- Adapters: A Unified Library for Parameter-Efficient and Modular Transfer Learning
  Paper • 2311.11077 • Published • 29
- Tensor Product Attention Is All You Need
  Paper • 2501.06425 • Published • 89
- LoRA: Low-Rank Adaptation of Large Language Models
  Paper • 2106.09685 • Published • 46
- ShortGPT: Layers in Large Language Models are More Redundant Than You Expect
  Paper • 2403.03853 • Published • 66

- ShortGPT: Layers in Large Language Models are More Redundant Than You Expect
  Paper • 2403.03853 • Published • 66
- SliceGPT: Compress Large Language Models by Deleting Rows and Columns
  Paper • 2401.15024 • Published • 75
- Your Transformer is Secretly Linear
  Paper • 2405.12250 • Published • 159
- Yi: Open Foundation Models by 01.AI
  Paper • 2403.04652 • Published • 66

- Scaling Instruction-Finetuned Language Models
  Paper • 2210.11416 • Published • 7
- Mamba: Linear-Time Sequence Modeling with Selective State Spaces
  Paper • 2312.00752 • Published • 145
- Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context
  Paper • 2403.05530 • Published • 66
- Yi: Open Foundation Models by 01.AI
  Paper • 2403.04652 • Published • 66

- ShortGPT: Layers in Large Language Models are More Redundant Than You Expect
  Paper • 2403.03853 • Published • 66
- SLEB: Streamlining LLMs through Redundancy Verification and Elimination of Transformer Blocks
  Paper • 2402.09025 • Published • 9
- Shortened LLaMA: A Simple Depth Pruning for Large Language Models
  Paper • 2402.02834 • Published • 17
- Algorithmic progress in language models
  Paper • 2403.05812 • Published • 21

- ShortGPT: Layers in Large Language Models are More Redundant Than You Expect
  Paper • 2403.03853 • Published • 66
- SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot
  Paper • 2301.00774 • Published • 3
- The LLM Surgeon
  Paper • 2312.17244 • Published • 9
- SliceGPT: Compress Large Language Models by Deleting Rows and Columns
  Paper • 2401.15024 • Published • 75