Collections
Collections including paper arxiv:2402.15000
- Divide-or-Conquer? Which Part Should You Distill Your LLM?
  Paper • 2402.15000 • Published • 25
- Adding NVMe SSDs to Enable and Accelerate 100B Model Fine-tuning on a Single GPU
  Paper • 2403.06504 • Published • 55
- MoAI: Mixture of All Intelligence for Large Language and Vision Models
  Paper • 2403.07508 • Published • 77
- AnyV2V: A Plug-and-Play Framework For Any Video-to-Video Editing Tasks
  Paper • 2403.14468 • Published • 27

- Can Large Language Models Understand Context?
  Paper • 2402.00858 • Published • 24
- Efficient Tool Use with Chain-of-Abstraction Reasoning
  Paper • 2401.17464 • Published • 21
- ReFT: Reasoning with Reinforced Fine-Tuning
  Paper • 2401.08967 • Published • 32
- The Impact of Reasoning Step Length on Large Language Models
  Paper • 2401.04925 • Published • 18

- Evaluating Very Long-Term Conversational Memory of LLM Agents
  Paper • 2402.17753 • Published • 20
- StructLM: Towards Building Generalist Models for Structured Knowledge Grounding
  Paper • 2402.16671 • Published • 30
- Do Large Language Models Latently Perform Multi-Hop Reasoning?
  Paper • 2402.16837 • Published • 30
- Divide-or-Conquer? Which Part Should You Distill Your LLM?
  Paper • 2402.15000 • Published • 25

- Efficient Tool Use with Chain-of-Abstraction Reasoning
  Paper • 2401.17464 • Published • 21
- Divide and Conquer: Language Models can Plan and Self-Correct for Compositional Text-to-Image Generation
  Paper • 2401.15688 • Published • 11
- SliceGPT: Compress Large Language Models by Deleting Rows and Columns
  Paper • 2401.15024 • Published • 75
- From GPT-4 to Gemini and Beyond: Assessing the Landscape of MLLMs on Generalizability, Trustworthiness and Causality through Four Modalities
  Paper • 2401.15071 • Published • 38