Collections

Discover the best community collections!

Collections including paper arxiv:2403.13187

Can LLMs Follow Simple Rules?
Paper • 2311.04235 • Published Nov 6, 2023 • 14

The Unreasonable Ineffectiveness of the Deeper Layers
Paper • 2403.17887 • Published Mar 26, 2024 • 82

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
Paper • 2403.03507 • Published Mar 6, 2024 • 189

Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models
Paper • 2402.17177 • Published Feb 27, 2024 • 89

A Zero-Shot Language Agent for Computer Control with Structured Reflection
Paper • 2310.08740 • Published Oct 12, 2023 • 16

AgentTuning: Enabling Generalized Agent Abilities for LLMs
Paper • 2310.12823 • Published Oct 19, 2023 • 36

AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors
Paper • 2308.10848 • Published Aug 21, 2023 • 1

CLEX: Continuous Length Extrapolation for Large Language Models
Paper • 2310.16450 • Published Oct 25, 2023 • 10

Ultra-Long Sequence Distributed Transformer
Paper • 2311.02382 • Published Nov 4, 2023 • 6

Ziya2: Data-centric Learning is All LLMs Need
Paper • 2311.03301 • Published Nov 6, 2023 • 20

Relax: Composable Abstractions for End-to-End Dynamic Machine Learning
Paper • 2311.02103 • Published Nov 1, 2023 • 22

Extending Context Window of Large Language Models via Semantic Compression
Paper • 2312.09571 • Published Dec 15, 2023 • 15

Model merging is a very popular technique for LLMs nowadays. Here is a chronological list of papers in this space that will help you get started with it!

Qualitatively characterizing neural network optimization problems
Paper • 1412.6544 • Published Dec 19, 2014 • 4

Convergent Learning: Do different neural networks learn the same representations?
Paper • 1511.07543 • Published Nov 24, 2015 • 2

Mixout: Effective Regularization to Finetune Large-scale Pretrained Language Models
Paper • 1909.11299 • Published Sep 25, 2019 • 2

Model Fusion via Optimal Transport
Paper • 1910.05653 • Published Oct 12, 2019 • 1
