- Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free
  Paper • 2410.10814 • Published • 52
- MiniPLM: Knowledge Distillation for Pre-Training Language Models
  Paper • 2410.17215 • Published • 17
- CompassJudger-1: All-in-one Judge Model Helps Model Evaluation and Evolution
  Paper • 2410.16256 • Published • 61
- CCI3.0-HQ: a large-scale Chinese dataset of high quality designed for pre-training large language models
  Paper • 2410.18505 • Published • 11
Collections
Collections including paper arxiv:2410.16256
- In Search of Needles in a 10M Haystack: Recurrent Memory Finds What LLMs Miss
  Paper • 2402.10790 • Published • 43
- LLM Comparator: Visual Analytics for Side-by-Side Evaluation of Large Language Models
  Paper • 2402.10524 • Published • 24
- CompassJudger-1: All-in-one Judge Model Helps Model Evaluation and Evolution
  Paper • 2410.16256 • Published • 61
- Code Llama: Open Foundation Models for Code
  Paper • 2308.12950 • Published • 27

- Instruction Following without Instruction Tuning
  Paper • 2409.14254 • Published • 31
- Baichuan Alignment Technical Report
  Paper • 2410.14940 • Published • 52
- CompassJudger-1: All-in-one Judge Model Helps Model Evaluation and Evolution
  Paper • 2410.16256 • Published • 61
- Infinity-MM: Scaling Multimodal Performance with Large-Scale and High-Quality Instruction Data
  Paper • 2410.18558 • Published • 19

- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 152
- Orion-14B: Open-source Multilingual Large Language Models
  Paper • 2401.12246 • Published • 14
- MambaByte: Token-free Selective State Space Model
  Paper • 2401.13660 • Published • 61
- MM-LLMs: Recent Advances in MultiModal Large Language Models
  Paper • 2401.13601 • Published • 49