Falcon-H1 Collection: Falcon-H1 family of hybrid-head language models, including 0.5B, 1.5B, 1.5B-Deep, 3B, 7B, and 34B variants (pretrained and instruction-tuned). 37 items.
Article: Falcon-H1: A Family of Hybrid-Head Language Models Redefining Efficiency and Performance. By tiiuae and 5 others.
Article: Falcon-Edge: A series of powerful, universal, fine-tunable 1.58-bit language models. By tiiuae and 9 others.
Falcon-Edge Series Collection: A series of powerful, universal, and fine-tunable small language models. 7 items.
BitNet Collection: 🔥BitNet family of large language models (1-bit LLMs). 7 items. Updated May 1.
Falcon3 Collection: The Falcon3 family of open foundation models, a set of pretrained and instruct LLMs ranging from 1B to 10B parameters. 40 items.
Paper: Falcon Mamba: The First Competitive Attention-free 7B Language Model (arXiv:2410.05355). Published Oct 7, 2024.
Article: Welcome FalconMamba: The first strong attention-free 7B model. By JingweiZuo and 5 others, Aug 12, 2024.
FalconMamba 7B Collection: Features the FalconMamba 7B base model, the instruction-tuned version, their 4-bit and GGUF variants, and the demo. 15 items.
Paper: Multitask Prompt Tuning Enables Parameter-Efficient Transfer Learning (arXiv:2303.02861). Published Mar 6, 2023.
Paper: XTTS: a Massively Multilingual Zero-Shot Text-to-Speech Model (arXiv:2406.04904). Published Jun 7, 2024.
AQLM+PV Collection: Official AQLM quantizations for "PV-Tuning: Beyond Straight-Through Estimation for Extreme LLM Compression" (https://arxiv.org/abs/2405.14852). 26 items. Updated Feb 28.
Paper: Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations (arXiv:2405.18392). Published May 28, 2024.
Article: Overview of natively supported quantization schemes in 🤗 Transformers. By ybelkada and 4 others, Sep 12, 2023.
Meta Llama 3 Collection: Hosts the Transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases. 5 items. Updated Dec 6, 2024.
Paper: Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study (arXiv:2404.10719). Published Apr 16, 2024.