XAttention: Block Sparse Attention with Antidiagonal Scoring Paper • 2503.16428 • Published Mar 20 • 14
🧠Reasoning datasets Collection Datasets with reasoning traces for math and code released by the community • 24 items • Updated 18 days ago • 148
DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads Paper • 2410.10819 • Published Oct 14, 2024 • 7
Efficient Streaming Language Models with Attention Sinks Paper • 2309.17453 • Published Sep 29, 2023 • 13