BHbean's Collections
Survey
MoE LLM Systems
LLM resource-constrained Inference
New LLM Algorithms
LLM Internal Mechanism
Prompt Engineering
Speculative Decoding
parallelism
KV Cache Compression
LLM reasoning systems
New LLM Algorithms
Updated Apr 4
Multi-Token Attention
Paper • 2504.00927 • Published Apr 1 • 52