Add link to paper (#1)
(4848ceb47ad401827547eb58d409745f66614b6f)
Co-authored-by: Niels Rogge <nielsr@users.noreply.huggingface.co>
README.md
CHANGED
@@ -3,13 +3,15 @@ license: mit
 ---
 
 # Sparge-attention model zoo
+
 Welcome to the Sparge-attention model zoo. This repo contains lists of hyperparameters pre-tuned for a range of models.
 
+It was presented in the paper [SpargeAttn: Accurate Sparse Attention Accelerating Any Model Inference](https://huggingface.co/papers/2502.18137).
+
 ## Naming of ckpt
 A tuned ckpt is typically named in the format `${model name or type}_${l1}_${pv_l1}.pt`; in some cases `pv_l1` is omitted when `pv` was not tuned.
 Larger `l1` and `pv_l1` values make the model more sparse, but may sacrifice output quality.
 
-
 ## Overview
 
 | model name | tuned ckpt dir |
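The checkpoint naming convention described in the README (`${model name or type}_${l1}_${pv_l1}.pt`, with `pv_l1` sometimes omitted) can be sketched as a small parser. This is an illustrative sketch only; the filenames below are hypothetical examples, not actual checkpoints from this repo.

```python
import re

def parse_ckpt_name(filename):
    """Parse a tuned-checkpoint filename of the form
    "<model name or type>_<l1>_<pv_l1>.pt".

    pv_l1 may be omitted when pv was not tuned, in which case
    None is returned for it. Returns (model, l1, pv_l1).
    """
    stem = filename[:-3] if filename.endswith(".pt") else filename
    # l1 and pv_l1 are numeric thresholds; the model name may itself
    # contain underscores, so anchor the numeric groups at the end.
    m = re.match(r"^(.*)_([0-9.]+)_([0-9.]+)$", stem)
    if m:
        return m.group(1), float(m.group(2)), float(m.group(3))
    m = re.match(r"^(.*)_([0-9.]+)$", stem)
    if m:
        return m.group(1), float(m.group(2)), None
    raise ValueError(f"unrecognized checkpoint name: {filename}")

# Hypothetical example filenames, following the stated convention:
print(parse_ckpt_name("llama3_0.06_0.07.pt"))  # ('llama3', 0.06, 0.07)
print(parse_ckpt_name("cogvideox_0.05.pt"))    # ('cogvideox', 0.05, None)
```

The greedy leading group keeps underscores inside the model name intact (e.g. `my_model_0.06_0.07.pt` still splits off the two trailing thresholds correctly).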