---
tags:
- uqff
- mistral.rs
base_model: meta-llama/Llama-4-Scout-17B-16E-Instruct
base_model_relation: quantized
---

# `meta-llama/Llama-4-Scout-17B-16E-Instruct`, UQFF quantization

Run with [mistral.rs](https://github.com/EricLBuehler/mistral.rs). Documentation: [UQFF docs](https://github.com/EricLBuehler/mistral.rs/blob/master/docs/UQFF.md).

1) **Flexible** 🌀: Multiple quantization formats in *one* file format with *one* framework to run them all.
2) **Reliable** 🔒: Compatibility ensured with *embedded* and *checked* semantic versioning information from day 1.
3) **Easy** 🤗: Download UQFF models *easily* and *quickly* from Hugging Face, or use a local file.
4) **Customizable** 🛠️: Make and publish your own UQFF files in minutes.

## Examples

Note: If you are using an Apple Silicon device (on Metal), prefer an 🔥 AFQ quantization for best performance!

|Quantization type(s)|Example|
|--|--|
|Q4K|`./mistralrs-server -i vision-plain -m EricB/Llama-4-Scout-17B-16E-Instruct-UQFF -a llama4 --from-uqff "llama4-scout-instruct-q4k-0.uqff;llama4-scout-instruct-q4k-1.uqff;llama4-scout-instruct-q4k-2.uqff;llama4-scout-instruct-q4k-3.uqff;llama4-scout-instruct-q4k-4.uqff;llama4-scout-instruct-q4k-5.uqff;llama4-scout-instruct-q4k-6.uqff"`|
|AFQ4|Coming soon!|
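For the "use a local file" path mentioned above, one option is to pre-download the UQFF shards and point `--from-uqff` at the local copies. This is a sketch, not an official recipe: it assumes you have the `huggingface_hub` CLI installed (`pip install -U huggingface_hub`), and the `./llama4-scout-uqff` directory name is arbitrary.

```shell
# Download only the Q4K UQFF shards for this repo into a local directory.
huggingface-cli download EricB/Llama-4-Scout-17B-16E-Instruct-UQFF \
  --include "llama4-scout-instruct-q4k-*.uqff" \
  --local-dir ./llama4-scout-uqff

# Run with the same flags as the table above, but with local shard paths
# (list all seven shards, separated by semicolons, as in the Q4K example).
./mistralrs-server -i vision-plain \
  -m EricB/Llama-4-Scout-17B-16E-Instruct-UQFF -a llama4 \
  --from-uqff "./llama4-scout-uqff/llama4-scout-instruct-q4k-0.uqff;./llama4-scout-uqff/llama4-scout-instruct-q4k-1.uqff;./llama4-scout-uqff/llama4-scout-instruct-q4k-2.uqff;./llama4-scout-uqff/llama4-scout-instruct-q4k-3.uqff;./llama4-scout-uqff/llama4-scout-instruct-q4k-4.uqff;./llama4-scout-uqff/llama4-scout-instruct-q4k-5.uqff;./llama4-scout-uqff/llama4-scout-instruct-q4k-6.uqff"
```

Pre-downloading is mainly useful on machines without direct Hub access at serve time, or when you want the shards cached at a known path.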