GGUF not supported?
#2
by
bdambrosio
- opened
vllm 0.8.3:

```
raise ValueError(f"GGUF model with architecture {architecture} is not supported yet.")
ValueError: GGUF model with architecture llama4 is not supported yet.
```
Looks like they're still working on it (glancing at the vllm GitHub).
llama.cpp (https://github.com/ggml-org/llama.cpp) works, and presumably lmstudio does too (since they've released this quant here).
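For anyone curious, the error above comes from a simple allowlist-style check: vllm maps GGUF architecture strings to its supported model classes and raises if the architecture isn't in the map. A minimal sketch of that kind of check (the set contents and function name here are illustrative, not vllm's actual internals):

```python
# Hypothetical sketch of the check behind the error message above.
# SUPPORTED_GGUF_ARCHITECTURES and check_gguf_architecture are
# illustrative names, not vllm's real internals.

SUPPORTED_GGUF_ARCHITECTURES = {"llama", "qwen2", "phi3"}  # example set


def check_gguf_architecture(architecture: str) -> None:
    """Raise if the GGUF architecture string isn't recognized."""
    if architecture not in SUPPORTED_GGUF_ARCHITECTURES:
        raise ValueError(
            f"GGUF model with architecture {architecture} is not supported yet."
        )


check_gguf_architecture("llama")  # passes silently

try:
    check_gguf_architecture("llama4")
except ValueError as e:
    print(e)  # GGUF model with architecture llama4 is not supported yet.
```

So until vllm adds "llama4" to that mapping, the loader will refuse this quant regardless of the file itself being valid.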