GGUF not supported?

#2 by bdambrosio

vLLM 0.8.3:

raise ValueError(f"GGUF model with architecture {architecture} is not supported yet.")
ValueError: GGUF model with architecture llama4 is not supported yet.
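For anyone trying to reproduce: this is roughly the load path that trips it. A minimal sketch, assuming a local GGUF file (path and tokenizer repo below are placeholders, not the actual quant from this repo); vLLM's GGUF support wants the original model's tokenizer passed alongside the .gguf file:

```python
# Minimal repro sketch -- model path and tokenizer repo are placeholders.
from vllm import LLM

llm = LLM(
    model="/path/to/model.gguf",        # placeholder: local GGUF file
    tokenizer="original-model-repo",    # placeholder: GGUF loads usually pass the source tokenizer
)
# On vLLM 0.8.3 this raises during model resolution:
#   ValueError: GGUF model with architecture llama4 is not supported yet.
```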

Looks like they're still working on it (glancing at the vLLM GitHub).

llama.cpp (https://github.com/ggml-org/llama.cpp) works, and presumably LM Studio does too (since they released this quant here).
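If you need to serve it now, llama.cpp's Python bindings are a workable stopgap until vLLM lands llama4 GGUF support. A minimal sketch, assuming a llama-cpp-python build recent enough to include Llama 4 support (the model path is a placeholder):

```python
# Stopgap via llama-cpp-python; model path is a placeholder.
from llama_cpp import Llama

llm = Llama(
    model_path="/path/to/model.gguf",  # placeholder: local GGUF file
    n_gpu_layers=-1,                   # offload as many layers as fit to the GPU
    n_ctx=8192,                        # context window; adjust to taste
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Hello!"}],
    max_tokens=64,
)
print(out["choices"][0]["message"]["content"])
```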
