GGUF not supported?
#2
by
bdambrosio
- opened
vllm 0.8.3:

```
raise ValueError(f"GGUF model with architecture {architecture} is not supported yet.")
ValueError: GGUF model with architecture llama4 is not supported yet.
```
Looks like they're still working on it (glancing at the vllm GitHub).
llama.cpp (https://github.com/ggml-org/llama.cpp) works, and presumably lmstudio does too (since they've released this quant here).
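For anyone curious, the error above comes from a simple allowlist-style check: vllm maps GGUF architecture strings to its supported model classes and raises if the architecture isn't in the map. A minimal sketch of that kind of check (the set contents and function name here are illustrative, not vllm's actual internals):

```python
# Hypothetical sketch of the check behind the error message above.
# SUPPORTED_GGUF_ARCHITECTURES and check_gguf_architecture are
# illustrative names, not vllm's real internals.

SUPPORTED_GGUF_ARCHITECTURES = {"llama", "qwen2", "phi3"}  # example set


def check_gguf_architecture(architecture: str) -> None:
    """Raise if the GGUF architecture string isn't recognized."""
    if architecture not in SUPPORTED_GGUF_ARCHITECTURES:
        raise ValueError(
            f"GGUF model with architecture {architecture} is not supported yet."
        )


check_gguf_architecture("llama")  # passes silently

try:
    check_gguf_architecture("llama4")
except ValueError as e:
    print(e)  # GGUF model with architecture llama4 is not supported yet.
```

So until vllm adds "llama4" to that mapping, the loader will refuse this quant regardless of the file itself being valid.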