thank you for GGUF!

#1
by jacek2024 - opened

It’s really nice to have GGUF available from IBM.

IBM Granite org

You're welcome! We've heard the signal on the confusion for GGUFs, so we'll now be co-locating official GGUFs here in the ibm-granite org under this collection.

Hey how do you enable thinking using Ollama, LMStudio, etc?

IBM Granite org

Hi @RougueSpud ! There are several ways to enable thinking in Ollama and LM Studio, but only one of them works today:

  1. Using these GGUFs which don't contain the official Ollama chat template, you would need to replicate the logic in the official chat template on the client side to enable thinking (adding the requisite system prompt section here)
  2. If you are using the official models from Ollama, they come with a chat template that supports enabling thinking with a special element to the messages field when making an API call with the following format: {"role": "control", "content": "thinking"}. This, unfortunately, is not accessible through the CLI
  3. Ollama just introduced a new thinking capability in in 0.9.0. This will require some special templating in the chat template to get it to work correctly for Granite. I'm actively working on this for the official Granite models, but it isn't done yet.

At the moment, there isn't a systematic way to use thinking through LM Studio without doing client-side system prompt construction (option [1] above).

IBM Granite org

I've now got updated template versions for Ollama that allow the built-in "think" capability to work. They're pushed to my personal staging account (gabegoodhart/granite3.2, gabegoodhart/granite3.3) while we work to get them on the official library. You can try it out as follows:

ollama pull gabegoodhart/granite3.3
ollama run gabegoodhart/granite3.3 --think "What's the best way to visit all of my clients in my sales region?"

Sign up or log in to comment