Update README.md (#2)
Update README.md (18a6b50947332289c28271336641e194be9298e2)
README.md CHANGED
@@ -4,6 +4,14 @@ base_model:
 - google/gemma-3-4b-it
 ---
 
-This is just a reupload of the gemma-3-4b-it-q4_0 model. It can be installed with `ollama run hf.co/kreier/gemma3`.
-
-The reason for this reupload is the conflicting login requirements between the Hugging Face website and huggingface-cli when downloading the original model. I'm logged in to both and have access, but the combination does not work.
+This is just a reupload of the **gemma-3-4b-it-qat-q4_0** model. It can be installed with `ollama run hf.co/kreier/gemma3`.
+
+The reason for this reupload is the conflicting login requirements between the Hugging Face website and huggingface-cli when downloading the original model. I'm logged in to both and have access, but the combination does not work.
+
+It differs from the general gemma3 model in that it is a 4B **instruction-tuned** version of the Gemma 3 model in GGUF format, built with **Quantization Aware Training** (QAT). The GGUF corresponds to Q4_0 quantization.
+
+See more details here: [https://huggingface.co/google/gemma-3-4b-it-qat-q4_0-gguf](https://huggingface.co/google/gemma-3-4b-it-qat-q4_0-gguf). It states:
+
+> Thanks to QAT, the model is able to preserve similar quality as `bfloat16` while significantly reducing the memory requirements to load the model.
+
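For context, the download-versus-reupload workflow this README describes would look roughly like the following shell commands (a minimal sketch; the exact GGUF filename inside the gated Google repo is an assumption):

```bash
# Log in so the CLI can access gated repos (this also requires having
# accepted the Gemma license on the model page beforehand).
huggingface-cli login

# Download the original QAT GGUF from the gated Google repo. The
# filename gemma-3-4b-it-q4_0.gguf is an assumption about its contents.
# This is the step the README reports as failing despite valid logins.
huggingface-cli download google/gemma-3-4b-it-qat-q4_0-gguf gemma-3-4b-it-q4_0.gguf

# The reupload sidesteps the issue: Ollama pulls the GGUF directly
# from the ungated mirror on the Hugging Face Hub.
ollama run hf.co/kreier/gemma3
```

The memory saving the quoted QAT note refers to is simple arithmetic: at about 4 billion parameters, bfloat16 weights alone take roughly 8 GB, while Q4_0 (about 4.5 bits per weight once block scales are included) comes to roughly 2.3 GB.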