Update README.md
README.md CHANGED
@@ -16,10 +16,6 @@ Using <a href="https://github.com/turboderp/exllamav2/releases/tag/v0.0.14">turb
 
 Each branch contains an individual bits per weight, with the main one containing only the measurement.json for further conversions.
 
-Conversion was done using the default calibration dataset.
-
-Default arguments were used, except when the bits per weight is above 6.0; in that case the lm_head layer is quantized at 8 bits per weight instead of the default 6.
-
 Original model: https://huggingface.co/TechxGenus/starcoder2-15b-instruct
 
 | Branch | Bits | lm_head bits | VRAM (4k) | VRAM (16k) | VRAM (32k) | Description |
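Since each quantization lives on its own branch, a single bits-per-weight variant can be fetched programmatically with `huggingface_hub`. The snippet below is only a sketch: the repository id and the `6_5` branch name are assumed placeholders, since neither appears in this diff.

```python
from huggingface_hub import snapshot_download

# Minimal sketch: download one quantization branch of the exl2 repo.
# repo_id and revision are placeholders -- substitute the actual repository
# and one of the branch names (bits per weight) listed in the table above.
snapshot_download(
    repo_id="user/starcoder2-15b-instruct-exl2",   # placeholder repo id
    revision="6_5",                                # placeholder branch; one bpw per branch
    local_dir="starcoder2-15b-instruct-exl2-6_5",  # where the weights are written
)
```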