davidberenstein1957 committed on
Commit 7592876 · verified · 1 Parent(s): 3b6084a

Add files using upload-large-folder tool

Files changed (4)
  1. README.md +5 -3
  2. config.json +1 -1
  3. generation_config.json +1 -1
  4. smash_config.json +2 -0
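
The two lines added to smash_config.json are a `factorizer` key at the top level and another under `reapply_after_load`. A minimal sketch for checking a local copy of the file for both keys follows; the helper name `missing_factorizer_keys` is hypothetical, and the sample holds only the keys visible in this diff, not the full file.

```python
import json


def missing_factorizer_keys(config: dict) -> list[str]:
    """Return the locations where the "factorizer" key is absent."""
    missing = []
    if "factorizer" not in config:
        missing.append("factorizer")
    # Treat a missing or null "reapply_after_load" as an empty mapping.
    reapply = config.get("reapply_after_load") or {}
    if "factorizer" not in reapply:
        missing.append("reapply_after_load.factorizer")
    return missing


# Fragment limited to the keys shown in this diff (an assumption: the
# real smash_config.json contains further entries).
sample = json.loads("""
{
  "batcher": null,
  "cacher": null,
  "compiler": null,
  "factorizer": null,
  "pruner": null,
  "quantizer": null,
  "batch_size": 1,
  "reapply_after_load": {
    "factorizer": null,
    "pruner": null,
    "quantizer": null,
    "cacher": null
  }
}
""")

print(missing_factorizer_keys(sample))
```

To check a downloaded file instead, load it with `json.load(open("smash_config.json"))` and pass the result to the same helper; an empty list means both keys are present.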
README.md CHANGED
@@ -4,7 +4,7 @@ tags:
  - pruna-ai
  ---
 
- # Model Card for PrunaAI/test-tiny-random-llama4-smashed
+ # Model Card for PrunaAI/test-load-tiny-random-llama4-smashed
 
  This model was created using the [pruna](https://github.com/PrunaAI/pruna) library. Pruna is a model optimization framework built for developers, enabling you to deliver more efficient models with minimal implementation overhead.
 
@@ -16,7 +16,7 @@ First things first, you need to install the pruna library:
  pip install pruna
  ```
 
- You can [use the transformers library to load the model](https://huggingface.co/PrunaAI/test-tiny-random-llama4-smashed?library=transformers) but this might not include all optimizations by default.
+ You can [use the transformers library to load the model](https://huggingface.co/PrunaAI/test-load-tiny-random-llama4-smashed?library=transformers) but this might not include all optimizations by default.
 
  To ensure that all optimizations are applied, use the pruna library to load the model using the following code:
 
@@ -24,7 +24,7 @@ To ensure that all optimizations are applied, use the pruna library to load the
  from pruna import PrunaModel
 
  loaded_model = PrunaModel.from_hub(
- "PrunaAI/test-tiny-random-llama4-smashed"
+ "PrunaAI/test-load-tiny-random-llama4-smashed"
  )
  ```
 
@@ -39,6 +39,7 @@ The compression configuration of the model is stored in the `smash_config.json`
  "batcher": null,
  "cacher": null,
  "compiler": null,
+ "factorizer": null,
  "pruner": null,
  "quantizer": null,
  "batch_size": 1,
@@ -48,6 +49,7 @@ The compression configuration of the model is stored in the `smash_config.json`
  "transformers"
  ],
  "reapply_after_load": {
+ "factorizer": null,
  "pruner": null,
  "quantizer": null,
  "cacher": null,
config.json CHANGED
@@ -59,7 +59,7 @@
  "router_jitter_noise": 0.0,
  "tie_word_embeddings": false,
  "torch_dtype": "float32",
- "transformers_version": "4.51.3",
+ "transformers_version": "4.52.4",
  "use_cache": true,
  "use_qk_norm": true,
  "vocab_size": 202048
generation_config.json CHANGED
@@ -7,5 +7,5 @@
  200008
  ],
  "pad_token_id": 200018,
- "transformers_version": "4.51.3"
+ "transformers_version": "4.52.4"
  }
smash_config.json CHANGED
@@ -2,6 +2,7 @@
  "batcher": null,
  "cacher": null,
  "compiler": null,
+ "factorizer": null,
  "pruner": null,
  "quantizer": null,
  "batch_size": 1,
@@ -11,6 +12,7 @@
  "transformers"
  ],
  "reapply_after_load": {
+ "factorizer": null,
  "pruner": null,
  "quantizer": null,
  "cacher": null,