Add files using upload-large-folder tool

Files changed (4)

README.md CHANGED

@@ -4,7 +4,7 @@ tags:
 - pruna-ai
 ---
-# Model Card for PrunaAI/test-tiny-random-llama4-smashed
 This model was created using the [pruna](https://github.com/PrunaAI/pruna) library. Pruna is a model optimization framework built for developers, enabling you to deliver more efficient models with minimal implementation overhead.
@@ -16,7 +16,7 @@ First things first, you need to install the pruna library:
 pip install pruna
 ```
-You can [use the transformers library to load the model](https://huggingface.co/PrunaAI/test-tiny-random-llama4-smashed?library=transformers) but this might not include all optimizations by default.
 To ensure that all optimizations are applied, use the pruna library to load the model using the following code:
@@ -24,7 +24,7 @@ To ensure that all optimizations are applied, use the pruna library to load the
 from pruna import PrunaModel
 loaded_model = PrunaModel.from_hub(
-    "PrunaAI/test-tiny-random-llama4-smashed"
 )
 ```
@@ -39,6 +39,7 @@ The compression configuration of the model is stored in the `smash_config.json`
     "batcher": null,
     "cacher": null,
     "compiler": null,
     "pruner": null,
     "quantizer": null,
     "batch_size": 1,
@@ -48,6 +49,7 @@ The compression configuration of the model is stored in the `smash_config.json`
         "transformers"
     ],
     "reapply_after_load": {
         "pruner": null,
         "quantizer": null,
         "cacher": null,

 - pruna-ai
 ---
+# Model Card for PrunaAI/test-load-tiny-random-llama4-smashed
 This model was created using the [pruna](https://github.com/PrunaAI/pruna) library. Pruna is a model optimization framework built for developers, enabling you to deliver more efficient models with minimal implementation overhead.
 pip install pruna
 ```
+You can [use the transformers library to load the model](https://huggingface.co/PrunaAI/test-load-tiny-random-llama4-smashed?library=transformers) but this might not include all optimizations by default.
 To ensure that all optimizations are applied, use the pruna library to load the model using the following code:
 from pruna import PrunaModel
 loaded_model = PrunaModel.from_hub(
+    "PrunaAI/test-load-tiny-random-llama4-smashed"
 )
 ```
     "batcher": null,
     "cacher": null,
     "compiler": null,
+    "factorizer": null,
     "pruner": null,
     "quantizer": null,
     "batch_size": 1,
         "transformers"
     ],
     "reapply_after_load": {
+        "factorizer": null,
         "pruner": null,
         "quantizer": null,
         "cacher": null,
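
The README hunks above show every algorithm slot in the embedded `smash_config.json` set to `null`. As a minimal sketch, the snippet below parses only the keys visible in the new side of the first hunk and lists which slots are active. The `active_algorithms` helper and the `ALGORITHM_SLOTS` tuple are hypothetical names introduced for illustration, not part of the pruna API, and the real file contains more fields than shown here.

```python
import json

# Only the keys visible in the hunk above; the real smash_config.json has more fields.
fragment = """
{
    "batcher": null,
    "cacher": null,
    "compiler": null,
    "factorizer": null,
    "pruner": null,
    "quantizer": null,
    "batch_size": 1
}
"""

# Slot names taken from the diff; treating them as "algorithm slots" is an assumption.
ALGORITHM_SLOTS = ("batcher", "cacher", "compiler", "factorizer", "pruner", "quantizer")


def active_algorithms(config):
    """Return the slots whose value is not null (hypothetical helper)."""
    return [slot for slot in ALGORITHM_SLOTS if config.get(slot) is not None]


config = json.loads(fragment)
active = active_algorithms(config)
print(active)
```

For the fragment shown, no slot is set, so the list is empty.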

config.json CHANGED

@@ -59,7 +59,7 @@
   "router_jitter_noise": 0.0,
   "tie_word_embeddings": false,
   "torch_dtype": "float32",
-  "transformers_version": "4.51.3",
   "use_cache": true,
   "use_qk_norm": true,
   "vocab_size": 202048

   "router_jitter_noise": 0.0,
   "tie_word_embeddings": false,
   "torch_dtype": "float32",
+  "transformers_version": "4.52.4",
   "use_cache": true,
   "use_qk_norm": true,
   "vocab_size": 202048

generation_config.json CHANGED

@@ -7,5 +7,5 @@
     200008
   ],
   "pad_token_id": 200018,
-  "transformers_version": "4.51.3"
 }

     200008
   ],
   "pad_token_id": 200018,
+  "transformers_version": "4.52.4"
 }
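
Both `config.json` and `generation_config.json` move `transformers_version` from `4.51.3` to `4.52.4`. The sketch below compares the two strings numerically, assuming plain `major.minor.patch` versions with no pre-release suffixes. `parse_version` is a hypothetical helper written for this example, not a library function.

```python
def parse_version(text):
    """Split a plain 'major.minor.patch' string into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


old_version = parse_version("4.51.3")
new_version = parse_version("4.52.4")

# Tuples compare element by element, so this is a numeric comparison.
is_upgrade = new_version > old_version

# Same major version, higher minor version.
is_minor_bump = old_version[0] == new_version[0] and new_version[1] > old_version[1]

print(is_upgrade, is_minor_bump)
```

Comparing tuples rather than raw strings avoids lexical ordering mistakes such as `"4.9.0" > "4.10.0"`.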

smash_config.json CHANGED

@@ -2,6 +2,7 @@
     "batcher": null,
     "cacher": null,
     "compiler": null,
     "pruner": null,
     "quantizer": null,
     "batch_size": 1,
@@ -11,6 +12,7 @@
         "transformers"
     ],
     "reapply_after_load": {
         "pruner": null,
         "quantizer": null,
         "cacher": null,

     "batcher": null,
     "cacher": null,
     "compiler": null,
+    "factorizer": null,
     "pruner": null,
     "quantizer": null,
     "batch_size": 1,
         "transformers"
     ],
     "reapply_after_load": {
+        "factorizer": null,
         "pruner": null,
         "quantizer": null,
         "cacher": null,
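
The only change to `smash_config.json` is the new `factorizer` key, added at the top level and inside `reapply_after_load`. As a rough sketch, the snippet below recovers that difference from the key lists visible in the two hunks. The `added_keys` helper is hypothetical, and the lists cover only the lines shown in the diff, not the full file.

```python
def added_keys(old, new):
    """Return keys present in `new` but not in `old`, keeping `new`'s order."""
    old_set = set(old)
    return [key for key in new if key not in old_set]


# Keys visible in the first hunk (top level), old side then new side.
old_top = ["batcher", "cacher", "compiler", "pruner", "quantizer", "batch_size"]
new_top = ["batcher", "cacher", "compiler", "factorizer", "pruner", "quantizer", "batch_size"]

# Keys visible in the second hunk (inside "reapply_after_load").
old_reapply = ["pruner", "quantizer", "cacher"]
new_reapply = ["factorizer", "pruner", "quantizer", "cacher"]

top_added = added_keys(old_top, new_top)
reapply_added = added_keys(old_reapply, new_reapply)

print(top_added, reapply_added)
```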