Validation of Performance Optimization and Model Compression (#83)
opened by shubh2ds

Performance-Optimization-and-Model-Compression (ADDED)
I used bert-base-uncased to evaluate how well the model holds up under distillation. It compressed very well, retaining strong performance in terms of accuracy, speed, and size.

## Observations from the experiment:

- Base model (FP32): 110M params, Accuracy: 0.893
- Distilled model (roughly half the size): 52.8M params, Accuracy: 0.924
- Quantized model (2.8× smaller than base): 38.8M params, Accuracy: 0.922

Details are below:

I trained two classifiers: a teacher (same size and accuracy as the base model) and a student (half the size, with accuracy about 3% higher than the base model).
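For reference, the usual distillation objective trains the student against a blend of the teacher's temperature-softened output distribution and the hard labels. This is a generic, dependency-free sketch of that loss, not the exact training code used in this PR; the temperature `T` and blend weight `alpha` are assumed hyperparameters:

```python
import math

def softmax(logits, T=1.0):
    """Temperature-scaled softmax over a list of logits."""
    exps = [math.exp(z / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, label, T=2.0, alpha=0.5):
    """Blend of soft-target KL divergence and hard-label cross-entropy."""
    p = softmax(teacher_logits, T)   # softened teacher distribution
    q = softmax(student_logits, T)   # softened student distribution
    # KL(teacher || student), scaled by T^2 as in Hinton et al.'s formulation
    soft = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q)) * T * T
    # Standard cross-entropy of the student against the true label
    hard = -math.log(softmax(student_logits)[label])
    return alpha * soft + (1 - alpha) * hard

# Toy 2-class case (phishing vs. benign); the logits here are made up
loss = distillation_loss([2.0, -1.0], [1.5, -0.5], label=0)
```

A higher `T` spreads the teacher's probability mass across classes, giving the student a richer training signal than the one-hot labels alone.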

- Teacher (validation): Accuracy: 0.8933, Precision: 0.9078, Recall: 0.8756, F1: 0.8914
- Student (validation): Accuracy: 0.9244, Precision: 0.9704, Recall: 0.8756, F1: 0.9206
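The validation numbers above follow the standard binary-classification definitions. A minimal sketch of how they are computed (the labels and predictions below are made up for illustration):

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, precision, recall, f1

# Made-up example, just to show the shape of the output
acc, prec, rec, f1 = classification_metrics([1, 1, 0, 0, 1], [1, 0, 0, 0, 1])
```

Note that the teacher and student share the same recall (0.8756); the student's gain comes almost entirely from higher precision.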

Further, I quantized the student model to 4 bits with almost no loss in performance:

- Pre-quantization: Accuracy: 0.9244, Precision: 0.9704, Recall: 0.8756, F1: 0.9206
- Post-quantization: Accuracy: 0.9222, Precision: 0.9703, Recall: 0.8711, F1: 0.9180
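In practice, 4-bit quantization of a transformers model is usually done through a library (e.g. bitsandbytes' NF4 via `BitsAndBytesConfig`). Purely to illustrate the underlying idea, here is a minimal symmetric int4 round-trip; the function names and weights are illustrative, not from the actual pipeline:

```python
def quantize_4bit(weights):
    """Symmetric 4-bit quantization: map each float to an integer in [-7, 7]."""
    scale = max(abs(w) for w in weights) / 7.0 or 1.0  # guard against all-zero input
    codes = [max(-7, min(7, round(w / scale))) for w in weights]
    return codes, scale

def dequantize(codes, scale):
    """Recover approximate float weights from the int4 codes."""
    return [c * scale for c in codes]

# Round-trip a few made-up weights; reconstruction error is at most scale / 2,
# which is why accuracy drops only slightly (0.9244 -> 0.9222 here).
codes, scale = quantize_4bit([0.5, -1.2, 0.03, 0.7])
restored = dequantize(codes, scale)
```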

Conclusion: model parameters were reduced by a factor of 2.8 (110M → 38.8M) while accuracy increased by about 3% over the original base model.

GPU: Tesla T4
Dataset: shubh2ds/data-phishing-site-clf
Consumption: 
Final Model: https://huggingface.co/shubh2ds/bert-base-uncased-phishing-classifier_student_4bit
https://wandb.ai/shubh2ds/huggingface/runs/yt0pcw58/workspace?nw=nwusershubh2ds