---
library_name: transformers
tags:
- object-detection
- license-plate
license: apache-2.0
datasets:
- ariG23498/license-detection-paligemma
base_model:
- google/gemma-3-4b-pt
pipeline_tag: object-detection
---

# Gemma 3 4B Fine-Tuned for Object Detection

This model is a fine-tuned version of Gemma 3 4B for license plate object detection.

| Detected License Plates (Sample 1) | Detected License Plates (Sample 2) |
| :--------------------------------: | :--------------------------------: |
| ![image/png](https://cdn-uploads.huggingface.co/production/uploads/61929226ded356549e20c5da/1BgYT_F9V22ULMJ4yYEdn.png) | ![image/png](https://cdn-uploads.huggingface.co/production/uploads/61929226ded356549e20c5da/jpZjRKEfOHu5qqiYKVXp-.png) |
| ![image/png](https://cdn-uploads.huggingface.co/production/uploads/61929226ded356549e20c5da/-SrbuYyr0HvfY8vsIr_-i.png) | ![image/png](https://cdn-uploads.huggingface.co/production/uploads/61929226ded356549e20c5da/6KTp4BanHbbf3NopRn4eX.png) |

## Model Details

### Model Description

This model aims to prove that VLMs **NOT previously** trained for object detection and **without previous knowledge** of location tokens (`<locXXXX>`) can still be fine tuned for object detection out of the box. This is an experimental model.

- **Developed by:** [Aritra Roy Gosthipaty](https://huggingface.co/ariG23498) and [Sergio Paniego](https://huggingface.co/sergiopaniego)
- **Finetuned from model:** [gemma-3-4b-pt](https://huggingface.co/google/gemma-3-4b-pt)

### Model Sources

- [**Repository:**](https://github.com/ariG23498/gemma3-object-detection)
- [**HF Space:**](https://huggingface.co/spaces/ariG23498/gemma3-license-plate-detection)
- [**Collection:**](https://huggingface.co/collections/ariG23498/gemma-3-object-detection-682469cb72084d8ab22460b3)
- [**Dataset:**](https://huggingface.co/datasets/ariG23498/license-detection-paligemma)

## Uses

Follow these steps to configure, train, and run predictions (using the code repository):

1. Configuration (`config.py`): All major parameters are centralized here. Before running any script, review and adjust these settings as needed.
2. Training (`train.py`): This script handles the fine-tuning process.
3. Running inference (`infer.py`): Run this to visualize object detection.

## Citation

If you use our work, please cite us:

```
@misc{gosthipaty_gemma3_object_detection_2025,
  author = {Aritra Roy Gosthipaty and Sergio Paniego},
  title = {Fine-tuning Gemma 3 for Object Detection},
  year = {2025},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/ariG23498/gemma3-object-detection.git}}
}
```
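
## Example: Decoding Location Tokens

The Model Description refers to location tokens of the form `<locXXXX>`. As a rough illustration of how such output could be turned into pixel boxes, here is a minimal sketch. It assumes the PaliGemma-style convention, which the dataset name suggests but this card does not confirm: four tokens per box, in the order `y_min, x_min, y_max, x_max`, each an integer bin in the range 0 to 1023. The function `parse_boxes` is a hypothetical helper written for this example and is not taken from the repository.

```python
import re

# Matches tokens such as <loc0123>; the four-digit form is an assumption.
LOC_PATTERN = re.compile(r"<loc(\d{4})>")


def parse_boxes(text, width, height, bins=1024):
    """Convert location tokens in `text` to (x_min, y_min, x_max, y_max) pixel boxes.

    Assumes each box is four consecutive tokens ordered
    y_min, x_min, y_max, x_max, normalized to `bins` buckets.
    """
    values = [int(v) for v in LOC_PATTERN.findall(text)]
    boxes = []
    # Walk the values in groups of four; a trailing incomplete group is ignored.
    for i in range(0, len(values) - 3, 4):
        y_min, x_min, y_max, x_max = values[i:i + 4]
        boxes.append((
            x_min / bins * width,
            y_min / bins * height,
            x_max / bins * width,
            y_max / bins * height,
        ))
    return boxes


if __name__ == "__main__":
    sample = "<loc0256><loc0128><loc0512><loc0768> plate"
    print(parse_boxes(sample, width=1024, height=512))
```

If the model's actual output uses a different token order or bin count, only the unpacking line and the `bins` argument would need to change.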