metadata
library_name: transformers
tags:
- object-detection
- license-plate
license: apache-2.0
datasets:
- ariG23498/license-detection-paligemma
base_model:
- google/gemma-3-4b-pt
pipeline_tag: object-detection
Gemma 3 4B Fine-Tuned for Object Detection
This model is a fine-tuned version of Gemma 3 4B for license plate object detection.
Model Details
Model Description
This model aims to show that VLMs not previously trained for object detection, and with no prior knowledge of location tokens (<locXXXX>), can still be fine-tuned for object detection out of the box. This is an experimental model.
- Developed by: Aritra Roy Gosthipaty and Sergio Paniego
- Finetuned from model: gemma-3-4b-pt
Model Sources
Uses
Follow these steps to configure, train, and run predictions (using the code repository):
- Configuration (config.py): All major parameters are centralized here. Before running any script, review and adjust these settings as needed.
- Training (train.py): This script handles the fine-tuning process.
- Running inference (infer.py): Run this to visualize object detection.
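Because the model is fine-tuned to emit location tokens, its text output has to be decoded into pixel coordinates before boxes can be drawn. The sketch below is one way to do that and is not taken from infer.py. It assumes the PaliGemma-style convention suggested by the dataset name: four <locNNNN> tokens per box in the order y_min, x_min, y_max, x_max, each a value from 0 to 1023 that is normalized by 1024, followed by a text label. The function name parse_detections and the returned dictionary layout are hypothetical.

```python
import re

# Assumption: PaliGemma-style output, i.e. four <locNNNN> tokens per box
# (y_min, x_min, y_max, x_max, each in 0..1023) followed by a label.
# Multiple detections are assumed to be separated by ";".
BOX_PATTERN = re.compile(r"((?:<loc\d{4}>){4})\s*([^<;]*)")


def parse_detections(text, width, height):
    """Convert location tokens in `text` to pixel boxes (hypothetical helper).

    Returns a list of dicts with a "label" string and a "box" tuple
    (x_min, y_min, x_max, y_max) in pixels for an image of the given size.
    """
    detections = []
    for locs, label in BOX_PATTERN.findall(text):
        # Each token holds a bin index; dividing by 1024 gives a 0..1 fraction.
        y1, x1, y2, x2 = (int(v) / 1024 for v in re.findall(r"<loc(\d{4})>", locs))
        detections.append({
            "label": label.strip(),
            "box": (x1 * width, y1 * height, x2 * width, y2 * height),
        })
    return detections


if __name__ == "__main__":
    sample = "<loc0256><loc0128><loc0512><loc0768> plate"
    print(parse_detections(sample, width=1024, height=1024))
```

The resulting boxes use the (x_min, y_min, x_max, y_max) ordering that most drawing utilities expect. If the repository's scripts use a different token order or separator, the pattern and the unpacking order would need to change accordingly.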
Citation
If you use our work, please cite us:
@misc{gosthipaty_gemma3_object_detection_2025,
author = {Aritra Roy Gosthipaty and Sergio Paniego},
title = {Fine-tuning Gemma 3 for Object Detection},
year = {2025},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/ariG23498/gemma3-object-detection.git}}
}