metadata

library_name: transformers
tags:
  - object-detection
  - license-plate
license: apache-2.0
datasets:
  - ariG23498/license-detection-paligemma
base_model:
  - google/gemma-3-4b-pt
pipeline_tag: object-detection

Gemma 3 4B Fine-Tuned for Object Detection

This model is a fine-tuned version of Gemma 3 4B for license plate object detection.

[Images: Detected License Plates (Sample 1) | Detected License Plates (Sample 2)]

Model Details

Model Description

This model aims to show that VLMs that were not previously trained for object detection, and that have no prior knowledge of location tokens (<locXXXX>), can still be fine-tuned for object detection out of the box. This is an experimental model.

Developed by: Aritra Roy Gosthipaty and Sergio Paniego
Finetuned from model: gemma-3-4b-pt
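
The location tokens mentioned in the description can be illustrated with a small sketch. It assumes the PaliGemma-style convention suggested by the dataset name: each box is written as four <locXXXX> tokens in the order y_min, x_min, y_max, x_max, with each value quantized into 1024 bins over the normalized image axis. The helper names `box_to_loc_tokens` and `loc_tokens_to_boxes` are hypothetical and are not taken from the code repository.

```python
import re

# Assumption: PaliGemma-style convention, where each box is four <locXXXX>
# tokens in the order y_min, x_min, y_max, x_max, each an integer bin in
# [0, 1023] over the normalized image axis.
LOC_PATTERN = re.compile(r"<loc(\d{4})><loc(\d{4})><loc(\d{4})><loc(\d{4})>")
NUM_BINS = 1024


def box_to_loc_tokens(box, width, height):
    """Encode a pixel-space (x_min, y_min, x_max, y_max) box as location tokens."""
    x_min, y_min, x_max, y_max = box
    normalized = (y_min / height, x_min / width, y_max / height, x_max / width)
    # Clamp so that a coordinate on the far edge still maps to a valid bin.
    bins = [min(int(v * NUM_BINS), NUM_BINS - 1) for v in normalized]
    return "".join(f"<loc{b:04d}>" for b in bins)


def loc_tokens_to_boxes(text, width, height):
    """Decode every four-token group in `text` into pixel-space boxes."""
    boxes = []
    for match in LOC_PATTERN.finditer(text):
        y_min, x_min, y_max, x_max = (int(g) / NUM_BINS for g in match.groups())
        boxes.append((x_min * width, y_min * height, x_max * width, y_max * height))
    return boxes


# Round trip on a 512x256 image (hypothetical values).
tokens = box_to_loc_tokens((128, 64, 384, 192), width=512, height=256)
boxes = loc_tokens_to_boxes(tokens + " plate", width=512, height=256)
```

Under these assumptions, a model that has never seen such tokens has to learn both the token vocabulary and the coordinate ordering during fine-tuning, which is what makes the experiment described above non-trivial.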

Model Sources

Repository: https://github.com/ariG23498/gemma3-object-detection.git

Uses

To configure, train, and run predictions, use the scripts in the code repository:

Configuration (config.py): All major parameters are centralized here. Review and adjust these settings before running any script.
Training (train.py): This script handles the fine-tuning process.
Running inference (infer.py): Run this script to visualize the model's detections.
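
As a rough illustration of what centralizing parameters in one file can look like, here is a minimal sketch using a dataclass. The class name `Configuration`, its field names, and the training values are hypothetical placeholders, not the actual contents of config.py; only the dataset and base model identifiers come from this model card's metadata.

```python
from dataclasses import dataclass


@dataclass
class Configuration:
    # Identifiers taken from this model card's metadata.
    dataset_id: str = "ariG23498/license-detection-paligemma"
    model_id: str = "google/gemma-3-4b-pt"
    # Hypothetical training settings: placeholder values, not the ones
    # used to train this model.
    batch_size: int = 8
    learning_rate: float = 2e-5
    epochs: int = 2


# Default settings, as a training or inference script might load them.
cfg = Configuration()

# Override individual fields for a quick experiment without editing defaults.
debug_cfg = Configuration(batch_size=1, epochs=1)
```

Keeping every setting in one object like this means the training and inference scripts read the same values, so a change made once applies to both.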

Citation

If you use our work, please cite us:

@misc{gosthipaty_gemma3_object_detection_2025,
  author = {Aritra Roy Gosthipaty and Sergio Paniego},
  title = {Fine-tuning Gemma 3 for Object Detection},
  year = {2025},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/ariG23498/gemma3-object-detection.git}}
}