---
license: apache-2.0
datasets:
- VinayHajare/Marathi-Sign-Language
language:
- en
base_model:
- google/siglip2-base-patch16-224
pipeline_tag: image-classification
library_name: transformers
tags:
- Marathi-Sign-Language-Detection
- SigLIP2
- 93M
---

# Marathi-Sign-Language-Detection

> Marathi-Sign-Language-Detection is an image classifier fine-tuned from the vision-language encoder google/siglip2-base-patch16-224 for multi-class image classification. It recognizes Marathi sign language hand gestures and maps each one to one of 43 Devanagari characters using the `SiglipForImageClassification` architecture.

```text
Classification Report:
              precision    recall  f1-score   support

           अ     0.9881    0.9911    0.9896      1009
           आ     0.9926    0.9237    0.9569      1022
           इ     0.8132    0.9609    0.8809      1101
           ई     0.9424    0.8894    0.9151      1103
           उ     0.9477    0.9073    0.9271      1198
           ऊ     0.9436    1.0000    0.9710      1071
           ए     0.9153    0.9378    0.9264      1141
           ऐ     0.7790    0.8871    0.8295      1089
           ओ     0.9188    0.9581    0.9381      1075
           औ     1.0000    0.9226    0.9598      1021
           क     0.9566    0.9160    0.9358      1083
          क्ष     0.9287    0.9667    0.9473      1200
           ख     0.9913    1.0000    0.9956      1140
           ग     0.9753    0.9982    0.9866      1109
           घ     0.8398    0.7908    0.8146      1200
           च     0.9388    0.9016    0.9198      1158
           छ     0.9764    0.8127    0.8870      1169
           ज     0.9599    0.9967    0.9779      1200
          ज्ञ     0.9878    0.9483    0.9677      1200
           झ     0.9939    0.9567    0.9749      1200
           ट     0.8917    0.8992    0.8954      1200
           ठ     0.9075    0.8425    0.8738      1200
           ड     0.9354    0.9900    0.9619      1200
           ढ     0.8616    0.9025    0.8816      1200
           ण     0.9114    0.9425    0.9267      1200
           त     0.9280    0.9025    0.9151      1200
           थ     0.9388    0.9717    0.9550      1200
           द     0.8648    0.9275    0.8951      1200
           ध     0.9876    0.9917    0.9896      1200
           न     0.7256    0.8967    0.8021      1200
           प     0.9991    0.9683    0.9835      1200
           फ     0.8909    0.8575    0.8739      1200
           ब     0.9814    0.7917    0.8764      1200
           भ     0.9758    0.8383    0.9018      1200
           म     0.8121    0.8142    0.8132      1200
           य     0.5726    0.9133    0.7039      1200
           र     0.7635    0.7339    0.7484      1210
           ल     0.9239    0.8800    0.9014      1200
           ळ     0.8950    0.7533    0.8181      1200
           व     0.9597    0.7542    0.8446      1200
           श     0.8829    0.8667    0.8747      1200
           स     0.8449    0.8758    0.8601      1200
           ह     0.9604    0.8883    0.9229      1200

    accuracy                         0.9027     50099
   macro avg     0.9117    0.9039    0.9051     50099
weighted avg     0.9107    0.9027    0.9040     50099
```
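
To read the report's columns, recall that the F1 score is the harmonic mean of precision and recall. The following is a minimal sketch that recomputes F1 for three rows copied from the report. The helper name `f1_score` is chosen here for illustration and is not part of the model's code. Because the reported precision and recall are rounded to four decimals, the comparison allows a small tolerance.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from three rows of the report
rows = {
    "अ": (0.9881, 0.9911, 0.9896),
    "न": (0.7256, 0.8967, 0.8021),
    "य": (0.5726, 0.9133, 0.7039),
}

for label, (p, r, reported) in rows.items():
    computed = f1_score(p, r)
    # The inputs are rounded, so compare with a tolerance rather than exactly.
    assert abs(computed - reported) < 1e-3
    print(f"{label}: computed F1 = {computed:.4f}, reported F1 = {reported:.4f}")
```

The row for य shows why F1 is worth reading alongside recall: recall is high (0.9133), but low precision (0.5726) means many other signs are predicted as य, which pulls F1 down to 0.7039.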

---

## Label Space: 43 Classes

The model classifies a hand sign into one of the following 43 Marathi characters:

```json
"id2label": {
  "0": "अ", "1": "आ", "2": "इ", "3": "ई", "4": "उ", "5": "ऊ",
  "6": "ए", "7": "ऐ", "8": "ओ", "9": "औ", "10": "क", "11": "क्ष",
  "12": "ख", "13": "ग", "14": "घ", "15": "च", "16": "छ", "17": "ज",
  "18": "ज्ञ", "19": "झ", "20": "ट", "21": "ठ", "22": "ड", "23": "ढ",
  "24": "ण", "25": "त", "26": "थ", "27": "द", "28": "ध", "29": "न",
  "30": "प", "31": "फ", "32": "ब", "33": "भ", "34": "म", "35": "य",
  "36": "र", "37": "ल", "38": "ळ", "39": "व", "40": "श", "41": "स", "42": "ह"
}
```
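
One detail of this mapping is easy to trip over: JSON object keys are always strings, while a classifier's predicted class index is an integer. The sketch below, which uses only the first three entries for brevity, shows one way to convert the keys and build the reverse lookup. The variable names `raw`, `label2id`, and `predicted_index` are illustrative assumptions, not names from the model's code.

```python
import json

# A shortened copy of the mapping (first three entries only), as JSON text.
raw = '{"0": "अ", "1": "आ", "2": "इ"}'

# Convert the string keys to integers so class indices can be used directly.
id2label = {int(k): v for k, v in json.loads(raw).items()}

# Reverse lookup: Devanagari character -> class index.
label2id = {v: k for k, v in id2label.items()}

predicted_index = 2  # stand-in for the argmax of the model's logits
print(id2label[predicted_index])  # इ
print(label2id["आ"])  # 1
```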

---

## Install Dependencies

```bash
pip install -q transformers torch pillow gradio
```

---

## Inference Code

```python
import gradio as gr
from transformers import AutoImageProcessor, SiglipForImageClassification
from PIL import Image
import torch

# Load model and processor
model_name = "prithivMLmods/Marathi-Sign-Language-Detection"  # Hub ID; replace with a local path if needed
model = SiglipForImageClassification.from_pretrained(model_name)
processor = AutoImageProcessor.from_pretrained(model_name)

# Marathi label mapping (class index as a string -> Devanagari character)
id2label = {
    "0": "अ", "1": "आ", "2": "इ", "3": "ई", "4": "उ", "5": "ऊ",
    "6": "ए", "7": "ऐ", "8": "ओ", "9": "औ", "10": "क", "11": "क्ष",
    "12": "ख", "13": "ग", "14": "घ", "15": "च", "16": "छ", "17": "ज",
    "18": "ज्ञ", "19": "झ", "20": "ट", "21": "ठ", "22": "ड", "23": "ढ",
    "24": "ण", "25": "त", "26": "थ", "27": "द", "28": "ध", "29": "न",
    "30": "प", "31": "फ", "32": "ब", "33": "भ", "34": "म", "35": "य",
    "36": "र", "37": "ल", "38": "ळ", "39": "व", "40": "श", "41": "स", "42": "ह"
}


def classify_marathi_sign(image):
    """Return a {character: probability} dict for one hand-sign image."""
    # Gradio passes the upload as a NumPy array; convert it to an RGB PIL image
    image = Image.fromarray(image).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")

    # Inference only, so skip gradient tracking
    with torch.no_grad():
        outputs = model(**inputs)
        logits = outputs.logits
        probs = torch.nn.functional.softmax(logits, dim=1).squeeze().tolist()

    # Map every class index to its character; gr.Label displays the top 5
    prediction = {
        id2label[str(i)]: round(probs[i], 3) for i in range(len(probs))
    }

    return prediction


# Gradio Interface
iface = gr.Interface(
    fn=classify_marathi_sign,
    inputs=gr.Image(type="numpy"),
    outputs=gr.Label(num_top_classes=5, label="Marathi Sign Classification"),
    title="Marathi-Sign-Language-Detection",
    description="Upload an image of a Marathi sign language hand gesture to identify the corresponding character."
)

if __name__ == "__main__":
    iface.launch()
```
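
The Gradio function returns all 43 probabilities and leaves the top-5 selection to `gr.Label`. When calling the model outside Gradio, the same post-processing can be done by hand. The following is a minimal, dependency-free sketch of that step: a softmax over the logits followed by a top-k selection. The names `softmax`, `top_k_predictions`, `toy_id2label`, and `toy_logits` are hypothetical, and the logits are made-up values rather than real model output.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a flat list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_predictions(logits, id2label, k=5):
    """Return the k most probable (label, probability) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]


# Toy example: four of the 43 classes with made-up logits (not model output).
toy_id2label = {0: "अ", 1: "आ", 2: "इ", 3: "ई"}
toy_logits = [2.0, 0.5, -1.0, 0.1]

for label, prob in top_k_predictions(toy_logits, toy_id2label, k=2):
    print(f"{label}: {prob:.3f}")
```

With a real model, `toy_logits` would be replaced by `outputs.logits[0].tolist()` for a single image, and the mapping would need integer keys, unlike the string-keyed `id2label` in the inference code.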

---

## Intended Use

Marathi-Sign-Language-Detection can be applied in:

* Educational platforms for learning regional sign language.
* Assistive communication tools for Marathi-speaking users with hearing impairments.
* Interactive applications that translate signs into text.
* Research and data collection for sign language development and recognition.