metadata

title: Utensil Detector
emoji: 🍽️
colorFrom: pink
colorTo: purple
sdk: streamlit
sdk_version: 1.32.0
app_file: app/app.py
pinned: false

🍽️ Utensils Object Detection System

Welcome to the Utensils Object Detection System, an end-to-end pipeline that detects utensils such as plates, glasses, spoons, and forks using a custom-trained deep learning model.

This project was built from scratch (no Roboflow or auto-annotation tools!) and demonstrates a full lifecycle: dataset creation, model training, performance evaluation, and an interactive demo app.

🏗️ Project Overview

We set out to solve a real-world problem:

“Can we reliably detect common utensils in images, videos, or real-time webcam streams using only a small, custom-labeled dataset?”

To achieve this, we:

✅ Collected & annotated a custom dataset (100–500 images)
✅ Built a clean Python codebase to handle training, inference, and deployment
✅ Delivered an interactive demo using Streamlit / Flask

📁 Project Structure

├── app/                # Streamlit or Flask app for demo
│   └── app.py
├── dataset/            # Custom dataset (images + labels)
│   ├── images/
│   └── labels/
├── inference/          # Inference scripts (image, video, webcam)
│   ├── detect_image.py
│   ├── detect_video.py
│   └── detect_webcam.py
├── runs/detect/        # Training results & saved weights
│   ├── weights/
│   ├── results.png
│   └── Other Metrics ...
├── training/           # Training pipeline
│   ├── train.py
│   └── model_training.ipynb
├── data.yaml           # Dataset config
├── requirements.txt    # Python dependencies
└── README.md           # This file

🗂️ Dataset

Images collected: Manually photographed or sourced from public-domain images (Kaggle)
Classes: plate, fork, spoon, glass
Annotation tool: LabelImg
Format: YOLO txt labels
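
For readers new to the YOLO txt format: each label line holds a class index followed by a box center, width, and height, all normalized to the image size. The sketch below converts one such line to pixel coordinates. The helper name `yolo_line_to_pixels`, the `Box` type, and the sample line are illustrative assumptions, not part of this repo.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Pixel-space box with corner coordinates (hypothetical helper type)."""
    class_id: int
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def yolo_line_to_pixels(line: str, img_w: int, img_h: int) -> Box:
    """Convert 'class x_center y_center width height' (normalized) to pixels."""
    class_id, xc, yc, w, h = line.split()
    # Scale normalized values back to the image dimensions.
    xc, w = float(xc) * img_w, float(w) * img_w
    yc, h = float(yc) * img_h, float(h) * img_h
    # Move from center/size to corner coordinates.
    return Box(int(class_id), xc - w / 2, yc - h / 2, xc + w / 2, yc + h / 2)


# Made-up label line for a 640x480 image.
box = yolo_line_to_pixels("2 0.5 0.5 0.25 0.5", img_w=640, img_h=480)
print(box)
```

Which class a given index refers to depends on the order of names in data.yaml.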

🏋️‍♂️ Model Training

Framework: YOLOv8
Training script: training/train.py
Best checkpoint: runs/detect/weights/best.pt
Metrics logged: loss curves, mAP, precision, recall, F1
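
As a rough idea of what a training entry point can look like with the Ultralytics Python API, here is a minimal sketch. It is not the actual contents of training/train.py. Only the `--data` flag appears in this README; the `--model`, `--epochs`, and `--imgsz` flags and all default values are assumptions.

```python
import argparse


def parse_args(argv=None):
    """Parse command-line options. Only --data is documented in this README."""
    parser = argparse.ArgumentParser(description="Train a YOLOv8 detector")
    parser.add_argument("--data", default="data.yaml", help="dataset config file")
    # Assumed flags and defaults below; adjust to your setup.
    parser.add_argument("--model", default="yolov8m.yaml", help="model config or weights")
    parser.add_argument("--epochs", type=int, default=100)
    parser.add_argument("--imgsz", type=int, default=640)
    return parser.parse_args(argv)


def train(args):
    """Build the model and start training with Ultralytics."""
    # Imported lazily so argument parsing works without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(args.model)
    return model.train(data=args.data, epochs=args.epochs, imgsz=args.imgsz)
```

In a script, calling `train(parse_args())` at the bottom would start a run. By default Ultralytics writes weights and plots under a runs/ directory.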

🔍 Inference & Results

Run detection on:
- Static images → inference/detect_image.py
- Video files → inference/detect_video.py
- Real-time webcam → inference/detect_webcam.py

Visual outputs include:
- Bounding boxes with class names and confidence
- Confusion matrix
- Precision-recall, F1 curves
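
A minimal sketch of single-image inference with the Ultralytics API, showing how class names, confidences, and box coordinates can be read from a result. This is not the actual code of inference/detect_image.py. The function names, the 0.25 confidence threshold, and the return format are assumptions; the weights path is the checkpoint path listed above.

```python
def format_label(class_name: str, confidence: float) -> str:
    """Build the text drawn next to a box, e.g. 'plate 0.93'."""
    return f"{class_name} {confidence:.2f}"


def detect_image(source: str,
                 weights: str = "runs/detect/weights/best.pt",
                 conf: float = 0.25):
    """Run detection on one image and return (label, (x1, y1, x2, y2)) pairs."""
    # Imported lazily so the helper above works without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights)
    result = model.predict(source, conf=conf)[0]  # one result per input image
    detections = []
    for box in result.boxes:
        name = result.names[int(box.cls[0])]    # class index -> class name
        x1, y1, x2, y2 = box.xyxy[0].tolist()   # corner coordinates in pixels
        detections.append((format_label(name, float(box.conf[0])), (x1, y1, x2, y2)))
    return detections
```

Calling `detect_image("path/to/image.jpg")` would return one entry per detected object. The same result object also offers `result.plot()`, which returns the image with boxes drawn.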

🌐 Interactive Demo

Launch the demo app:

pip install -r requirements.txt
streamlit run app/app.py

Features:

Upload image or video and get detections
View predicted bounding boxes + class names + confidence scores
(Optional) Real-time webcam support

🚀 Getting Started

1️⃣ Clone the repo:

git clone https://github.com/yourusername/Utensils-object-detection.git
cd Utensils-object-detection

2️⃣ Install dependencies:

pip install -r requirements.txt

3️⃣ Run training:

python training/train.py --data data.yaml

4️⃣ Try inference:

python inference/detect_image.py --source path/to/image.jpg

5️⃣ Launch app:

streamlit run app/app.py

Sample evaluation output:

Model summary (fused): 92 layers, 25,842,076 parameters, 0 gradients, 78.7 GFLOPs

Class	Images	Instances	Box P	Box R	mAP50	mAP50-95
all	40	40	0.681	0.725	0.731	0.468
fork	10	10	0.338	0.2	0.265	0.113
glass	10	10	0.643	0.9	0.888	0.432
plate	10	10	1	1	0.995	0.833
spoon	10	10	0.744	0.8	0.776	0.496

📊 Performance

Metric	Value
mAP@0.5	78.0%
mAP@0.5:0.95	50.8%
Precision	85.5%
Recall	67.5%

These numbers are based on our custom dataset; actual results may vary depending on data size and quality.
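
To make the metrics above more concrete, the sketch below shows two small calculations behind them: the intersection-over-union (IoU) test that decides whether a prediction counts as a match at the 0.5 threshold used by mAP@0.5, and the F1 score as the harmonic mean of precision and recall. The function names and the example boxes are illustrative assumptions. The F1 value is derived only from the aggregate precision and recall in the table, so it may differ from what the F1 curve reports.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Two made-up boxes that overlap by half of each box's area.
overlap = iou((0, 0, 10, 10), (5, 0, 15, 10))
print(f"IoU = {overlap:.3f}, match at 0.5: {overlap >= 0.5}")

# Precision and recall taken from the table above.
f1 = f1_score(0.855, 0.675)
print(f"F1 = {f1:.3f}")
```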

💡 Challenges & Learnings

Challenge: Small dataset size → risk of overfitting
Solution: Data augmentation and careful validation splitting
Challenge: Labeling errors → noisy annotations
Solution: Manual re-checking of all labels
Challenge: Real-time inference speed
Solution: Optimized image preprocessing pipeline
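
One way to keep validation splitting careful on a small dataset is to make the split deterministic, so the same images stay in the validation set across runs. The sketch below illustrates that idea; it is not necessarily how this project split its data. The function name, the 20% validation fraction, the seed, and the file names are all assumptions.

```python
import random


def split_dataset(image_names, val_fraction=0.2, seed=42):
    """Return (train, val) lists using a seeded, reproducible shuffle."""
    names = sorted(image_names)            # fixed starting order
    random.Random(seed).shuffle(names)     # local RNG, global state untouched
    n_val = max(1, round(len(names) * val_fraction))
    return names[n_val:], names[:n_val]


# Hypothetical file names for a 200-image dataset.
images = [f"img_{i:03d}.jpg" for i in range(200)]
train_set, val_set = split_dataset(images)
print(len(train_set), len(val_set))
```

Because the shuffle is seeded, re-running the split on the same file list gives the same partition, which keeps metrics comparable between training runs.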

🛡️ License & Acknowledgments

Built using open-source tools: Ultralytics YOLO, Streamlit
Dataset annotated manually; no pre-annotated sources were used
No external models pre-trained on non-custom data were used

If you like this project, ⭐ the repo and feel free to contribute! Happy detecting! 🍳🍴🥄