	# 3D Object Dimension & Volume Estimator

	This project is a Streamlit web application that estimates the 3D dimensions (Length, Width, Height) and Volume of objects from user-uploaded images. It leverages Detectron2 for object detection and instance segmentation, and a custom Convolutional Neural Network (CNN) trained on the Pix3D dataset for dimension regression.

	## Features

	* Upload one or more images of an object, each showing a different view.
	* Detects objects using a pre-trained Detectron2 (Mask R-CNN) model.
	* Displays segmentation masks and 2D bounding boxes for detected objects.
	* For the largest detected object in each view:
	    * Crops the object using its segmentation mask.
	    * Feeds the cropped patch to a custom CNN to predict dimensions and volume (L, W, H, V).
	* Displays individual dimension predictions for each view.
	* Calculates and displays aggregated (averaged) dimensions if multiple views are provided.
	* User-friendly web interface built with Streamlit.
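
The per-view steps listed above (pick the largest detected object, crop it with its mask, average the predictions across views) can be sketched with NumPy alone. This is a minimal illustration, not the application's actual code: the function names are hypothetical, "largest" is assumed to mean the mask with the most pixels, and zeroing the background before cropping is an assumption about how the mask is applied.

```python
import numpy as np


def largest_instance(masks: np.ndarray) -> int:
    """Return the index of the mask with the most foreground pixels.

    masks: boolean array of shape (num_instances, H, W).
    """
    areas = masks.reshape(masks.shape[0], -1).sum(axis=1)
    return int(np.argmax(areas))


def crop_with_mask(image: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Zero out background pixels, then crop to the mask's tight bounding box.

    image: (H, W, 3) array; mask: (H, W) boolean array with at least one True.
    Zeroing the background is an assumption, not confirmed by this README.
    """
    ys, xs = np.nonzero(mask)
    y0, y1 = ys.min(), ys.max() + 1
    x0, x1 = xs.min(), xs.max() + 1
    masked = image * mask[..., None]  # broadcast the mask over the channel axis
    return masked[y0:y1, x0:x1]


def aggregate_views(predictions) -> np.ndarray:
    """Average per-view (L, W, H, V) predictions into one estimate."""
    return np.mean(np.asarray(predictions, dtype=float), axis=0)


# Toy data: one white image with two instance masks of different sizes.
image = np.full((6, 8, 3), 255, dtype=np.uint8)
masks = np.zeros((2, 6, 8), dtype=bool)
masks[0, 1:3, 1:3] = True  # small instance, 4 pixels
masks[1, 2:5, 3:7] = True  # large instance, 12 pixels

idx = largest_instance(masks)
patch = crop_with_mask(image, masks[idx])
print(idx, patch.shape)  # 1 (3, 4, 3)

# Two views of the same object, each a made-up (L, W, H, V) prediction.
views = [[1.0, 0.5, 0.8, 0.40], [1.2, 0.5, 0.6, 0.36]]
print(aggregate_views(views))
```

In the real pipeline the masks would come from the Detectron2 predictor and the per-view values from the dimension CNN; the toy arrays here only stand in for those outputs.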

	## Models Used

	1. Object Detection & Segmentation:
	    * Detectron2 (Mask R-CNN R50-FPN 3x): Pre-trained on the COCO dataset. Used to identify objects and generate pixel-wise segmentation masks.
	2. Dimension Estimation:
	    * Custom CNN (ResNet50 backbone): Trained on image patches derived from the Pix3D dataset. The model takes a cropped image patch of an object as input and outputs its estimated Length, Width, Height (in meters), and Volume (in meters³).

	## Setup and Installation

	Follow these steps to set up and run the application locally:

	1. Clone this repository from Hugging Face:

	```bash
	git clone https://huggingface.co/suryaprakash01/dimension_Detect
	cd dimension_Detect