# ProtoSAM Segmentation Demo

This Gradio application demonstrates the ProtoSAM model for few-shot segmentation. Users upload a query image, a support image, and a support mask to generate a segmentation prediction.

## Requirements

- Python 3.8 or higher
- CUDA-compatible GPU
- Required Python packages (see `requirements.txt`)

## Setup Instructions

1. Clone this repository:

   ```bash
   git clone <your-repository-url>
   cd <repository-name>
   ```

2. Create and activate a virtual environment (optional but recommended):

   ```bash
   python -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
   ```

3. Install the required dependencies:

   ```bash
   pip install -r requirements.txt
   ```

4. Download the pretrained models:

   ```bash
   mkdir -p pretrained_model
   # Download the SAM ViT-H checkpoint
   wget -P pretrained_model https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth
   mv pretrained_model/sam_vit_h_4b8939.pth pretrained_model/sam_vit_h.pth
   ```

5. Update the model path in `app.py`:
   - Set `reload_model_path` in the config dictionary to the path of your trained ProtoSAM model.
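Step 5 can be sketched as below. Only `reload_model_path` is named in these instructions; every other key, and the example paths, are assumptions for illustration rather than identifiers taken from the actual `app.py`:

```python
# Hypothetical excerpt of the config dictionary in app.py.
# Only `reload_model_path` is documented above; `sam_checkpoint` and the
# example paths are assumptions for illustration.
config = {
    "reload_model_path": "checkpoints/protosam_trained.pth",  # your trained ProtoSAM model
    "sam_checkpoint": "pretrained_model/sam_vit_h.pth",       # SAM ViT-H weights from step 4
}
print(config["reload_model_path"])
```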

## Running the App

Start the app with:

```bash
./run_demo.sh
```

Or run it directly with:

```bash
python app.py
```

This will start the server and provide a link to access the demo in your browser.

## Usage

  1. Upload a query image (the image you want to segment)
  2. Upload a support image (an example image with a similar object)
  3. Upload a support mask (the segmentation mask for the support image)
  4. Configure the model parameters using the checkboxes
  5. Click "Run Inference" to generate the segmentation result
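As a rough sketch of what the three uploads represent as arrays (shapes and dtypes here are common conventions, not confirmed from the demo code):

```python
import numpy as np

# Illustrative shapes only; the actual demo accepts uploaded image files.
query_image = np.zeros((256, 256, 3), dtype=np.uint8)    # the image you want to segment
support_image = np.zeros((256, 256, 3), dtype=np.uint8)  # example image with a similar object
support_mask = np.zeros((256, 256), dtype=np.uint8)      # binary mask of the support object
support_mask[64:192, 64:192] = 1                         # foreground region

# The support mask must align pixel-for-pixel with the support image.
assert support_mask.shape == support_image.shape[:2]
print(int(support_mask.sum()))  # number of foreground pixels
```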

## Model Configuration

The app allows you to configure several model parameters:

- **Use Bounding Box**: Enable/disable bounding box input
- **Use Points**: Enable/disable point input
- **Use Mask**: Enable/disable mask input
- **Use CCA**: Enable/disable Connected Component Analysis
- **Coarse Prediction Only**: Use only the coarse segmentation model without SAM refinement
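For illustration, the checkboxes might map to a flags dictionary like the one below; the key names are assumptions for this sketch, not identifiers from `app.py` (only the checkbox labels come from the demo UI):

```python
# Hypothetical flag names; only the checkbox labels above come from the demo.
flags = {
    "use_bbox": True,      # bounding-box input enabled
    "use_points": True,    # point input enabled
    "use_mask": False,     # mask input disabled
    "use_cca": True,       # Connected Component Analysis enabled
    "coarse_only": False,  # SAM refinement still applied
}
enabled = [name for name, on in flags.items() if on]
print(enabled)
```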

## Notes

- This demo requires a GPU with CUDA support
- Large images may require more GPU memory
- For optimal results, use high-quality support images and masks