---
title: LoGoSAM_demo
app_file: app.py
sdk: gradio
sdk_version: 5.29.0
---
# ProtoSAM - One-shot segmentation with foundational models
Link to our paper [here](https://arxiv.org/abs/2407.07042). \
This work is the successor of [DINOv2-based-Self-Supervised-Learning](https://github.com/levayz/DINOv2-based-Self-Supervised-Learning) (Link to [Paper](https://arxiv.org/abs/2403.03273)).
## Demo Application
A Gradio-based demo application is now available for interactive inference with ProtoSAM. You can upload your own images and masks to test the model. See [README_DEMO.md](README_DEMO.md) for instructions on running the demo.
## Abstract
This work introduces ProtoSAM, a new framework for one-shot image segmentation. It combines DINOv2, a vision transformer that extracts features from images, with an Adaptive Local Prototype Pooling (ALP) layer, which generates prototypes from a support image and its mask. These prototypes are used to create an initial coarse segmentation mask by comparing the query image's features with the prototypes.
Following the extraction of the initial mask, we use numerical methods to generate prompts, such as points and bounding boxes, which are then fed into the Segment Anything Model (SAM), a prompt-based segmentation model trained on natural images. This allows new classes to be segmented automatically and effectively without any additional training.
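In code, the coarse-mask stage and the prompt extraction boil down to something like the sketch below (a simplified, single-prototype illustration; the function names and the plain cosine-similarity thresholding are ours rather than the repository's API, and the actual ALP layer pools multiple local prototypes):
```python
import torch
import torch.nn.functional as F

def coarse_mask_from_support(supp_feat, supp_mask, query_feat, thresh=0.5):
    """Single-prototype illustration of the ALP idea: pool support features
    under the support mask, then score query features by cosine similarity."""
    # supp_feat, query_feat: (C, H, W) feature maps (e.g. from DINOv2)
    # supp_mask: (H, W) binary mask resized to the feature resolution
    proto = (supp_feat * supp_mask).sum(dim=(1, 2)) / (supp_mask.sum() + 1e-6)  # (C,)
    sim = F.cosine_similarity(query_feat, proto[:, None, None], dim=0)          # (H, W)
    return (sim > thresh).float()

def prompts_from_mask(mask):
    """Derive SAM prompts (a bounding box and a foreground point) from a coarse mask."""
    ys, xs = torch.nonzero(mask, as_tuple=True)
    box = [xs.min().item(), ys.min().item(), xs.max().item(), ys.max().item()]  # x0, y0, x1, y1
    point = [xs.float().mean().item(), ys.float().mean().item()]                # rough centroid
    return box, point
```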
## How To Run
### 1. Data preprocessing
#### 1.1 CT and MRI Datasets
Please see the notebook `data/data_processing.ipynb` for instructions.
For convenience, I've compiled the data processing instructions from https://github.com/cheng-01037/Self-supervised-Fewshot-Medical-Image-Segmentation into a single notebook. \
The CT dataset is available here: https://www.synapse.org/Synapse:syn3553734 \
The MRI dataset is available here: https://chaos.grand-challenge.org
Run `./data/CHAOST2/dcm_img_to_nii.sh` to convert the DICOM images to NIfTI files.
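If you would rather do this step in Python than via the shell script, a conversion along the following lines produces one NIfTI volume per DICOM series (this uses SimpleITK, which is not a dependency of this repository, and the paths are illustrative):
```python
import SimpleITK as sitk

def dicom_series_to_nifti(dicom_dir: str, out_path: str) -> None:
    """Read a DICOM series from a directory and write it out as a single NIfTI file."""
    reader = sitk.ImageSeriesReader()
    reader.SetFileNames(reader.GetGDCMSeriesFileNames(dicom_dir))
    sitk.WriteImage(reader.Execute(), out_path)

# Example (adjust the paths to your CHAOS download):
# dicom_series_to_nifti("data/CHAOST2/MR/1/T2SPIR/DICOM_anon", "data/CHAOST2/image_1.nii.gz")
```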
#### 1.2 Polyp Dataset
The data is available here: https://www.kaggle.com/datasets/hngphmv/polypdataset?select=train.csv
Put the dataset in `data/PolypDataset/`.
### 2. Running
#### 2.1 (Optional) Training and validation of the coarse segmentation networks
```
./backbone.sh [MODE] [MODALITY] [LABEL_SET]
```
MODE - validation or training \
MODALITY - ct or mri \
LABEL_SET - 0 (kidneys), 1 (liver and spleen)
For example:
```
./backbone.sh training mri 1
```
Please refer to `backbone.sh` for further configurations.
#### 2.2 Running ProtoSAM
Put the SAM checkpoints, e.g. sam_vit_b.pth, sam_vit_h.pth, and medsam_vit_b.pth, into the `pretrained_model` directory. \
Checkpoints are available from [SAM](https://github.com/facebookresearch/segment-anything) and [MedSAM](https://github.com/bowang-lab/MedSAM).
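For example, the two official SAM checkpoints can be fetched and renamed as follows (the ViT-H URL also appears in the demo instructions below; the ViT-B URL is taken from the segment-anything release page, so please double-check it; MedSAM weights are distributed via the MedSAM repository):
```
mkdir -p pretrained_model
wget -O pretrained_model/sam_vit_h.pth https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth
wget -O pretrained_model/sam_vit_b.pth https://dl.fbaipublicfiles.com/segment_anything/sam_vit_b_01ec64.pth
```
Once the checkpoints are in place, run: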
```
./run_protosam.sh [MODALITY] [LABEL_SET]
```
MODALITY - ct, mri or polyp \
LABEL_SET (only relevant for ct or mri) - 0 (kidneys), 1 (liver and spleen)
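For example, to segment kidneys on the CT dataset:
```
./run_protosam.sh ct 0
```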
Please refer to the `run_protosam.sh` script for further configurations.
## Acknowledgements
This work is largely based on [ALPNet](https://github.com/cheng-01037/Self-supervised-Fewshot-Medical-Image-Segmentation), [DINOv2](https://github.com/facebookresearch/dinov2), and [SAM](https://github.com/facebookresearch/segment-anything), and is a continuation of [DINOv2-based-Self-Supervised-Learning](https://github.com/levayz/DINOv2-based-Self-Supervised-Learning).
## Cite
If you found this repo useful, please consider giving us a citation and a star!
```bibtex
@article{ayzenberg2024protosam,
  title={ProtoSAM - One Shot Medical Image Segmentation With Foundational Models},
  author={Ayzenberg, Lev and Giryes, Raja and Greenspan, Hayit},
  journal={arXiv preprint arXiv:2407.07042},
  year={2024}
}
@misc{ayzenberg2024dinov2,
  title={DINOv2 based Self Supervised Learning For Few Shot Medical Image Segmentation},
  author={Lev Ayzenberg and Raja Giryes and Hayit Greenspan},
  year={2024},
  eprint={2403.03273},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
```
# ProtoSAM Segmentation Demo
This Streamlit application demonstrates the capabilities of the ProtoSAM model for few-shot segmentation. Users can upload a query image, support image, and support mask to generate a segmentation prediction.
## Requirements
- Python 3.8 or higher
- CUDA-compatible GPU
- Required Python packages (see `requirements.txt`)
## Setup Instructions
1. Clone this repository:
```bash
git clone <your-repository-url>
cd <repository-name>
```
2. Create and activate a virtual environment (optional but recommended):
```bash
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
```
3. Install the required dependencies:
```bash
pip install -r requirements.txt
```
4. Download the pretrained models:
```bash
mkdir -p pretrained_model
# Download SAM ViT-H model
wget -P pretrained_model https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth
mv pretrained_model/sam_vit_h_4b8939.pth pretrained_model/sam_vit_h.pth
```
5. Update the model path in `app.py`:
   - Set the `reload_model_path` in the config dictionary to the path of your trained ProtoSAM model.
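For reference, the entry might look roughly like this (the path and the surrounding keys are placeholders, not taken from the actual `app.py`):
```python
# In app.py: point the demo at your trained ProtoSAM checkpoint (example path only).
config = {
    # ... other demo options ...
    "reload_model_path": "exps/protosam_mri/snapshots/model_best.pth",
}
```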
## Running the App
Start the Streamlit app with:
```bash
streamlit run app.py
```
This will open a browser window with the interface for the segmentation demo.
## Usage
1. Upload a query image (the image you want to segment)
2. Upload a support image (an example image containing a similar object)
3. Upload a support mask (the segmentation mask for the support image; see the note on mask format after this list)
4. Use the sidebar to configure the model parameters if needed
5. Click "Run Inference" to generate the segmentation result
## Model Configuration
The app allows you to configure several model parameters via the sidebar:
- Use Bounding Box: Enable/disable bounding box input
- Use Points: Enable/disable point input
- Use Mask: Enable/disable mask input
- Use CCA: Enable/disable Connected Component Analysis (see the sketch after this list)
- Coarse Prediction Only: Use only the coarse segmentation model without SAM refinement
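Connected Component Analysis is a post-processing step on the predicted mask. A minimal sketch of the usual variant, keeping only the largest connected foreground region (using SciPy; whether the demo's option does exactly this is an assumption), looks like:
```python
import numpy as np
from scipy import ndimage

def largest_connected_component(mask: np.ndarray) -> np.ndarray:
    """Keep only the largest connected foreground region of a binary mask."""
    labeled, num = ndimage.label(mask)
    if num == 0:
        return mask  # nothing segmented, return unchanged
    sizes = ndimage.sum(mask, labeled, range(1, num + 1))
    return (labeled == (np.argmax(sizes) + 1)).astype(mask.dtype)
```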
## Notes
- This demo requires a GPU with CUDA support
- Large images may require more GPU memory
- For optimal results, use high-quality support images and masks