---
title: LoGoSAM_demo
app_file: app.py
sdk: gradio
sdk_version: 5.29.0
---
# ProtoSAM - One shot segmentation with foundational models
Link to our paper [here](https://arxiv.org/abs/2407.07042). \
This work is the successor of [DINOv2-based-Self-Supervised-Learning](https://github.com/levayz/DINOv2-based-Self-Supervised-Learning) (link to [paper](https://arxiv.org/abs/2403.03273)).
## Demo Application
A Gradio-based demo application is now available for interactive inference with ProtoSAM. You can upload your own images and masks to test the model. See [README_DEMO.md](README_DEMO.md) for instructions on running the demo.
## Abstract
This work introduces a new framework, ProtoSAM, for one-shot image segmentation. It combines DINOv2, a vision transformer that extracts features from images, with an Adaptive Local Prototype Pooling (ALP) layer, which generates prototypes from a support image and its mask. These prototypes are used to create an initial coarse segmentation mask by comparing the query image's features with the prototypes.
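To make the prototype-matching step concrete, here is a minimal PyTorch sketch of masked average pooling followed by cosine-similarity matching. It is only an illustration of the idea, not the actual ProtoSAM/ALP code: ALP pools several local prototypes rather than the single global one shown here, and all names and the temperature value are placeholders.

```python
import torch
import torch.nn.functional as F

def coarse_mask_from_prototypes(support_feats, support_mask, query_feats, tau=20.0):
    """Illustrative prototype matching (not the exact ProtoSAM/ALP implementation).

    support_feats, query_feats: (C, H, W) feature maps, e.g. from DINOv2
    support_mask: (H, W) binary mask aligned with support_feats
    """
    C, H, W = support_feats.shape
    mask = support_mask.float().view(1, H * W)                   # (1, HW)
    feats = support_feats.view(C, H * W)                         # (C, HW)

    # Masked average pooling -> one foreground prototype (ALP pools several local ones)
    proto = (feats * mask).sum(dim=1) / mask.sum().clamp(min=1)  # (C,)

    # Cosine similarity between every query location and the prototype
    q = F.normalize(query_feats.view(C, H * W), dim=0)
    p = F.normalize(proto, dim=0).unsqueeze(1)                   # (C, 1)
    sim = (q * p).sum(dim=0).view(H, W)                          # (H, W)

    # Temperature-scaled sigmoid gives a soft coarse foreground map
    return torch.sigmoid(tau * (sim - sim.mean()))
```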
Following the extraction of an initial mask, we use numerical methods to generate prompts, such as points and bounding boxes, which are then input into the Segment Anything Model (SAM), a prompt-based segmentation model trained on natural images. This allows segmenting new classes automatically and effectively without the need for additional training.
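The prompt-generation step can be sketched with the public `segment-anything` API: binarize the coarse mask, take its bounding box and a representative point, and pass them to `SamPredictor`. The 0.5 threshold and the centroid-point choice below are assumptions made for this sketch, not the paper's exact procedure.

```python
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

def refine_with_sam(image_rgb, coarse_mask, checkpoint="pretrained_model/sam_vit_h.pth"):
    """Turn a coarse mask into SAM prompts (box + centroid point) and refine it."""
    fg = coarse_mask > 0.5
    ys, xs = np.nonzero(fg)
    if ys.size == 0:
        return np.zeros_like(fg)

    # Bounding box prompt in (x_min, y_min, x_max, y_max) format
    box = np.array([xs.min(), ys.min(), xs.max(), ys.max()])
    # One positive point at the foreground centroid
    point = np.array([[xs.mean(), ys.mean()]])
    label = np.array([1])

    sam = sam_model_registry["vit_h"](checkpoint=checkpoint)
    predictor = SamPredictor(sam)
    predictor.set_image(image_rgb)  # H x W x 3 uint8 RGB image

    masks, scores, _ = predictor.predict(
        point_coords=point, point_labels=label, box=box, multimask_output=True
    )
    return masks[np.argmax(scores)]
```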
## How To Run
### 1. Data preprocessing
#### 1.1 CT and MRI Dataset
Please see the notebook `data/data_processing.ipynb` for instructions.
For convenience, I've compiled the data processing instructions from https://github.com/cheng-01037/Self-supervised-Fewshot-Medical-Image-Segmentation into a single notebook. \
The CT dataset is available here: https://www.synapse.org/Synapse:syn3553734 \
The MRI dataset is available here: https://chaos.grand-challenge.org \
Run `./data/CHAOST2/dcm_img_to_nii.sh` to convert the DICOM images to NIfTI files.
#### 1.2 Polyp Dataset
Data is available here: https://www.kaggle.com/datasets/hngphmv/polypdataset?select=train.csv
Put the dataset in `data/PolypDataset/`.
### 2. Running
#### 2.1 (Optional) Training and Validation of the coarse segmentation networks
```
./backbone.sh [MODE] [MODALITY] [LABEL_SET]
```
- `MODE` - validation or training
- `MODALITY` - ct or mri
- `LABEL_SET` - 0 (kidneys), 1 (liver, spleen)

For example:
```
./backbone.sh training mri 1
```
Please refer to `backbone.sh` for further configurations.
#### 2.2 Running ProtoSAM
Put all SAM checkpoints, e.g. `sam_vit_b.pth`, `sam_vit_h.pth`, and `medsam_vit_b.pth`, into the `pretrained_model` directory. \
Checkpoints are available at [SAM](https://github.com/facebookresearch/segment-anything) and [MedSAM](https://github.com/bowang-lab/MedSAM).
```
./run_protosam.sh [MODALITY] [LABEL_SET]
```
- `MODALITY` - ct, mri or polyp
- `LABEL_SET` (only relevant for ct or mri) - 0 (kidneys), 1 (liver, spleen)

Please refer to the `run_protosam.sh` script for further configurations.
## Acknowledgements
This work is largely based on [ALPNet](https://github.com/cheng-01037/Self-supervised-Fewshot-Medical-Image-Segmentation), [DINOv2](https://github.com/facebookresearch/dinov2), [SAM](https://github.com/facebookresearch/segment-anything) and is a continuation of [DINOv2-based-Self-Supervised-Learning](https://github.com/levayz/DINOv2-based-Self-Supervised-Learning).
## Cite
If you found this repo useful, please consider giving us a citation and a star!
```bibtex
@article{ayzenberg2024protosam,
title={ProtoSAM-One Shot Medical Image Segmentation With Foundational Models},
author={Ayzenberg, Lev and Giryes, Raja and Greenspan, Hayit},
journal={arXiv preprint arXiv:2407.07042},
year={2024}
}
@misc{ayzenberg2024dinov2,
title={DINOv2 based Self Supervised Learning For Few Shot Medical Image Segmentation},
author={Lev Ayzenberg and Raja Giryes and Hayit Greenspan},
year={2024},
eprint={2403.03273},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
# ProtoSAM Segmentation Demo
This Streamlit application demonstrates the capabilities of the ProtoSAM model for few-shot segmentation. Users can upload a query image, support image, and support mask to generate a segmentation prediction.
## Requirements
- Python 3.8 or higher
- CUDA-compatible GPU
- Required Python packages (see `requirements.txt`)
## Setup Instructions
1. Clone this repository:
```bash
git clone <your-repository-url>
cd <repository-name>
```
2. Create and activate a virtual environment (optional but recommended):
```bash
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
```
3. Install the required dependencies:
```bash
pip install -r requirements.txt
```
4. Download the pretrained models:
```bash
mkdir -p pretrained_model
# Download SAM ViT-H model
wget -P pretrained_model https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth
mv pretrained_model/sam_vit_h_4b8939.pth pretrained_model/sam_vit_h.pth
```
5. Update the model path in `app.py`:
- Set the `reload_model_path` in the config dictionary to the path of your trained ProtoSAM model.
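For step 5, the relevant part of the config dictionary in `app.py` looks roughly like the following. Only `reload_model_path` is the key named above; the other keys and the file paths are placeholders for this sketch and may differ in the actual file.

```python
# Illustrative excerpt of the config dictionary in app.py.
# Keys other than reload_model_path, and all paths, are hypothetical.
config = {
    "reload_model_path": "pretrained_model/protosam_model.pth",  # your trained ProtoSAM weights
    "sam_checkpoint": "pretrained_model/sam_vit_h.pth",
    "use_bbox": True,
    "use_points": True,
    "use_cca": False,
    "coarse_pred_only": False,
}
```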
## Running the App
Start the Streamlit app with:
```bash
streamlit run app.py
```
This will open a browser window with the interface for the segmentation demo.
## Usage
1. Upload a query image (the image you want to segment)
2. Upload a support image (an example image with a similar object)
3. Upload a support mask (the segmentation mask for the support image)
4. Use the sidebar to configure the model parameters if needed
5. Click "Run Inference" to generate the segmentation result
## Model Configuration
The app allows you to configure several model parameters via the sidebar:
- Use Bounding Box: Enable/disable bounding box input
- Use Points: Enable/disable point input
- Use Mask: Enable/disable mask input
- Use CCA: Enable/disable Connected Component Analysis
- Coarse Prediction Only: Use only the coarse segmentation model without SAM refinement
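As a rough illustration of how these sidebar options could be wired into the config, here is a minimal Streamlit sketch; the widget labels mirror the list above, but the key names are assumptions and the actual `app.py` may differ.

```python
import streamlit as st

# Hypothetical sidebar wiring; the real app.py may use different keys or widgets.
st.sidebar.header("Model Configuration")
config_overrides = {
    "use_bbox": st.sidebar.checkbox("Use Bounding Box", value=True),
    "use_points": st.sidebar.checkbox("Use Points", value=True),
    "use_mask": st.sidebar.checkbox("Use Mask", value=False),
    "use_cca": st.sidebar.checkbox("Use CCA", value=False),
    "coarse_pred_only": st.sidebar.checkbox("Coarse Prediction Only", value=False),
}
```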
## Notes
- This demo requires a GPU with CUDA support
- Large images may require more GPU memory
- For optimal results, use high-quality support images and masks