groffo committed · 070887a (parent: d7087a8)
Improve Hugging Face model card formatting and content

README.md:
---
tags:
- vision
- transformers
- vit
- feature-selection
- miccai2024
license: mit
library_name: PyTorch
inference: false
---
# 🔬 Feature Selection Gates (FSG) for Vision Transformers (ViT)

This repository implements **Feature Selection Gates (FSG)** and **Gradient Routing (GR)** as a modular extension to Vision Transformers. It is based on our paper presented at **MICCAI 2024**:

> **Feature Selection Gates with Gradient Routing for Endoscopic Image Computing**
> Giorgio Roffo, Carlo Biffi, Pietro Salvagnini, Andrea Cherubini
> [MICCAI 2024](https://papers.miccai.org/miccai-2024/316-Paper0410.html), [arXiv](https://arxiv.org/abs/2407.04400), [GitHub](https://github.com/cosmoimd/feature-selection-gates)

---
## 🧠 What Is FSG?

**FSG** introduces learnable gates on the residual branches within Transformer layers. These gates:

- Dynamically select relevant features
- Promote **sparse connectivity** during training
- Serve as a form of **architectural regularization**

To stabilize learning, **Gradient Routing (GR)** performs a **dual-pass** strategy:

- One forward pass to compute gradients for the base model
- A separate route to update FSG parameters independently

This separation allows **task-specific tuning** and ensures stable learning; a minimal sketch of both ideas follows.
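
For intuition only, here is a minimal sketch of the two mechanisms, not the repository's implementation: `FSGResidual`, the sigmoid gate parameterization, and the two-optimizer routing are illustrative assumptions; see `vit_with_fsg.py` and the paper for the exact formulation.

```python
import torch
import torch.nn as nn

class FSGResidual(nn.Module):
    """Illustrative gated residual: y = x + sigmoid(g) * f(x).

    The sigmoid keeps each per-feature gate in (0, 1), so gates driven
    toward 0 prune their features (sparse connectivity / regularization).
    """
    def __init__(self, branch: nn.Module, dim: int):
        super().__init__()
        self.branch = branch                        # e.g. an attention or MLP sub-block
        self.gate = nn.Parameter(torch.zeros(dim))  # one learnable gate per feature

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + torch.sigmoid(self.gate) * self.branch(x)

# Gradient-routing flavour: route updates through two optimizers with
# separate learning rates, so FSG parameters are tuned independently
# of the base model.
model = FSGResidual(nn.Linear(16, 16), dim=16)
gate_params = [p for n, p in model.named_parameters() if "gate" in n]
base_params = [p for n, p in model.named_parameters() if "gate" not in n]
opt_base = torch.optim.AdamW(base_params, lr=1e-4)
opt_gate = torch.optim.AdamW(gate_params, lr=1e-3)

loss = model(torch.randn(2, 16)).pow(2).mean()  # dummy objective
loss.backward()                                 # one backward pass...
opt_base.step(); opt_gate.step()                # ...two independent updates
opt_base.zero_grad(); opt_gate.zero_grad()
```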
---
## 💡 Key Features

- ✅ **Drop-in**: easily wraps any `torchvision` ViT model (e.g. `vit_b_16`, `vit_l_16`)
- ✅ **General-purpose**: use it on **natural images**, **medical data**, and even **token sequences in NLP**
- ✅ **Regularizes ViTs** in low-data regimes (tested on CIFAR-100, endoscopic videos, etc.)
- ✅ **No ViT surgery**: FSG wraps the Transformer layers directly

While this method was originally proposed for **polyp size estimation in colonoscopy**, it is designed to generalize across:

- 🧬 Medical image analysis
- 🖼️ General image classification
- 📝 NLP Transformers (e.g. GPT, BERT), by applying FSG over token embeddings (sketched below)

We strongly encourage researchers to test FSG in **non-medical** domains, including multimodal fusion.
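
A minimal sketch of the NLP idea, assuming FSG is applied elementwise over token embeddings; `TokenFSG` and the dimensions below are hypothetical and not part of this repository.

```python
import torch
import torch.nn as nn

class TokenFSG(nn.Module):
    """Hypothetical gate over token embeddings: scales each embedding
    dimension by a learnable factor in (0, 1) before the encoder."""
    def __init__(self, embed_dim: int):
        super().__init__()
        self.gate = nn.Parameter(torch.zeros(embed_dim))

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, seq_len, embed_dim)
        return torch.sigmoid(self.gate) * tokens

embed = nn.Embedding(30522, 768)         # BERT-sized vocabulary and width
fsg = TokenFSG(768)
ids = torch.randint(0, 30522, (2, 128))  # dummy token ids
gated = fsg(embed(ids))                  # (2, 128, 768), gated features
```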
---

## 🧪 Minimal Example

Use the `vit_with_fsg.py` script to augment a pretrained ViT from `torchvision`:
```python
from torchvision.models import vit_b_16, ViT_B_16_Weights
from vit_with_fsg import vit_with_fsg
import torch

print("📥 Loading pretrained ViT...")
backbone = vit_b_16(weights=ViT_B_16_Weights.DEFAULT)

print("🔧 Injecting FSG into backbone...")
model = vit_with_fsg(vit_backbone=backbone)

print("🧪 Running dummy input...")
dummy_input = torch.randn(1, 3, 224, 224)  # one 224x224 RGB image
output = model(dummy_input)
print("✅ Output shape:", output.shape)
```
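
With the default ImageNet-pretrained head this prints `torch.Size([1, 1000])`, assuming the FSG wrapper leaves the classification head of `vit_b_16` unchanged.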
---

## 🧪 Demos (Quick Training + Inference)

We provide full working training and inference examples; each training script saves a checkpoint for reproducible inference. Run them as shown below.

| Dataset    | Training Script          | Inference Script          | Checkpoint Path                             |
|------------|--------------------------|---------------------------|---------------------------------------------|
| MNIST      | `demo_training_mnist.py` | `demo_inference_mnist.py` | `./checkpoints/fsg_vit_mnist_demo.pth`      |
| Imagenette | `demo_training_imnet.py` | `demo_inference_imnet.py` | `./checkpoints/fsg_vit_imagenette_demo.pth` |
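
For example (assuming you run from the repository root, so the relative checkpoint paths resolve):

```bash
# Train on Imagenette
python demo_training_imnet.py

# Inference on Imagenette
python demo_inference_imnet.py --checkpoint ./checkpoints/fsg_vit_imagenette_demo.pth
```

```bash
# Train on MNIST
python demo_training_mnist.py

# Inference on MNIST
python demo_inference_mnist.py --checkpoint ./checkpoints/fsg_vit_mnist_demo.pth
```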
> ⚠️ These demos use reduced test sets and train for only a few iterations so they run quickly. They are not meant for benchmarking, but for showcasing FSG integration and the API.
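
To reuse a demo checkpoint outside the provided scripts, a minimal sketch follows; it assumes the demos save a plain `state_dict` for the FSG-wrapped model, which you should verify against the demo scripts (the Imagenette demo, for instance, may use a different classification head).

```python
import torch
from torchvision.models import vit_b_16
from vit_with_fsg import vit_with_fsg

# Rebuild the architecture, then restore the demo weights.
model = vit_with_fsg(vit_backbone=vit_b_16())

# Assumption: the checkpoint is a plain state_dict; adjust if the demo
# wraps it (e.g. under a "model" key) or uses a different head size.
state = torch.load("./checkpoints/fsg_vit_imagenette_demo.pth", map_location="cpu")
model.load_state_dict(state)
model.eval()
```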
---
## 📦 Project Structure

```
.
├── vit_with_fsg.py          # FSG-ViT integration
├── demo_training_mnist.py
├── demo_inference_mnist.py
├── demo_training_imnet.py
├── demo_inference_imnet.py
├── checkpoints/             # Model weights (optional)
└── README.md                # This model card
```
---

## 📄 Citation

If you use this project, please cite our work:

```bibtex
@inproceedings{roffo2024FSG,
  …
}
```
---

## 📬 Contact

**Giorgio Roffo**
📧 giorgio.roffo@gmail.com
🏢 Cosmo Intelligent Medical Devices (IMD), Lainate, Italy
🔗 [github.com/cosmoimd/feature-selection-gates](https://github.com/cosmoimd/feature-selection-gates)