groffo committed
Commit 070887a · 1 Parent(s): d7087a8

Improve Hugging Face model card formatting and content

  1. README.md +49 -88
README.md CHANGED
@@ -1,127 +1,101 @@
  # 🔬 Feature Selection Gates (FSG) for Vision Transformers (ViT)
 
- This repository provides a modular, extensible PyTorch implementation of **Feature Selection Gates (FSG)** with **Gradient Routing (GR)**, integrated into **Vision Transformers (ViTs)**. The approach is proposed in:
 
  > **Feature Selection Gates with Gradient Routing for Endoscopic Image Computing**
  > Giorgio Roffo, Carlo Biffi, Pietro Salvagnini, Andrea Cherubini
- > Presented at MICCAI 2024
- > 📄 [Paper](https://papers.miccai.org/miccai-2024/316-Paper0410.html) | 🧠 [arXiv](https://arxiv.org/abs/2407.04400) | 💻 [Code](https://github.com/cosmoimd/feature-selection-gates)
 
  ---
 
- ## 📌 What Is FSG?
 
- **FSG** introduces **learnable gates** that sparsify transformer blocks by modulating residual connections, acting as **online feature selectors**. This process encourages **sparse connectivity**, which reduces overfitting and increases generalization, especially valuable in small and imbalanced datasets.
 
- **Gradient Routing (GR)** enables dual-phase optimization:
- - One optimizer updates FSG parameters
- - A second optimizer updates the base model
- This separation allows **task-specific tuning** and ensures stable learning.
 
  ---
 
- ## 💡 Why Use FSG?
 
- ✅ **Plug & play**: Can be integrated into **any ViT architecture**
- ✅ Works on **natural images**, **medical images**, and beyond
- ✅ Can be adapted to **NLP Transformers** like GPTs and BERT
- ✅ Lightweight and highly regularizing
- ✅ Compatible with **multi-stream CNNs** and hybrid models
 
- ⚠️ While our focus is on **endoscopic image computing**, the method has shown performance improvements on **CIFAR-100**, proving its applicability to **standard vision tasks**.
 
  ---
 
- ## 🧪 How to Use the FSG Wrapper
-
- Use the `vit_with_fsg.py` script to augment a pretrained ViT from `torchvision`.
 
  ```python
  from torchvision.models import vit_b_16, ViT_B_16_Weights
  from vit_with_fsg import vit_with_fsg
  import torch
 
- print("📥 Loading pretrained ViT_B_16...")
  backbone = vit_b_16(weights=ViT_B_16_Weights.DEFAULT)
 
- print("🔧 Wrapping with Feature Selection Gates (FSG)...")
  model = vit_with_fsg(vit_backbone=backbone)
 
- print("🧪 Running dummy input...")
  dummy_input = torch.randn(1, 3, 224, 224)
  output = model(dummy_input)
-
- print("✅ Done. Output shape:", output.shape)
  ```
 
  ---
 
- ## 🚀 Demo Scripts
-
- We provide full working training and inference examples:
-
- | Dataset | Training Script | Inference Script | Checkpoint Path |
- |-------------|-----------------------------|------------------------------|----------------------------------------------|
- | MNIST | `demo_training_mnist.py` | `demo_inference_mnist.py` | `./checkpoints/fsg_vit_mnist_demo.pth` |
- | Imagenette | `demo_training_imnet.py` | `demo_inference_imnet.py` | `./checkpoints/fsg_vit_imagenette_demo.pth` |
 
- Each demo:
- - Trains a ViT-B/16 with FSG on a reduced dataset for speed.
- - Uses separate learning rates for FSG and base model parameters.
- - Includes GPU-aware prints and a training progress bar.
- - Saves checkpoints for reproducible inference.
 
- ### ▶️ Example Usage
-
- ```bash
- # Train on Imagenette
- python demo_training_imnet.py
-
- # Inference on Imagenette
- python demo_inference_imnet.py --checkpoint ./checkpoints/fsg_vit_imagenette_demo.pth
- ```
-
- ```bash
- # Train on MNIST
- python demo_training_mnist.py
-
- # Inference on MNIST
- python demo_inference_mnist.py --checkpoint ./checkpoints/fsg_vit_mnist_demo.pth
- ```
-
- > ⚠️ These demos use reduced test sets and train for only a few iterations to keep training quick. They are not meant for benchmarking, but rather for showcasing FSG integration.
-
- ---
-
- ## 🧠 Applicability Beyond Endoscopy
-
- Although designed for **polyp size estimation in colonoscopy**, FSG is a **general mechanism** for:
- - **Image classification**
- - **Medical image analysis**
- - **Multimodal fusion**
- - **NLP Transformers** (e.g., GPTs, BERT): apply FSG over token embeddings
-
- We strongly encourage researchers to test FSG in **non-medical** domains.
 
  ---
 
- ## 📦 Files and Structure
 
  ```
  .
- ├── vit_with_fsg.py # ViT + FSG wrapper
  ├── demo_training_mnist.py
  ├── demo_inference_mnist.py
  ├── demo_training_imnet.py
  ├── demo_inference_imnet.py
- ├── checkpoints/ # Folder for .pth checkpoints
  ```
 
  ---
 
  ## 📚 Citation
 
- Please cite our work if you use this repository:
 
  ```bibtex
  @inproceedings{roffo2024FSG,
@@ -137,20 +111,7 @@ Please cite our work if you use this repository:
 
  ## 📬 Contact
 
- Lead Author: **Giorgio Roffo**
  📧 giorgio.roffo@gmail.com
- 🏢 Cosmo Intelligent Medical Devices (IMD), Lainate, Italy
-
- For more: [github.com/cosmoimd/feature-selection-gates](https://github.com/cosmoimd/feature-selection-gates)
-
- ---
- tags:
- - vision
- - transformers
- - vit
- - feature-selection
- - miccai2024
- license: mit
- library_name: PyTorch
- inference: false
- ---
 
+ ---
+ tags:
+ - vision
+ - transformers
+ - vit
+ - feature-selection
+ - miccai2024
+ license: mit
+ library_name: PyTorch
+ inference: false
+ ---
+
  # 🔬 Feature Selection Gates (FSG) for Vision Transformers (ViT)
 
+ This repository implements **Feature Selection Gates (FSG)** and **Gradient Routing (GR)** as a modular extension to Vision Transformers. It is based on our paper presented at **MICCAI 2024**:
 
  > **Feature Selection Gates with Gradient Routing for Endoscopic Image Computing**
  > Giorgio Roffo, Carlo Biffi, Pietro Salvagnini, Andrea Cherubini
+ > [MICCAI 2024](https://papers.miccai.org/miccai-2024/316-Paper0410.html), [arXiv](https://arxiv.org/abs/2407.04400), [GitHub](https://github.com/cosmoimd/feature-selection-gates)
 
  ---
 
+ ## 🧠 What Is FSG?
 
+ **FSG** introduces learnable gates on residual branches within Transformer layers. These gates:
+ - Dynamically select relevant features
+ - Promote **sparse connectivity** during training
+ - Serve as a form of **architectural regularization**
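
The gating described in these bullets can be illustrated with plain arithmetic. The sketch below is not taken from `vit_with_fsg.py`. It assumes each gate is a learnable logit squashed by a sigmoid and multiplied into the residual branch output, and the names `gated_residual` and `gate_logits` are hypothetical. Plain Python lists are used so the per-feature effect is easy to follow; a real implementation would operate on tensors.

```python
import math


def sigmoid(z: float) -> float:
    """Map a gate logit to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def gated_residual(x, branch_out, gate_logits):
    """Return x + sigmoid(gate) * branch_out, feature by feature.

    Assumed form of an FSG-style gate on a residual branch; the actual
    parameterisation in the repository may differ.
    """
    return [xi + sigmoid(g) * bi for xi, bi, g in zip(x, branch_out, gate_logits)]


x = [1.0, 2.0, 3.0]              # skip-connection input
branch_out = [0.5, -1.0, 4.0]    # output of the attention/MLP branch
gate_logits = [20.0, 0.0, -20.0] # nearly open, half open, nearly closed

y = gated_residual(x, branch_out, gate_logits)
print(y)
```

Under this assumption, a gate with a large positive logit passes its branch feature almost unchanged, a logit of zero halves it, and a large negative logit effectively removes it, which is the sparse-connectivity behaviour the bullets describe.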
 
+ To stabilize learning, **Gradient Routing (GR)** performs a **dual-pass** strategy:
+ - One forward pass to compute gradients for the base model
+ - A separate route to update FSG parameters independently
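
One way to update FSG parameters independently of the base model is to give each group its own optimizer. The sketch below assumes that reading of Gradient Routing. `GatedBlock`, the `gate_logits` name, and both learning rates are hypothetical stand-ins, and the repository's training scripts may route gradients differently.

```python
import torch
import torch.nn.functional as F
from torch import nn


class GatedBlock(nn.Module):
    """Hypothetical stand-in for a Transformer sub-layer with an FSG-style gate."""

    def __init__(self, dim: int):
        super().__init__()
        self.branch = nn.Linear(dim, dim)
        # Assumed gate parameterisation: one learnable logit per feature.
        self.gate_logits = nn.Parameter(torch.zeros(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + torch.sigmoid(self.gate_logits) * self.branch(x)


torch.manual_seed(0)
model = nn.Sequential(GatedBlock(8), nn.Linear(8, 2))

# Split parameters into two disjoint groups by name.
gate_params = [p for n, p in model.named_parameters() if "gate_logits" in n]
base_params = [p for n, p in model.named_parameters() if "gate_logits" not in n]

# Illustrative learning rates only; each group gets its own optimizer.
gate_opt = torch.optim.SGD(gate_params, lr=1e-1)
base_opt = torch.optim.SGD(base_params, lr=1e-2)

x = torch.randn(4, 8)
target = torch.randint(0, 2, (4,))
loss = F.cross_entropy(model(x), target)

gate_opt.zero_grad()
base_opt.zero_grad()
loss.backward()   # one backward pass fills gradients for both groups
base_opt.step()   # update the base model
gate_opt.step()   # update the gates independently
```

Keeping the groups in separate optimizers lets the gates use a different learning rate or schedule from the backbone without touching the model definition.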
 
  ---
 
+ ## 💡 Key Features
 
+ - ✅ **Drop-in**: Easily wraps any `torchvision` ViT model (e.g. `vit_b_16`, `vit_l_16`)
+ - ✅ **General-purpose**: Use on **natural images**, **medical data**, and even **token sequences in NLP**
+ - ✅ **Regularizes ViTs** for low-data regimes (tested on CIFAR-100, endoscopic videos, etc.)
+ - ✅ No ViT surgery: FSG wraps Transformer layers directly
 
+ While this method was originally proposed for **polyp size estimation in colonoscopy**, it is designed to generalize across:
+ - 🧬 Medical image analysis
+ - 🖼️ General image classification
+ - 📚 NLP Transformers (e.g. GPT, BERT)
 
  ---
 
+ ## 🧪 Minimal Example
 
  ```python
  from torchvision.models import vit_b_16, ViT_B_16_Weights
  from vit_with_fsg import vit_with_fsg
  import torch
 
+ print("📥 Loading pretrained ViT...")
  backbone = vit_b_16(weights=ViT_B_16_Weights.DEFAULT)
 
+ print("🔧 Injecting FSG into backbone...")
  model = vit_with_fsg(vit_backbone=backbone)
 
  dummy_input = torch.randn(1, 3, 224, 224)
  output = model(dummy_input)
+ print("✅ Output shape:", output.shape)
  ```
 
  ---
 
+ ## 🧪 Demos (Quick Training + Inference)
 
+ | Dataset | Training Script | Inference Script | Checkpoint Path |
+ |-------------|-----------------------------|------------------------------|-----------------------------------------------|
+ | MNIST | `demo_training_mnist.py` | `demo_inference_mnist.py` | `./checkpoints/fsg_vit_mnist_demo.pth` |
+ | Imagenette | `demo_training_imnet.py` | `demo_inference_imnet.py` | `./checkpoints/fsg_vit_imagenette_demo.pth` |
 
+ > ⚠️ These demos use reduced datasets and epochs to run quickly and demonstrate the API.
 
  ---
 
+ ## 📦 Project Structure
 
  ```
  .
+ ├── vit_with_fsg.py # FSG-ViT integration
  ├── demo_training_mnist.py
  ├── demo_inference_mnist.py
  ├── demo_training_imnet.py
  ├── demo_inference_imnet.py
+ ├── checkpoints/ # Model weights (optional)
+ ├── README.md # This model card
  ```
 
  ---
 
  ## 📚 Citation
 
+ If you use this project, please cite our work:
 
  ```bibtex
  @inproceedings{roffo2024FSG,
 
@@ -137,20 +111,7 @@
 
  ## 📬 Contact
 
+ **Giorgio Roffo**
  📧 giorgio.roffo@gmail.com
+ 🏢 Cosmo Intelligent Medical Devices (IMD), Lainate, Italy
+ 🔗 [github.com/cosmoimd/feature-selection-gates](https://github.com/cosmoimd/feature-selection-gates)