GridNet-HD Baseline: Late Fusion MLP on Dual Softmax Outputs
Overview
This repository provides an implementation of a simple Multi-Layer Perceptron (MLP) baseline for late fusion of two LiDAR softmax outputs. Before using this baseline, the results from the two other baselines are required. This repository includes:
- Per-zone preprocessing of two LiDAR softmax files (`image_vote` and `spt`) into combined feature tensors
- A lightweight `SimpleMLP` model that concatenates the two softmax vectors per point
- Training, validation and inference loops
- Weights & Biases integration for real-time experiment tracking
This implementation serves as one of the official baselines for GridNet-HD.
Table of Contents
- Project Structure
- Configuration
- Environment
- Dataset Structure
- Installation
- Supported Modes
- Results
- Pretrained Weights
- Usage Examples
- Weights & Biases Integration
- License
- Contact
- Citation
Project Structure
```
project_root/
├── main.py                              # Entry point (all modes)
├── config.yaml                          # Configuration parameters
├── dataset/
│   ├── lidar_dataset.py                 # Dataset for training/validation/test
│   └── preprocess_multi_processing.py   # Prepare files for training
├── las_utils/
│   ├── io.py                            # .las reading
│   └── matching.py                      # Nearest-neighbor matching
├── model/
│   └── model.py                         # SimpleMLP definition
├── train/
│   ├── train.py                         # Train and eval loop
│   └── test.py                          # Inference function for test split
├── utils/
│   ├── logging_utils.py
│   └── metrics.py                       # Compute all metrics
├── requirements.txt                     # Python dependencies
└── README.md                            # This file
```
Configuration
All training and evaluation settings are stored in a single file: `config.yaml`.
Key sections
dataset
Controls data loading, preprocessing, and class remapping.
- `root`: path to the folder containing all raw zones
- `split_file`: path to a JSON file defining train/val/test splits
- `n_classes`: number of target classes
- `voxel_size`: downsampling voxel size (used for KDTree)
- `pre-processing_num_workers`: parallelism for data preprocessing
- `max_point_per_class`: maximum number of points sampled per class during training
- `class_map`: label remapping rules (original → new class)
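A minimal sketch of how such a `class_map` could be applied to raw labels; the mapping values and the `ignore_index` below are placeholders, not taken from the repository (the real remapping happens in `dataset/preprocess_multi_processing.py`):

```python
import numpy as np

class_map = {0: 0, 1: 1, 5: 2, 6: 2}   # hypothetical original -> new label mapping
ignore_index = 255                      # assumed "ignore" value

def remap_labels(labels: np.ndarray) -> np.ndarray:
    """Remap labels via a lookup table; labels absent from class_map become ignore_index."""
    size = max(int(labels.max()), max(class_map)) + 1
    lut = np.full(size, ignore_index, dtype=np.int64)
    for src, dst in class_map.items():
        lut[src] = dst
    return lut[labels]

print(remap_labels(np.array([0, 1, 5, 6, 9])))   # -> [0 1 2 2 255]
```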
training
Hyperparameters and runtime configuration.
- `batch_size`, `epochs`, `learning_rate`, `weight_decay`
- `lr_step_size`, `lr_gamma`: learning rate scheduler (StepLR)
- `device`: `"cuda"` or `"cpu"`
model
Defines the MLP structure.
- `hidden_dims`: list of layer widths
- `ignore_index`: label to ignore during loss computation
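For orientation, a sketch of what a per-point MLP over the two concatenated softmax vectors can look like. The actual definition lives in `model/model.py`; the hidden sizes, activation, and loss shown here are assumptions:

```python
import torch.nn as nn

class SimpleMLP(nn.Module):
    """Per-point MLP over the concatenated ImageVote and SPT softmax vectors (sketch)."""
    def __init__(self, n_classes: int, hidden_dims=(128, 64)):
        super().__init__()
        layers, in_dim = [], 2 * n_classes          # two softmax vectors concatenated per point
        for h in hidden_dims:
            layers += [nn.Linear(in_dim, h), nn.ReLU()]
            in_dim = h
        layers.append(nn.Linear(in_dim, n_classes)) # per-point class logits
        self.net = nn.Sequential(*layers)

    def forward(self, x):                            # x: (n_points, 2 * n_classes)
        return self.net(x)

# Loss with the configured ignore label, e.g.:
# criterion = nn.CrossEntropyLoss(ignore_index=255)
```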
logging
Output and checkpoint configuration.
- `save_dir`: where to store logs and model weights
- `save_freq`: save checkpoint every N epochs
wandb
Weights & Biases experiment tracking.
- `project`: W&B project name
- `entity`: your W&B team or username
Environment
| Component    | Details                  |
|--------------|--------------------------|
| GPU          | NVIDIA A40 (48 GB VRAM)  |
| CUDA Version | 12.x                     |
| OS           | Ubuntu 22.04 LTS         |
| RAM          | 256 GB                   |
Dataset Structure
The structure of the GridNet-HD dataset remains the same (see the GridNet-HD dataset for more information). Raw zones (36 folders) are completed with the results from the two other baselines (soft-log LiDAR from ImageVote and SPT):
```
/path/to/data/
├── t1z5b/
│   ├── lidar_softmax_image_vote/t1z4_with_softmax.las   # LiDAR with soft-log from ImageVote baseline
│   ├── lidar_softmax_spt/t1z4_with_softmax.las          # LiDAR with soft-log from SPT baseline
│   └── lidar/t1z4.las                                   # ground-truth
├── …
└── split.json                                           # maps zones → train/val/test
```
After preprocessing:
```
/path/to/data/preprocessed/
├── t1z4.pt    # contains {features and labels}
├── t1z5a.pt
└── …
```
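A quick sketch for inspecting one preprocessed zone; the exact dictionary keys are an assumption based on the "{features and labels}" description above:

```python
import torch

data = torch.load("/path/to/data/preprocessed/t1z4.pt")
features, labels = data["features"], data["labels"]   # assumed key names
print(features.shape)   # expected (n_points, 2 * n_classes): ImageVote softmax + SPT softmax
print(labels.shape)     # expected (n_points,)
```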
Installation
Clone the repository:
```bash
git clone https://github.com/your-org/baseline_fusion_mlp.git
cd baseline_fusion_mlp
```
Create a conda virtual environment:
```bash
conda create -n gridnet_hd_mlp python=3.12
conda activate gridnet_hd_mlp
```
Install dependencies:
```bash
pip install --upgrade pip
pip install -r requirements.txt
```
Supported Modes
Use `--mode` in `main.py`:
| Mode         | Description                                  |
|--------------|----------------------------------------------|
| `preprocess` | Convert all zones .las → .pt with remapping  |
| `train`      | Train SimpleMLP on train split               |
| `val`        | Validate model on val split                  |
| `test`       | Evaluate on test split                       |
Results
The following table summarizes the per-class Intersection over Union (IoU) scores on the test set at the 3D level for the best model.
| Class                  | IoU (Test set) (%) |
|------------------------|--------------------|
| Pylon                  | 94.82 |
| Conductor cable        | 94.40 |
| Structural cable       | 82.52 |
| Insulator              | 86.98 |
| High vegetation        | 83.08 |
| Low vegetation         | 47.64 |
| Herbaceous vegetation  | 80.75 |
| Rock, gravel, soil     | 42.89 |
| Impervious soil (Road) | 80.26 |
| Water                  | 61.69 |
| Building               | 61.40 |
| Mean IoU (mIoU)        | 74.22 |
Pretrained Weights
Checkpoints for the best-performing model (mIoU = 74.22%) are available directly in the repository.
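A sketch of loading the released checkpoint for evaluation; the constructor arguments and the checkpoint layout (plain state_dict vs. wrapped dict) are assumptions and should be matched to your `config.yaml`:

```python
import torch
from model.model import SimpleMLP   # repository module, see the project structure above

model = SimpleMLP(n_classes=11, hidden_dims=[128, 64])    # placeholder arguments
state = torch.load("best_model.pt", map_location="cpu")
if isinstance(state, dict) and "model_state_dict" in state:
    state = state["model_state_dict"]                      # unwrap a full training checkpoint
model.load_state_dict(state)
model.eval()
```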
Usage Examples
Before training the model, use the `preprocess` mode and configure the `config.yaml` file accordingly.
Preprocessing
```bash
python main.py --mode preprocess --config config.yaml
```
This will concatenate features from SPT soft-log and ImageVote soft-log, apply remapping, and prepare files for training.
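An illustrative sketch of the nearest-neighbor step that aligns the two point clouds before concatenation (the repository's version lives in `las_utils/matching.py`); the array names below are hypothetical:

```python
import numpy as np
from scipy.spatial import cKDTree

def match_nearest(xyz_ref: np.ndarray, xyz_other: np.ndarray) -> np.ndarray:
    """For each reference point, return the index of its nearest neighbor in the other cloud."""
    _, idx = cKDTree(xyz_other).query(xyz_ref, k=1)
    return idx

# idx = match_nearest(xyz_lidar, xyz_spt)
# features = np.concatenate([softmax_image_vote, softmax_spt[idx]], axis=1)
```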
Training
```bash
python main.py --mode train --config config.yaml
```
Trains the MLP late fusion model using the dataset and settings defined in `config.yaml`. Checkpoints and logs are saved under `logging.save_dir`.
Validation
```bash
python main.py --mode val --config config.yaml --weights best_model.pt
```
Evaluates the model on the validation set and prints out per-class IoUs and mIoU.
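The metrics themselves are computed in `utils/metrics.py`; as a generic sketch, per-class IoU and mIoU follow the standard confusion-matrix formula:

```python
import numpy as np

def per_class_iou(conf_mat: np.ndarray) -> np.ndarray:
    """IoU per class from a confusion matrix: TP / (TP + FP + FN)."""
    tp = np.diag(conf_mat).astype(float)
    fp = conf_mat.sum(axis=0) - tp
    fn = conf_mat.sum(axis=1) - tp
    denom = tp + fp + fn
    return np.where(denom > 0, tp / np.maximum(denom, 1), np.nan)

# ious = per_class_iou(conf_mat)
# miou = float(np.nanmean(ious))
```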
Test (LAS export)
```bash
python main.py --mode test --config config.yaml --weights best_model.pt
```
Runs inference on the test set and exports the original `.las` files with the `classification` field, which contains the predicted class label for each point.
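A sketch of what that export step can look like with `laspy` (the actual writer is in `train/test.py`); the file names and the `preds` array are placeholders:

```python
import laspy
import numpy as np

las = laspy.read("t1z4.las")
preds = np.zeros(len(las.points), dtype=np.uint8)   # per-point predicted labels (placeholder)
las.classification = preds
las.write("t1z4_pred.las")
```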
Weights & Biases Integration
- Login: `wandb login`
- Set `logging.wandb.project` and `logging.wandb.entity` in `config.yaml`.
All training and validation metrics will be tracked live.
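A minimal sketch of the W&B calls involved; the project, entity, and logged values below are placeholders and should match the `wandb` block of `config.yaml`:

```python
import wandb

run = wandb.init(project="gridnet-hd-mlp-fusion", entity="your-team")
wandb.log({"train/loss": 0.42, "val/mIoU": 0.74, "epoch": 1})   # placeholder metrics
run.finish()
```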
License
This project is released under the MIT License.
Contact
For questions, issues, or contributions, please open an issue on the repository.
Citation
If you use this repo in research, please cite:
GridNet-HD: A High-Resolution Multi-Modal Dataset for LiDAR-Image Fusion on Power Line Infrastructure
Masked Authors
Submitted to NeurIPS 2025.