SynRS3D: A Synthetic Dataset for Global 3D Semantic Understanding from Monocular Remote Sensing Imagery
Authors:
Jian Song¹,², Hongruixuan Chen¹, Weihao Xuan¹,², Junshi Xia², Naoto Yokoya¹,²
¹ The University of Tokyo
² RIKEN AIP
Conference: Neural Information Processing Systems (Spotlight), 2024
For more details, please refer to our paper and visit our GitHub repository.
Overview
TL;DR:
We release two models, one for height estimation and one for land cover mapping, trained on the SynRS3D dataset with our domain adaptation method, RS3DAda.
- Encoder: Vision Transformer (ViT-L), pretrained with DINOv2
- Decoder: DPT, trained from scratch
The models are intended for large-scale, global 3D semantic understanding from high-resolution remote sensing imagery and can be used out of the box or fine-tuned for related applications; a minimal usage sketch is given below.
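As a quick start, the sketch below shows how the DINOv2 ViT-L backbone can be loaded with the Hugging Face transformers library and used to extract patch features from one image tile. This is only an illustration under stated assumptions: the checkpoint file name (`rs3dada_height.pth`) and the commented-out weight-loading step are hypothetical placeholders, and the DPT decoder itself is not defined here; please follow the GitHub repository for the exact loading code.

```python
# Minimal sketch, not the official loading code: instantiate the DINOv2 ViT-L
# encoder and extract patch features from one image tile. The checkpoint path
# and the weight-loading step below are hypothetical placeholders.
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModel

processor = AutoImageProcessor.from_pretrained("facebook/dinov2-large")
encoder = AutoModel.from_pretrained("facebook/dinov2-large")  # ViT-L backbone
encoder.eval()

# Hypothetical: load the released RS3DAda weights (encoder + DPT decoder).
# state_dict = torch.load("rs3dada_height.pth", map_location="cpu")

image = Image.open("tile.png").convert("RGB")  # a high-resolution RS image tile
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    # Patch tokens that a DPT-style decoder would turn into a height map
    # or a land cover map.
    features = encoder(**inputs).last_hidden_state

print(features.shape)  # (1, num_patches + 1, 1024) for ViT-L
```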
How to Cite
If you find the RS3DAda model useful in your research, please consider citing:
@article{song2024synrs3d,
  title={SynRS3D: A Synthetic Dataset for Global 3D Semantic Understanding from Monocular Remote Sensing Imagery},
  author={Song, Jian and Chen, Hongruixuan and Xuan, Weihao and Xia, Junshi and Yokoya, Naoto},
  journal={arXiv preprint arXiv:2406.18151},
  year={2024}
}
Contact
For any questions or feedback, please reach out via email at song@ms.k.u-tokyo.ac.jp.
We hope you enjoy using the pretrained RS3DAda models!