SynRS3D: A Synthetic Dataset for Global 3D Semantic Understanding from Monocular Remote Sensing Imagery

Authors:
Jian Song¹,², Hongruixuan Chen¹, Weihao Xuan¹,², Junshi Xia², Naoto Yokoya¹,²

¹ The University of Tokyo
² RIKEN AIP

Conference: Neural Information Processing Systems (Spotlight), 2024

For more details, please refer to our paper and visit our GitHub repository.


Overview

TL;DR:
We are excited to release two high-performing models for height estimation and land cover mapping. These models were trained on the SynRS3D dataset using our novel domain adaptation method, RS3DAda.

  • Encoder: Vision Transformer (ViT-L), pretrained with DINOv2
  • Decoder: DPT, trained from scratch

These models are designed for large-scale global 3D semantic understanding from high-resolution monocular remote sensing imagery, producing per-pixel height estimates and land cover maps. Feel free to integrate them into your own projects.
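
For orientation, here is a minimal sketch of the encoder side of this architecture. It is an illustration, not our release code: it pulls the generic DINOv2 ViT-L/14 weights from torch.hub and extracts the multi-scale patch features that a DPT-style decoder would consume. Loading the actual RS3DAda checkpoint on top of this skeleton is omitted, since the checkpoint file names are not listed here.

import torch

# Load the generic DINOv2 ViT-L/14 backbone published by Meta AI via torch.hub.
# (This is not the RS3DAda checkpoint; the released weights would be loaded
# on top of this encoder/decoder skeleton.)
encoder = torch.hub.load("facebookresearch/dinov2", "dinov2_vitl14")
encoder.eval()

# Input height and width must be multiples of the 14-pixel patch size; 518 = 37 * 14.
image = torch.randn(1, 3, 518, 518)

with torch.no_grad():
    # get_intermediate_layers() returns patch features from the last n blocks;
    # a DPT-style decoder fuses several such feature maps into a dense
    # height or land cover prediction.
    features = encoder.get_intermediate_layers(image, n=4, reshape=True)

for f in features:
    print(f.shape)  # torch.Size([1, 1024, 37, 37]) for each selected block

In a full pipeline, these four feature maps would feed the DPT reassemble and fusion blocks, which progressively upsample them back to the input resolution for dense prediction.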


How to Cite

If you find the RS3DAda model useful in your research, please consider citing:

@article{song2024synrs3d,
  title={SynRS3D: A Synthetic Dataset for Global 3D Semantic Understanding from Monocular Remote Sensing Imagery},
  author={Song, Jian and Chen, Hongruixuan and Xuan, Weihao and Xia, Junshi and Yokoya, Naoto},
  journal={arXiv preprint arXiv:2406.18151},
  year={2024}
}

Contact

For any questions or feedback, please reach out via email at song@ms.k.u-tokyo.ac.jp.

We hope you enjoy using the pretrained RS3DAda models!
