Abstract
We present a novel pipeline for supervising road segmentation that uses a state-of-the-art vision transformer to tackle two critical challenges: generalizing a segmentation model worldwide and training with low-fidelity labels. Specifically, we fine-tune a Segment Anything Model on road segmentation tasks to generate accurate pseudo-labels from OpenStreetMap road centerline prompts. These pseudo-labels are then used to fine-tune a OneFormer model, pre-trained on publicly available high-fidelity labels from existing aerial and satellite imagery datasets, to improve its generalization capability. Experimental results show that the application scope of a single binary segmentation model can be extended to extract roads anywhere in the world without additional manual annotation, while achieving performance comparable to the state of the art.
Original language | English |
---|---|
Title | German Conference on Pattern Recognition (GCPR) |
Pages | 1-15 |
Number of pages | 15 |
Publication status | Published - 1 Sept 2024 |
Event | German Conference on Pattern Recognition and the International Symposium on Vision, Modeling, and Visualization, GCPR-VMV 2024 - Munich, Germany. Duration: 10 Sept 2024 → 13 Sept 2024 |
Conference
Conference | German Conference on Pattern Recognition and the International Symposium on Vision, Modeling, and Visualization, GCPR-VMV 2024 |
---|---|
Country/Territory | Germany |
Location | Munich |
Period | 10/09/24 → 13/09/24 |