MAELi: Masked Autoencoder for Large-Scale LiDAR Point Clouds

Research output: Chapter in Book/Report/Conference proceeding › Conference paper › peer-review

Abstract

The sensing process of large-scale LiDAR point clouds inevitably causes large blind spots, i.e. regions not visible to the sensor. We demonstrate how these inherent sampling properties can be effectively utilized for self-supervised representation learning by designing a highly effective pre-training framework that considerably reduces the need for tedious 3D annotations to train state-of-the-art object detectors. Our Masked AutoEncoder for LiDAR point clouds (MAELi) intuitively leverages the sparsity of LiDAR point clouds in both the encoder and decoder during reconstruction. This results in more expressive and useful initialization, which can be directly applied to downstream perception tasks, such as 3D object detection or semantic segmentation for autonomous driving. In a novel reconstruction approach, MAELi distinguishes between empty and occluded space and employs a new masking strategy that targets the LiDAR's inherent spherical projection. Thereby, without any ground truth whatsoever and trained on single frames only, MAELi obtains an understanding of the underlying 3D scene geometry and semantics. To demonstrate the potential of MAELi, we pre-train backbones in an end-to-end manner and show the effectiveness of our unsupervised pre-trained weights on the tasks of 3D object detection and semantic segmentation.
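As a rough illustration of what masking along the LiDAR's spherical projection could look like, the sketch below bins points by azimuth and elevation as seen from the sensor origin and hides all points in a random subset of the occupied angular cells. It is not the authors' implementation: the abstract does not specify the procedure, and the function names (`spherical_cell`, `mask_by_angular_cells`), the grid resolution, and the mask ratio are all assumptions made for this example.

```python
import math
import random


def spherical_cell(point, n_azimuth=64, n_elevation=16):
    """Map an (x, y, z) point to an (azimuth, elevation) grid cell index.

    Hypothetical helper: the grid resolution is an assumption, not a value
    taken from the paper.
    """
    x, y, z = point
    azimuth = math.atan2(y, x)                    # in [-pi, pi]
    elevation = math.atan2(z, math.hypot(x, y))   # in [-pi/2, pi/2]
    az = min(int((azimuth + math.pi) / (2 * math.pi) * n_azimuth), n_azimuth - 1)
    el = min(int((elevation + math.pi / 2) / math.pi * n_elevation), n_elevation - 1)
    return az, el


def mask_by_angular_cells(points, mask_ratio=0.5, n_azimuth=64, n_elevation=16, seed=0):
    """Split points into (kept, masked) by hiding random occupied angular cells."""
    rng = random.Random(seed)

    # Group points by the angular cell they fall into.
    cells = {}
    for p in points:
        cells.setdefault(spherical_cell(p, n_azimuth, n_elevation), []).append(p)

    # Choose a fraction of the occupied cells to mask.
    occupied = sorted(cells)
    n_masked = round(mask_ratio * len(occupied))
    masked_cells = set(rng.sample(occupied, n_masked))

    kept = [p for c in occupied if c not in masked_cells for p in cells[c]]
    masked = [p for c in occupied if c in masked_cells for p in cells[c]]
    return kept, masked


if __name__ == "__main__":
    gen = random.Random(1)
    cloud = [(gen.uniform(-50, 50), gen.uniform(-50, 50), gen.uniform(-3, 3))
             for _ in range(1000)]
    kept, masked = mask_by_angular_cells(cloud, mask_ratio=0.5)
    print(len(kept), "points kept;", len(masked), "points masked")
```

Masking whole angular cells rather than individual points or Cartesian voxels removes contiguous ranges of beam directions, which is one plausible reading of a strategy that "targets the LiDAR's inherent spherical projection". In a pre-training setup, the kept points would be fed to the encoder and the masked points would serve as reconstruction targets.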
Translated title of the contribution: MAELi: Maskierter Autoencoder für umfangreiche LiDAR-Punktwolken
Original language: English
Title of host publication: Proceedings - 2024 IEEE Winter Conference on Applications of Computer Vision, WACV 2024
Pages: 3371-3380
Number of pages: 10
ISBN (Electronic): 9798350318920
Publication status: Published - 3 Jan 2024
Event: 2024 IEEE/CVF Winter Conference on Applications of Computer Vision: WACV 2024 - Waikoloa, United States
Duration: 4 Jan 2024 to 8 Jan 2024

Conference

Conference: 2024 IEEE/CVF Winter Conference on Applications of Computer Vision
Abbreviated title: WACV 2024
Country/Territory: United States
City: Waikoloa
Period: 4/01/24 to 8/01/24

Keywords

  • autonomous driving
  • self-supervised learning
  • 3D object detection
  • 3D semantic segmentation
  • representation learning
  • Algorithms: Machine learning architectures, formulations, and algorithms
  • Algorithms: 3D computer vision

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Vision and Pattern Recognition
  • Computer Science Applications
