LiDAR-based 3D Perception from Multi-frame Point Clouds for Autonomous Driving

dc.contributor.author: Huang, Chengjie
dc.date.accessioned: 2025-05-13T19:29:35Z
dc.date.available: 2025-05-13T19:29:35Z
dc.date.issued: 2025-05-13
dc.date.submitted: 2025-04-08
dc.description.abstract: 3D perception is a critical component of autonomous driving systems, where accurately detecting objects and understanding the surrounding environment are essential for safety. Recent advances in Light Detection and Ranging (LiDAR) technology and deep neural network architectures have enabled state-of-the-art (SOTA) methods to achieve high performance on 3D object detection and segmentation tasks. Many approaches leverage the sequential nature of LiDAR data by aggregating multiple consecutive scans into dense multi-frame point clouds. However, the challenges and applications of multi-frame point clouds have not been fully explored. This thesis makes three key contributions to advance the understanding and application of multi-frame point clouds in 3D perception tasks. First, we address the limitations of multi-frame point clouds in 3D object detection. Specifically, we observe that increasing the number of aggregated frames yields diminishing returns and can even degrade performance, because different objects respond differently to the number of aggregated frames. To overcome this trade-off, we propose an efficient adaptive method termed Variable Aggregation Detection (VADet). Instead of aggregating the entire scene with a fixed number of frames, VADet performs aggregation per object, with the number of frames determined by the object's observed properties, such as speed and point density. This adaptive approach avoids the inherent trade-offs of fixed aggregation, improving detection accuracy. Next, we tackle the challenge of applying multi-frame point clouds to 3D semantic segmentation. Point-wise prediction on dense multi-frame point clouds can be computationally expensive, especially for SOTA transformer-based architectures. To address this issue, we propose MFSeg, an efficient multi-frame 3D semantic segmentation framework.
MFSeg aggregates point cloud sequences at the feature level and regularizes the feature extraction and aggregation process to reduce computational overhead without compromising accuracy. Additionally, by employing a lightweight MLP-based point decoder, MFSeg eliminates the need to upsample redundant points from past frames, further improving efficiency. Finally, we explore the use of multi-frame point clouds for cross-sensor domain adaptation. Based on the observation that multi-frame aggregation weakens the distinct LiDAR scan patterns of stationary objects, we propose Stationary Object Aggregation Pseudo-labelling (SOAP) to generate high-quality pseudo-labels for 3D object detection in a target domain. In contrast to the current SOTA in-domain practice of aggregating a few input frames, SOAP utilizes entire sequences of point clouds to effectively reduce the sensor domain gap.
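To make the per-object aggregation idea behind VADet concrete, here is a minimal Python sketch. The function names, speed thresholds, and axis-aligned box representation are illustrative assumptions, not the thesis's actual implementation: slow or stationary objects accumulate many frames for denser points, while fast objects use short windows to avoid motion smearing.

```python
# Illustrative sketch only: names, thresholds, and the box representation
# below are assumptions, not VADet's actual method.

def frames_for_object(speed_mps, max_frames=16):
    """Pick a per-object aggregation window from observed speed:
    slower objects tolerate longer windows, faster objects get fewer
    frames so their points do not smear across space."""
    if speed_mps < 0.5:       # near-stationary: aggregate aggressively
        return max_frames
    if speed_mps < 5.0:       # slow-moving: moderate window
        return max_frames // 2
    return 2                  # fast-moving: short window

def in_box(point, box):
    """Axis-aligned containment; box = ((xmin, ymin, zmin), (xmax, ymax, zmax))."""
    lo, hi = box
    return all(l <= c <= h for c, l, h in zip(point, lo, hi))

def aggregate_object(frames, box, speed_mps, max_frames=16):
    """Concatenate the object's points from the last n frames,
    where n is chosen per object rather than fixed for the scene."""
    n = frames_for_object(speed_mps, max_frames)
    return [p for frame in frames[-n:] for p in frame if in_box(p, box)]
```

In this sketch a fast object draws points only from its two most recent frames, whereas a parked vehicle would accumulate points from the full window, which mirrors the speed/density-dependent trade-off the abstract describes.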
dc.identifier.uri: https://hdl.handle.net/10012/21724
dc.language.iso: en
dc.pending: false
dc.publisher: University of Waterloo
dc.subject: autonomous driving
dc.subject: LiDAR-based perception
dc.subject: 3D object detection
dc.subject: 3D semantic segmentation
dc.title: LiDAR-based 3D Perception from Multi-frame Point Clouds for Autonomous Driving
dc.type: Doctoral Thesis
uws-etd.degree: Doctor of Philosophy
uws-etd.degree.department: David R. Cheriton School of Computer Science
uws-etd.degree.discipline: Computer Science
uws-etd.degree.grantor: University of Waterloo
uws-etd.embargo.terms: 0
uws.contributor.advisor: Czarnecki, Krzysztof
uws.contributor.affiliation1: Faculty of Mathematics
uws.peerReviewStatus: Unreviewed
uws.published.city: Waterloo
uws.published.country: Canada
uws.published.province: Ontario
uws.scholarLevel: Graduate
uws.typeOfResource: Text

Files

Original bundle

Name: Huang_Chengjie.pdf
Size: 16.66 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 6.4 KB
Format: Item-specific license agreed upon to submission