In recent years, rapid advances in robot technology have propelled embodied intelligence to new heights, enabling intelligent agents to perceive their surroundings, acquire information, and execute tasks autonomously. Embodied intelligent robots, which integrate artificial intelligence into physical platforms, are increasingly deployed in diverse fields such as construction sites, security patrols, emergency response, and industrial automation. A critical enabler for these applications is Simultaneous Localization and Mapping (SLAM), the foundational technology that allows a robot to navigate unknown environments by estimating its pose and building a map in real time. Traditional SLAM algorithms assume a static world, yet real scenes are often dynamic: moving objects such as pedestrians, vehicles, and animals degrade localization accuracy and leave residual artifacts in the map. This review surveys research on 3D LiDAR SLAM in dynamic environments, focusing on methods for detecting and removing dynamic objects, strategies for handling varying degrees of dynamics, evaluation metrics, datasets, and future directions. I will also examine how advances in robot technology are addressing these challenges, emphasizing the integration of semantic segmentation, ray tracing, and visibility-based approaches to improve robustness.

The core of this review lies in the methodologies for dynamic object removal from LiDAR point clouds. Dynamic objects can severely compromise a SLAM system by corrupting pose estimation and leaving ghosting artifacts in the map. Based on their detection principle, I categorize these methods into three main types: semantic segmentation-based, ray tracing-based, and visibility-based approaches. Semantic segmentation methods leverage clustering and deep learning to identify and eliminate dynamic objects such as pedestrians and vehicles; networks such as FlowNet3D and SalsaNext, for instance, segment point clouds by estimating scene flow or through encoder-decoder structures, but they depend on labeled datasets and may generalize poorly. Ray tracing techniques instead use voxel structures such as OctoMap to count laser hits and pass-throughs per cell, flagging transiently occupied cells as dynamic; this is effective but computationally expensive (a sketch of the counting principle follows Table 1). Visibility-based methods exploit a simple geometric test: a point observed closer along a laser beam than a corresponding static surface is likely dynamic, whether it is a scan point occluding the known map or a stale map point that newer beams now pass through. Algorithms such as Removert and RF-LIO apply this test with multi-resolution range images to distinguish and remove dynamic elements, though they may misclassify ground points or fail in occluded scenarios. Throughout this discussion, I will highlight how robot technology is evolving to integrate these methods into practical SLAM frameworks.
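To make the visibility test concrete, below is a minimal sketch of the map-cleaning direction of the idea: flagging stale map points that current beams pass through. This is not the published implementation of Removert or RF-LIO; the projection resolution, vertical field of view, and the `margin` threshold are illustrative assumptions, and both point sets are assumed to be expressed in the current scan's sensor frame.

```python
import numpy as np

def spherical_projection(points, h_res, v_res, v_fov):
    """Project 3D points (sensor frame) onto a range-image grid."""
    r = np.linalg.norm(points, axis=1)
    yaw = np.arctan2(points[:, 1], points[:, 0])
    pitch = np.arcsin(np.clip(points[:, 2] / np.maximum(r, 1e-9), -1.0, 1.0))
    u = ((yaw + np.pi) / (2.0 * np.pi) * h_res).astype(int) % h_res
    v = ((v_fov[1] - pitch) / (v_fov[1] - v_fov[0]) * v_res).astype(int)
    return u, np.clip(v, 0, v_res - 1), r

def flag_dynamic_map_points(map_pts, scan_pts, h_res=1800, v_res=64,
                            v_fov=(np.radians(-25.0), np.radians(3.0)),
                            margin=0.4):
    """Mask over map_pts: True where the current scan measures a noticeably
    larger range through the same pixel, i.e. the beam passed through the
    stored point's former location, so that point has likely moved."""
    su, sv, sr = spherical_projection(scan_pts, h_res, v_res, v_fov)
    scan_img = np.full((v_res, h_res), np.inf)
    np.minimum.at(scan_img, (sv, su), sr)  # keep nearest return per pixel
    mu, mv, mr = spherical_projection(map_pts, h_res, v_res, v_fov)
    scan_r = scan_img[mv, mu]
    # Compare only against pixels the scan actually observed.
    return np.isfinite(scan_r) & (scan_r - mr > margin)
```

In practice, Removert iterates such comparisons across multiple range-image resolutions, reverting static points that a single coarse pass would wrongly remove.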
To provide a structured overview, I summarize the key dynamic point cloud removal methods in Table 1, which outlines each method's innovation and limitations. The table serves as a reference for the trade-offs in computational efficiency, accuracy, and applicability to dynamic environments.
| Year | Author | Innovation | Limitations |
|---|---|---|---|
| 2009 | Petrovskaya et al. | 2D bounding box modeling | Poor applicability to pedestrians and cyclists |
| 2010 | Shackleton et al. | 3D grid-based spatial segmentation | Unsuitable for mobile 3D LiDAR sensors |
| 2012 | Litomisky et al. | VFH for dynamic cluster separation | Prone to missing outlier dynamic points |
| 2018 | Ruchti et al. | Neural network for dynamic probability estimation | Cannot detect untrained objects |
| 2019 | Liu et al. | End-to-end scene flow estimation | Lacks integration of motion information |
| 2019 | Cortinhal et al. | Uncertainty-aware semantic segmentation | Depends on manually labeled training data |
| 2019 | Milioto et al. | Distance image and CNN fusion | Low accuracy for small targets |
| 2020 | Zhou et al. | Asymmetric residual blocks and dimension decomposition | Unable to achieve real-time performance |
| 2021 | Wang et al. | Spatial attention mechanism for segmentation | Pose deviation in the Z-direction |
| 2022 | Kim et al. | Fusion of motion and semantic features | Difficulty detecting small dynamic objects |
| 2022 | Mersch et al. | Sparse 4D convolution for spatiotemporal features | Relies on high-quality labeled datasets |
| 2022 | Sun et al. | No semantic information required for optimization | Performance drop with additional datasets |
| 2022 | Li et al. | Multi-scale interaction network | Cannot integrate multiple temporal information |
| 2024 | Han et al. | Polar cylindrical balanced random sampling | Performance degradation with distance |
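Returning to the ray-tracing family discussed above, the following is a minimal sketch of the hit-versus-pass-through counting principle behind OctoMap-style removal. It is not OctoMap's actual implementation: the oversampled beam stepping, the voxel size, and the `free_ratio` threshold are illustrative assumptions, and each scan is assumed to be a pair of a world-frame sensor origin and world-frame hit points.

```python
import numpy as np
from collections import defaultdict

def beam_voxels(origin, endpoint, voxel):
    """Voxels traversed by one beam, via oversampled stepping; production
    systems such as OctoMap use exact ray traversal instead."""
    vec = endpoint - origin
    n = max(int(np.linalg.norm(vec) / (0.5 * voxel)), 1)
    for t in np.arange(n) / n:  # samples stop short of the endpoint
        yield tuple(np.floor((origin + t * vec) / voxel).astype(int))

def transient_voxels(scans, voxel=0.2, free_ratio=0.8):
    """Count hits vs. pass-throughs per voxel over many scans; voxels that
    were mostly traversed as free space held only transient (dynamic) hits."""
    hits, passes = defaultdict(int), defaultdict(int)
    for origin, points in scans:
        for p in points:
            for v in beam_voxels(origin, p, voxel):
                passes[v] += 1
            hits[tuple(np.floor(p / voxel).astype(int))] += 1
    return {v for v, h in hits.items()
            if passes.get(v, 0) / (passes.get(v, 0) + h) > free_ratio}
```

The per-beam traversal also makes the cost profile explicit: work grows with beam length and shrinks with voxel size, which is why these methods are accurate but expensive.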
In the context of robot technology, how a SLAM framework handles a dynamic object depends on its degree of mobility. I classify objects into four categories: high-dynamic (e.g., moving vehicles), low-dynamic (e.g., temporarily stationary pedestrians), semi-dynamic (e.g., chairs or parked cars that move between sessions), and static (e.g., buildings). Correspondingly, SLAM strategies include online real-time processing for high-dynamic objects, offline post-processing for low-dynamic ones, and lifelong SLAM for semi-dynamic objects that change across sessions. Online methods such as RF-LIO and Dynamic-LIO tightly couple LiDAR with inertial measurement units (IMUs) to remove dynamic points during scan matching, reducing pose drift in real time. Offline approaches such as ERASOR and Removert leverage historical data to refine static maps, albeit with higher latency. Lifelong SLAM, exemplified by frameworks like LT-mapper, continuously updates maps to adapt to environmental change, ensuring long-term consistency; a compact summary of this class-to-strategy mapping is sketched below. These strategies underscore the importance of robot technology in enabling autonomous navigation through dynamic scenes.
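As a restatement of this taxonomy, the mapping from mobility class to processing strategy can be written down directly; the class names and strategy descriptions below merely echo the text above and are not an established API.

```python
from enum import Enum, auto

class Dynamics(Enum):
    HIGH = auto()    # moving vehicles, walking pedestrians
    LOW = auto()     # temporarily stationary pedestrians
    SEMI = auto()    # parked cars, furniture: change between sessions
    STATIC = auto()  # buildings, road surfaces

STRATEGY = {
    Dynamics.HIGH:   "online removal during scan matching (e.g., RF-LIO, Dynamic-LIO)",
    Dynamics.LOW:    "offline post-processing of the map (e.g., ERASOR, Removert)",
    Dynamics.SEMI:   "lifelong / multi-session map updating (e.g., LT-mapper)",
    Dynamics.STATIC: "retain as long-term map structure",
}
```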
Evaluation metrics are crucial for assessing the performance of dynamic SLAM algorithms. Common indicators include Absolute Trajectory Error (ATE) and Relative Pose Error (RPE) for localization accuracy, as well as precision, recall, Preservation Rate (PR), and Rejection Rate (RR) for map quality. For example, ATE measures the root mean square error between estimated and ground truth trajectories, calculated as:
$$ATE = \sqrt{\frac{1}{M} \sum_{i=1}^{M} \|\Delta x_i\|^2}$$
where $\Delta x_i = x_i - \Delta R_i \hat{x}_i'$, with $M$ the number of states, $x_i$ the true pose, $\hat{x}_i'$ the estimated pose, and $\Delta R_i$ the rotation aligning the estimated trajectory to the ground-truth frame. Similarly, RPE evaluates drift over trajectory segments, while PR and RR are defined as:
$$PR = \frac{P_{ss}}{P_{is}} \times 100\%$$
$$RR = \left(1 - \frac{P_{id}}{P_{sd}}\right) \times 100\%$$
where $P_{ss}$ is the number of static points preserved in the final map, $P_{is}$ the number of static points in the initial map, $P_{id}$ the number of dynamic points remaining after removal, and $P_{sd}$ the total number of dynamic points. These formulas provide the quantitative basis for validating dynamic SLAM in robot technology.
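These metrics translate directly into code. The sketch below assumes the two trajectories are already associated state-by-state and that the alignment rotation $\Delta R_i$ has already been applied to the estimates (e.g., via a Umeyama fit); the function names are my own.

```python
import numpy as np

def ate_rmse(gt_xyz, est_xyz):
    """ATE: RMSE of per-state position error, assuming est_xyz is already
    aligned to the ground-truth frame (the Delta R_i in the text)."""
    diff = np.asarray(gt_xyz) - np.asarray(est_xyz)
    return float(np.sqrt(np.mean(np.sum(diff ** 2, axis=1))))

def preservation_rate(p_ss, p_is):
    """PR: percentage of the initial static points kept in the final map."""
    return p_ss / p_is * 100.0

def rejection_rate(p_id, p_sd):
    """RR: percentage of dynamic points successfully removed."""
    return (1.0 - p_id / p_sd) * 100.0
```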
Datasets play a vital role in benchmarking dynamic SLAM algorithms. I summarize commonly used datasets in Table 2, which include KITTI, Semantic-KITTI, NCLT, and others, providing diverse scenarios for testing robot technology in dynamic environments. These datasets offer labeled point clouds and trajectories, facilitating the development of robust SLAM systems.
| Name | Year | Scene | Characteristics |
|---|---|---|---|
| KITTI | 2012 | Outdoor | Multi-traffic environment for robot performance assessment |
| NCLT | 2016 | Indoor + Outdoor | Dynamic objects and long-term changes in complex settings |
| Semantic-KITTI | 2019 | Outdoor | Rich environmental context with semantic labels |
| UrbanLoco | 2020 | Outdoor | Large-scale urban localization in dense scenes |
| UrbanNav | 2021 | Outdoor | Precise positioning with low-cost sensors in urban canyons |
| DOALS | 2021 | Indoor | Dynamic pedestrian changes and related objects |
| Dynablox | 2023 | Indoor + Outdoor | Challenging elements like dynamic objects and weather variations |
| Flatbed | 2024 | Indoor + Outdoor | Multi-sensor data including LiDAR and cameras |
Looking ahead, the future of dynamic LiDAR SLAM in robot technology is poised for significant advancements through deep learning integration, multi-sensor fusion, and lightweight, scalable designs. Deep learning methods, such as 3D object detection networks, can improve the accuracy of dynamic object removal by directly processing point clouds before registration, reducing interference in high-dynamic environments like construction sites. However, challenges remain in detecting small targets, such as workers, which require enhanced feature extraction techniques. Multi-sensor fusion, combining LiDAR with cameras and IMUs, will enhance robustness in complex terrains. For instance, fusing visual data with LiDAR point clouds can provide texture information and improve loop closure, addressing limitations of single-sensor systems. Lightweight algorithms are essential for resource-constrained platforms, focusing on efficient memory usage and real-time performance. Moreover, lifelong SLAM approaches must evolve to handle multi-session mapping, enabling robots to adapt to long-term environmental changes. These trends emphasize the role of robot technology in pushing the boundaries of autonomous navigation.
In conclusion, dynamic LiDAR SLAM is a critical component of embodied intelligent robots, enabling precise localization and mapping in real-world dynamic settings. Through semantic segmentation, ray tracing, and visibility-based removal, along with strategies tailored to each object's degree of dynamics, robot technology continues to overcome challenges such as residual artifacts and pose error. Evaluation metrics and datasets provide the foundation for benchmarking, while future directions point toward deeper integration of learning-based methods, multi-sensor fusion, and lightweight, scalable designs. As robot technology advances, these innovations will empower robots to operate autonomously in increasingly complex and dynamic environments, driving progress across applications. This review underscores the importance of ongoing research to refine SLAM algorithms so that they meet the demands of modern robotics.