Magslam: multi-modal adaptive generator-based semantic SLAM for enhanced robustness in dynamic environments

Lei Zhang*, Xiaohan Yu

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-review

Abstract

The increasing deployment of robot navigation and mapping systems in autonomous driving, intelligent manufacturing, and indoor services highlights the growing importance of visual simultaneous localization and mapping (SLAM) technologies. However, traditional SLAM methods that rely on geometric feature matching often suffer from poor performance in dynamic indoor environments due to unstable feature correspondences caused by moving objects. These limitations lead to degraded localization accuracy and, in extreme cases, system failure. To address these challenges, this paper presents a novel adaptive RGB-D SLAM algorithm that integrates multi-modal generator-based semantic segmentation. Our approach combines RGB and depth data using a multi-modal prompt generator (MPG) and a multi-modal feature adapter (MFA) to achieve robust, high-precision segmentation. The segmentation results are further refined using a motion-level initialization and cross-frame propagation mechanism, which effectively filters out dynamic disturbances. By incorporating weighted static constraints in the pose optimization process, our method enhances pose estimation accuracy and overall system robustness. Extensive experiments conducted on public datasets demonstrate that our approach significantly outperforms traditional and state-of-the-art SLAM systems, offering a promising solution for SLAM in dynamic environments.
Original language: English
Article number: 437
Pages (from-to): 1-16
Number of pages: 16
Journal: Complex and Intelligent Systems
Volume: 11
Issue number: 10
Publication status: Published - Oct 2025

Bibliographical note

Copyright the Author(s) 2025. Version archived for private and non-commercial use with the permission of the author/s and according to publisher conditions. For further rights please contact the publisher.

Keywords

  • RGB-D SLAM
  • Multi-modal semantic segmentation
  • Dynamic environments
  • Robust pose optimization
