LGD-DeepLabV3+: an enhanced framework for remote sensing semantic segmentation via multi-level feature fusion and global modeling

Xin Wang*, Xu Liu, Adnan Mahmood, Yaxin Yang, Xipeng Li

*Corresponding author for this work

Research output: Contribution to journal, article, peer-reviewed

Abstract

Remote sensing semantic segmentation encounters several challenges, including scale variation, the coexistence of inter-class similarity and intra-class diversity, difficulties in modeling long-range dependencies, and shadow occlusions. Slender structures and complex boundaries are particularly difficult to segment, especially in high-resolution imagery acquired by satellite and aerial cameras, UAV-borne optical sensors, and other imaging payloads. These sensing systems deliver large-area coverage with fine ground sampling distance, which magnifies domain shifts between different sensors and acquisition conditions. This work builds upon DeepLabV3+ and proposes complementary improvements at three stages: input, context, and decoder fusion. First, to mitigate the interference of complex and heterogeneous data distributions on network optimization, a feature-mapping network is introduced to project raw images into a simpler distribution before they are fed into the segmentation backbone. This approach facilitates training and enhances feature separability. Second, although Atrous Spatial Pyramid Pooling (ASPP) aggregates multi-scale context, it remains insufficient for modeling long-range dependencies. Therefore, a routing-style global modeling module is incorporated after ASPP to strengthen global relation modeling and ensure cross-region semantic consistency. Third, because the fusion between shallow details and deep semantics in the decoder is limited and prone to boundary blurring, a fusion module is designed to facilitate deep interaction and joint learning through cross-layer feature alignment and coupling. The proposed model improves the mean Intersection over Union (mIoU) by 8.83% on the LoveDA dataset and by 6.72% on the ISPRS Potsdam dataset compared to the baseline. Qualitative results further demonstrate clearer boundaries and more stable region annotations, while the proposed modules are plug-and-play and easy to integrate into camera-based remote sensing pipelines and other imaging-sensor systems, providing a practical accuracy–efficiency trade-off.
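
For readers unfamiliar with the metric reported above, the following is a minimal sketch of how mean Intersection over Union is commonly computed from flattened label maps. It is not taken from the paper: the function name `mean_iou`, the `ignore_index` parameter, and the choice to average only over classes that occur in the prediction or the ground truth are illustrative assumptions, and the authors' exact evaluation protocol (for example, accumulating counts over a whole dataset) is not stated in this abstract.

```python
from typing import Optional, Sequence


def mean_iou(
    pred: Sequence[int],
    target: Sequence[int],
    num_classes: int,
    ignore_index: Optional[int] = None,  # hypothetical: label to skip, if any
) -> float:
    """Mean IoU over classes present in the prediction or the ground truth."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes  # pixels where pred == target == c
    pred_count = [0] * num_classes    # pixels predicted as class c
    target_count = [0] * num_classes  # pixels labelled as class c

    for p, t in zip(pred, target):
        if ignore_index is not None and t == ignore_index:
            continue
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        # |A union B| = |A| + |B| - |A intersect B|
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # skip classes absent from both maps
            ious.append(intersection[c] / union)

    if not ious:
        raise ValueError("no valid pixels to evaluate")
    return sum(ious) / len(ious)


if __name__ == "__main__":
    # Four pixels, two classes: class 0 IoU = 1/2, class 1 IoU = 2/3.
    print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))
```

In this sketch the per-class counts come from a single pair of label maps; for a full benchmark the same counts would typically be accumulated across all images before the per-class ratios are taken.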

Original language: English
Article number: 1008
Pages (from-to): 1-20
Number of pages: 20
Journal: Sensors
Volume: 26
Issue number: 3
Publication status: Published - 1 Feb 2026

Bibliographical note

Copyright the Author(s) 2026. Version archived for private and non-commercial use with the permission of the author/s and according to publisher conditions. For further rights please contact the publisher.

Keywords

  • remote sensing
  • semantic segmentation
  • DeepLabV3+
  • multi-level feature fusion
  • global context modeling
