Abstract
Semantic segmentation of remote sensing imagery faces several challenges: scale variation, inter-class similarity coexisting with intra-class diversity, difficulty in modeling long-range dependencies, and shadow occlusion. Slender structures and complex boundaries are particularly hard to segment, especially in high-resolution imagery acquired by satellite and aerial cameras, UAV-borne optical sensors, and other imaging payloads. These sensing systems deliver large-area coverage at fine ground sampling distance, which magnifies domain shifts across sensors and acquisition conditions. This work builds on DeepLabV3+ and proposes complementary improvements at three stages: input, context, and decoder fusion. First, to reduce the interference of complex, heterogeneous data distributions with network optimization, a feature-mapping network projects raw images into a simpler distribution before they are fed into the segmentation backbone, which eases training and improves feature separability. Second, although Atrous Spatial Pyramid Pooling (ASPP) aggregates multi-scale context, it is insufficient for modeling long-range dependencies; a routing-style global modeling module is therefore added after ASPP to strengthen global relation modeling and ensure cross-region semantic consistency. Third, because the decoder's fusion of shallow details and deep semantics is limited and prone to boundary blurring, a fusion module is designed to enable deep interaction and joint learning through cross-layer feature alignment and coupling. Compared to the baseline, the proposed model improves the mean Intersection over Union (mIoU) by 8.83% on the LoveDA dataset and by 6.72% on the ISPRS Potsdam dataset. Qualitative results further show clearer boundaries and more stable region annotations. The proposed modules are plug-and-play and easy to integrate into camera-based remote sensing pipelines and other imaging-sensor systems, providing a practical accuracy–efficiency trade-off.
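For readers unfamiliar with the reported metric, the following is a minimal sketch of how mIoU is commonly computed from a confusion matrix. It is not the authors' evaluation code: the function names, the `ignore_index` parameter, and the choice to skip classes that appear in neither the labels nor the predictions are assumptions made for illustration, and the abstract does not specify the evaluation protocol.

```python
from typing import List, Optional, Sequence


def confusion_matrix(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    num_classes: int,
    ignore_index: Optional[int] = None,  # hypothetical "unlabeled" value
) -> List[List[int]]:
    """Count pixels per (true class, predicted class) pair."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        if t == ignore_index:
            continue  # skip pixels without a valid label
        cm[t][p] += 1
    return cm


def mean_iou(cm: List[List[int]]) -> float:
    """Average per-class IoU = TP / (TP + FP + FN) over classes that occur."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # true c, predicted other
        fp = sum(cm[r][c] for r in range(n)) - tp   # predicted c, true other
        denom = tp + fp + fn
        if denom > 0:  # assumption: classes absent everywhere are skipped
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with flattened label maps for three classes.
labels = [0, 0, 1, 1, 2, 2]
preds = [0, 1, 1, 1, 2, 0]
score = mean_iou(confusion_matrix(labels, preds, num_classes=3))
```

In this toy example the per-class IoUs are 1/3, 2/3, and 1/2, so `score` is 0.5. In practice the label maps would be flattened per image and the confusion matrix accumulated over the whole test set before averaging.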
| Original language | English |
|---|---|
| Article number | 1008 |
| Pages (from-to) | 1-20 |
| Number of pages | 20 |
| Journal | Sensors |
| Volume | 26 |
| Issue number | 3 |
| DOIs | |
| Publication status | Published - 1 Feb 2026 |
Bibliographical note
Copyright the Author(s) 2026. Version archived for private and non-commercial use with the permission of the author/s and according to publisher conditions. For further rights please contact the publisher.
Keywords
- remote sensing
- semantic segmentation
- DeepLabV3+
- multi-level feature fusion
- global context modeling
Article title: LGD-DeepLabV3+: an enhanced framework for remote sensing semantic segmentation via multi-level feature fusion and global modeling