TY - JOUR
T1 - Learning deep asymmetric tolerant part representation
AU - Yu, Xiaohan
AU - Zhao, Yang
AU - Gao, Yongsheng
AU - Xiong, Shengwu
PY - 2023/12
Y1 - 2023/12
N2 - Categorizing objects at a subordinate level inevitably poses a significant challenge, i.e., the interclass difference is very subtle and exists only in a few key parts. Therefore, how to localize these key parts for discriminative visual categorization without requiring expensive pixel-level annotations becomes a core question. To that end, this article introduces a novel asymmetric tolerant part segmentation network (ATP-Net). The ATP-Net simultaneously learns to segment parts and identify objects in an end-to-end manner using only image-level category labels. Given the intrinsic asymmetry property of part alignment, a desirable learning of part segmentation should be capable of incorporating such a property. Despite the efforts toward regularizing weakly supervised part segmentation, none of them considers this vital and intrinsic property, i.e., the spatial asymmetry of part alignment. Our work, for the first time, proposes to explicitly characterize the spatial asymmetry of part alignment for visual tasks. We propose a novel asymmetry loss function to guide the part segmentation by encoding the spatial asymmetry of part alignment, i.e., restricting the upper bound of how asymmetric those self-similar parts are to each other in the network learning. Via a comprehensive ablation study, we verify the effectiveness of the proposed ATP-Net in driving the network learning toward semantically meaningful part segmentation and discriminative visual categorization. Consistently, superior/competitive performance is reported on 12 datasets covering crop cultivar classification, plant disease classification, bird/butterfly species classification, large-scale natural image classification, attribute recognition, and landmark localization.
AB - Categorizing objects at a subordinate level inevitably poses a significant challenge, i.e., the interclass difference is very subtle and exists only in a few key parts. Therefore, how to localize these key parts for discriminative visual categorization without requiring expensive pixel-level annotations becomes a core question. To that end, this article introduces a novel asymmetric tolerant part segmentation network (ATP-Net). The ATP-Net simultaneously learns to segment parts and identify objects in an end-to-end manner using only image-level category labels. Given the intrinsic asymmetry property of part alignment, a desirable learning of part segmentation should be capable of incorporating such a property. Despite the efforts toward regularizing weakly supervised part segmentation, none of them considers this vital and intrinsic property, i.e., the spatial asymmetry of part alignment. Our work, for the first time, proposes to explicitly characterize the spatial asymmetry of part alignment for visual tasks. We propose a novel asymmetry loss function to guide the part segmentation by encoding the spatial asymmetry of part alignment, i.e., restricting the upper bound of how asymmetric those self-similar parts are to each other in the network learning. Via a comprehensive ablation study, we verify the effectiveness of the proposed ATP-Net in driving the network learning toward semantically meaningful part segmentation and discriminative visual categorization. Consistently, superior/competitive performance is reported on 12 datasets covering crop cultivar classification, plant disease classification, bird/butterfly species classification, large-scale natural image classification, attribute recognition, and landmark localization.
UR - http://www.scopus.com/inward/record.url?scp=85142839726&partnerID=8YFLogxK
UR - http://purl.org/au-research/grants/arc/DP180100958
U2 - 10.1109/TAI.2022.3222644
DO - 10.1109/TAI.2022.3222644
M3 - Article
SN - 2691-4581
VL - 4
SP - 1789
EP - 1801
JO - IEEE Transactions on Artificial Intelligence
JF - IEEE Transactions on Artificial Intelligence
IS - 6
ER -