Automatic landmark localization in 3D medical CT images: few-shot learning through optimized data pre-processing and network design

Yifan Wang*, Thomas Lenarz, Andrej Kral, Samuel John

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding; Conference proceeding contribution; peer-review

Abstract

As surgeons increasingly rely on computed tomography (CT) scans, including low-dose cone beam CT, it has become critical to understand and annotate these medical images effectively, not only for improved treatment planning but also for robotic and implant surgery. In clinical practice, however, accurate localization and annotation of specific landmarks still rely heavily on manual effort. We therefore present an automatic landmark localization pipeline based on few-shot learning, designed specifically for 3D CT scans. In this paper, we focus on the cochlea and its three landmarks (apex point, basal point, round window center). The pipeline combines optimized data processing with a newly designed 2D Convolutional Neural Network (CNN) model. To evaluate the pipeline in a few-shot learning context, we used 31 volumetric CT scans with their corresponding annotated landmarks: 20 volumes were reserved exclusively for testing, and the number of training volumes was varied from 1 to 11. We compared models trained on different numbers of CT volumes. We first used the weighted F1 score to evaluate landmark classification performance on the extracted 2D sub-images. A model trained on only 5 CT volumes achieved its peak performance, with median weighted F1 scores of around 0.99 (apex), 0.985 (basal), 0.98 (round window center), and 0.98 (background). Given a manually provided initial point near the cochlea, all three landmarks were then localized automatically within the 3D CT volume using a sliding window approach. Compared with the manually defined ground truth, the 5-volume-trained model attained an average Euclidean distance error of 0.70 mm (apex), 1.15 mm (basal), and 0.84 mm (round window center) on the 3D CT volumes of the test set. These results demonstrate the efficiency and accuracy of the pipeline.
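The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names, the integer class encoding, and the voxel spacing used in the usage note are assumptions made for illustration, and the weighted F1 shown is the common support-weighted mean of per-class F1 scores, which may differ from the exact protocol used in the paper.

```python
import numpy as np


def weighted_f1(y_true, y_pred, n_classes):
    """Support-weighted mean of per-class F1 scores (hypothetical helper).

    Classes are assumed to be encoded as integers 0..n_classes-1, e.g.
    apex, basal, round window center, background.
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    total = 0.0
    for c in range(n_classes):
        tp = np.sum((y_true == c) & (y_pred == c))
        fp = np.sum((y_true != c) & (y_pred == c))
        fn = np.sum((y_true == c) & (y_pred != c))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        f1 = 2 * tp / denom if denom > 0 else 0.0
        support = np.sum(y_true == c)
        total += f1 * support
    return float(total / len(y_true))


def landmark_error_mm(pred_voxel, true_voxel, spacing_mm):
    """Euclidean distance in mm between two landmarks given in voxel indices.

    spacing_mm is the physical voxel size along each axis (an assumed input;
    the abstract does not state the scan resolution).
    """
    diff = np.asarray(pred_voxel, dtype=float) - np.asarray(true_voxel, dtype=float)
    return float(np.linalg.norm(diff * np.asarray(spacing_mm, dtype=float)))
```

For example, with an assumed isotropic spacing of 0.5 mm, `landmark_error_mm((10, 10, 10), (13, 14, 10), (0.5, 0.5, 0.5))` gives 2.5 mm. Converting the voxel offset to millimetres before taking the norm matters whenever the spacing is anisotropic.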

Original language: English
Title of host publication: ICBRA 2023 - Proceedings of the 2023 10th International Conference on Bioinformatics Research and Applications
Place of Publication: New York
Publisher: Association for Computing Machinery
Pages: 1-7
Number of pages: 7
ISBN (Electronic): 9798400708152
Publication status: Published - 22 Sept 2023
Externally published: Yes
Event: 10th International Conference on Bioinformatics Research and Applications, ICBRA 2023 - Barcelona, Spain
Duration: 22 Sept 2023 to 24 Sept 2023

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 10th International Conference on Bioinformatics Research and Applications, ICBRA 2023
Country/Territory: Spain
City: Barcelona
Period: 22/09/23 to 24/09/23

Keywords

  • Artificial neural network
  • Automatic landmark localization
  • CNN
  • Data augmentation
  • Medical image processing
