Multitask learning for video-based surgical skill assessment

Zhiteng Jian, Wenxi Yue, Qiuxia Wu, Wei Li, Zhiyong Wang, Vincent Lam

Research output: Chapter in Book/Report/Conference proceeding > Conference proceeding contribution > peer-review

Abstract

Surgical skill assessment (SSA) plays a vital role in medical systems for reducing intraoperative surgical errors and improving clinical outcomes. To ensure objective and efficient SSA, many automatic video-based SSA methods have been developed. In particular, various deep learning methods based on CNNs or RNNs have recently been devised for skill assessment tasks (e.g., skill level prediction). Although predicting overall skill levels and assessing detailed attribute-based scores are highly correlated, most existing studies deal with these two tasks separately, without fully exploiting the different information sources encoded in a dataset. In contrast, we propose a novel end-to-end multitask learning framework to conduct skill level classification and attribute score regression jointly. Specifically, our network incorporates two branches for the two tasks, which share earlier layers for feature extraction and hold different prediction layers for specific targets. The shared feature extractor is optimised under the supervision of both tasks simultaneously, encouraging the model to consider information from different aspects and their relatedness to learn richer and more generalised features. In addition, since not every part of a surgical video contributes equally to skill assessment, we enhance the existing I3D feature extractor with a novel Spatio-Temporal Channel Attention Module to emphasise important features. Experimental results on the public JIGSAWS dataset show that our proposed network outperforms state-of-the-art models on both skill classification and score regression tasks.
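
The two-branch design described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: a single dense layer stands in for the shared I3D-based feature extractor, the attention module is omitted, and the joint objective is assumed to be cross-entropy plus a weighted mean squared error, since the abstract does not state the loss. All names (`TwoBranchModel`, `joint_loss`, `reg_weight`) and all sizes are hypothetical.

```python
import math
import random


def linear(x, weights, bias):
    """Dense layer: one output per row of `weights`."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def init(rows, cols, rng):
    """Small random weight matrix and a zero bias vector."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
    return weights, [0.0] * rows


class TwoBranchModel:
    """Shared feature layer feeding a classification head and a regression head."""

    def __init__(self, in_dim, feat_dim, n_levels, n_attributes, seed=0):
        rng = random.Random(seed)
        self.shared = init(feat_dim, in_dim, rng)          # stand-in for the shared extractor
        self.cls_head = init(n_levels, feat_dim, rng)      # skill level classification branch
        self.reg_head = init(n_attributes, feat_dim, rng)  # attribute score regression branch

    def forward(self, clip_features):
        h = relu(linear(clip_features, *self.shared))      # features used by both branches
        level_probs = softmax(linear(h, *self.cls_head))
        attr_scores = linear(h, *self.reg_head)
        return level_probs, attr_scores


def joint_loss(level_probs, attr_scores, true_level, true_scores, reg_weight=1.0):
    """Assumed objective: cross-entropy + reg_weight * mean squared error."""
    ce = -math.log(level_probs[true_level])
    mse = sum((p - t) ** 2 for p, t in zip(attr_scores, true_scores)) / len(true_scores)
    return ce + reg_weight * mse


# Hypothetical sizes: 8 input features, 3 skill levels, 6 attribute scores.
model = TwoBranchModel(in_dim=8, feat_dim=4, n_levels=3, n_attributes=6)
probs, scores = model.forward([0.5] * 8)
loss = joint_loss(probs, scores, true_level=1, true_scores=[3.0] * 6)
```

Because both loss terms depend on the same shared layer, a gradient step on `loss` would update that layer under the supervision of both tasks, which is the mechanism the abstract describes. The weighting `reg_weight` is an assumption introduced here to balance the two terms.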

Original language: English
Title of host publication: 2020 Digital Image Computing
Subtitle of host publication: Techniques and Applications, DICTA 2020
Place of Publication: Piscataway, NJ
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Number of pages: 8
ISBN (Electronic): 9781728191089
ISBN (Print): 9781728191096
Publication status: Published - 29 Nov 2020
Event: 2020 Digital Image Computing: Techniques and Applications, DICTA 2020 - Melbourne, Australia
Duration: 29 Nov 2020 → 2 Dec 2020

Conference

Conference: 2020 Digital Image Computing: Techniques and Applications, DICTA 2020
Country: Australia
City: Melbourne
Period: 29/11/20 → 2/12/20

Keywords

  • attention
  • multitask learning
  • surgical skill assessment
