TY - GEN
T1 - Classification of neuroblastoma histopathological images using machine learning
AU - Panta, Adhish
AU - Khushi, Matloob
AU - Naseem, Usman
AU - Kennedy, Paul
AU - Catchpoole, Daniel
PY - 2020
Y1 - 2020
N2 - Neuroblastoma is the most common cancer in young children accounting for over 15% of deaths in children due to cancer. Identification of the class of neuroblastoma is dependent on histopathological classification performed by pathologists which are considered the gold standard. However, due to the heterogeneous nature of neuroblast tumours, the human eye can miss critical visual features in histopathology. Hence, the use of computer-based models can assist pathologists in classification through mathematical analysis. There is no publicly available dataset containing neuroblastoma histopathological images. So, this study uses dataset gathered from The Tumour Bank at Kids Research at The Children’s Hospital at Westmead, which has been used in previous research. Previous work on this dataset has shown maximum accuracy of 84%. One main issue that previous research fails to address is the class imbalance problem that exists in the dataset as one class represents over 50% of the samples. This study explores a range of feature extraction and data undersampling and over-sampling techniques to improve classification accuracy. Using these methods, this study was able to achieve accuracy of over 90% in the dataset. Moreover, significant improvements observed in this study were in the minority classes where previous work failed to achieve high level of classification accuracy. In doing so, this study shows importance of effective management of available data for any application of machine learning.
AB - Neuroblastoma is the most common cancer in young children accounting for over 15% of deaths in children due to cancer. Identification of the class of neuroblastoma is dependent on histopathological classification performed by pathologists which are considered the gold standard. However, due to the heterogeneous nature of neuroblast tumours, the human eye can miss critical visual features in histopathology. Hence, the use of computer-based models can assist pathologists in classification through mathematical analysis. There is no publicly available dataset containing neuroblastoma histopathological images. So, this study uses dataset gathered from The Tumour Bank at Kids Research at The Children’s Hospital at Westmead, which has been used in previous research. Previous work on this dataset has shown maximum accuracy of 84%. One main issue that previous research fails to address is the class imbalance problem that exists in the dataset as one class represents over 50% of the samples. This study explores a range of feature extraction and data undersampling and over-sampling techniques to improve classification accuracy. Using these methods, this study was able to achieve accuracy of over 90% in the dataset. Moreover, significant improvements observed in this study were in the minority classes where previous work failed to achieve high level of classification accuracy. In doing so, this study shows importance of effective management of available data for any application of machine learning.
UR - http://www.scopus.com/inward/record.url?scp=85097443010&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-63836-8_1
DO - 10.1007/978-3-030-63836-8_1
M3 - Conference proceeding contribution
SN - 9783030638351
T3 - Lecture Notes in Computer Science
SP - 3
EP - 14
BT - Neural Information Processing
A2 - Yang, Haiqin
A2 - Pasupa, Kitsuchart
A2 - Leung, Andrew Chi-Sing
A2 - Kwok, James T.
A2 - Chan, Jonathan H.
A2 - King, Irwin
PB - Springer, Springer Nature
CY - Cham
T2 - 27th International Conference on Neural Information Processing, ICONIP 2020
Y2 - 18 November 2020 through 22 November 2020
ER -