A Bayesian change-point approach to nanopore basecalling

S. Shen, G. Sofronov

Research output: Chapter in Book/Report/Conference proceeding › Conference proceeding contribution › peer-review


Abstract

Understanding the genetic makeup of organisms is an important goal in bioinformatics. DNA sequencing, the process of determining the order of the nucleotide bases in DNA, can now be performed quickly and cheaply with commercially available devices no bigger than a USB stick. The latest DNA sequencers use nanopore technologies to capture long, repetitive DNA structures with great success; however, the reported read accuracy still needs improving. One main source of error arises during basecalling, when the raw nanopore signals output by the sequencers are translated into genetic codes. The distinctive feature of basecalling is that the nanopore signals need not only to be segmented but also to be grouped into four types, each representing a genetic code. In this paper, we propose a novel basecalling algorithm using change-point detection methods and Markov chain Monte Carlo (MCMC) sampling techniques. We use real and simulated data to demonstrate the effectiveness of the proposed algorithm.
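To make the two tasks named in the abstract concrete (segmenting the signal, then grouping segments into four types), the following is a minimal toy sketch and not the authors' algorithm, whose details are not given in this record. It assumes a piecewise-constant signal with Gaussian noise, four known mean levels (the values in `LEVELS`, the noise scale `SIGMA`, and the prior change probability `P_CHANGE` are all hypothetical), and a simple Metropolis-Hastings sampler that flips one change-point indicator at a time. Repeated adjacent bases produce no level change and are not handled here.

```python
import numpy as np

# Hypothetical mean signal level per base; real nanopore levels differ.
LEVELS = {"A": 0.0, "C": 10.0, "G": 20.0, "T": 30.0}
SIGMA = 1.0      # assumed noise standard deviation
P_CHANGE = 0.05  # assumed prior probability of a change between two samples


def segments(cps, n):
    """Turn a boolean change-point vector (length n - 1) into (start, end) pairs."""
    bounds = [0, *(np.flatnonzero(cps) + 1), n]
    return list(zip(bounds[:-1], bounds[1:]))


def segment_loglik(y):
    """Log marginal likelihood of one segment, averaged over the four levels.

    The Gaussian normalising constant is dropped: it is the same for every
    change-point configuration because the number of samples is fixed.
    """
    mus = np.array(list(LEVELS.values()))
    per_level = (-0.5 * ((y[None, :] - mus[:, None]) / SIGMA) ** 2).sum(axis=1)
    return np.logaddexp.reduce(per_level) - np.log(len(mus))


def log_posterior(y, cps):
    """Unnormalised log posterior of a change-point configuration."""
    loglik = sum(segment_loglik(y[a:b]) for a, b in segments(cps, len(y)))
    k = cps.sum()
    logprior = k * np.log(P_CHANGE) + (len(cps) - k) * np.log(1 - P_CHANGE)
    return loglik + logprior


def sample_changepoints(y, n_iter=3000, seed=0):
    """Metropolis-Hastings over change-point indicators; returns the best state visited."""
    rng = np.random.default_rng(seed)
    cps = np.zeros(len(y) - 1, dtype=bool)
    cur = log_posterior(y, cps)
    best, best_lp = cps.copy(), cur
    for _ in range(n_iter):
        i = rng.integers(len(cps))
        prop = cps.copy()
        prop[i] = not prop[i]  # symmetric proposal: flip one indicator
        new = log_posterior(y, prop)
        if np.log(rng.random()) < new - cur:
            cps, cur = prop, new
            if cur > best_lp:
                best, best_lp = cps.copy(), cur
    return best


def call_bases(y, cps):
    """Assign each segment the base whose level is closest to the segment mean."""
    names = list(LEVELS)
    mus = np.array(list(LEVELS.values()))
    return "".join(
        names[int(np.argmin(np.abs(mus - y[a:b].mean())))]
        for a, b in segments(cps, len(y))
    )


# Simulated example: four bases, 15 samples each.
rng = np.random.default_rng(1)
truth = "GATC"
signal = np.concatenate([LEVELS[b] + SIGMA * rng.standard_normal(15) for b in truth])
cps = sample_changepoints(signal)
boundaries = np.flatnonzero(cps) + 1
bases = call_bases(signal, cps)
print(boundaries, bases)
```

Marginalising over the four levels inside `segment_loglik` lets the sampler explore segmentations without committing to a base per segment; the grouping step in `call_bases` is applied only once, to the best segmentation found. Recomputing the full posterior at every step keeps the sketch short but would be too slow for real read lengths.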
Original language: English
Title of host publication: MODSIM 2023
Subtitle of host publication: Proceedings of the 25th International Congress on Modelling and Simulation
Editors: Jai Vaze, Chris Chilcott, Lindsay Hutley, Susan M. Cuddy
Place of Publication: Canberra
Publisher: Modelling and Simulation Society of Australia and New Zealand Inc. (MSSANZ)
Pages: 855-861
Number of pages: 7
ISBN (Electronic): 9780987214300
Publication status: Published - 2023
Event: The 25th International Congress on Modelling and Simulation - Darwin, Australia
Duration: 9 Jul 2023 to 14 Jul 2023
Conference number: 25th
https://mssanz.org.au/modsim2023/

Conference

Conference: The 25th International Congress on Modelling and Simulation
Abbreviated title: MODSIM2023
Country/Territory: Australia
City: Darwin
Period: 9/07/23 to 14/07/23
Internet address: https://mssanz.org.au/modsim2023/

Bibliographical note

Version archived for private and non-commercial use with the permission of the author/s and according to publisher conditions. For further rights please contact the publisher.

Keywords

  • Nanopore data
  • DNA sequencing
  • segmentation algorithm
  • Markov chain Monte Carlo
  • changepoint detection
  • change-point detection
