Abstract
Understanding the genetic makeup of organisms is an important goal in bioinformatics. DNA sequencing, the process of determining the order of the nucleotide bases in DNA, can now be performed quickly and cheaply with commercially available devices no bigger than a USB stick. The latest DNA sequencers use nanopore technologies to capture long, repetitive DNA structures with great success; however, the reported reading accuracy needs improving. One main source of error arises during basecalling, when the raw nanopore signals output by the sequencers are translated into genetic codes. The distinctive feature of basecalling is that the nanopore signals must not only be segmented but also grouped into four types, each representing a genetic code. In this paper, we propose a novel basecalling algorithm using change-point detection methods and Markov chain Monte Carlo (MCMC) sampling techniques. We use real and simulated data to demonstrate the effectiveness of the proposed algorithm.
Original language | English |
---|---|
Title of host publication | MODSIM 2023 |
Subtitle of host publication | Proceedings of the 25th International Congress on Modelling and Simulation |
Editors | Jai Vaze, Chris Chilcott, Lindsay Hutley, Susan M. Cuddy |
Place of Publication | Canberra |
Publisher | Modelling and Simulation Society of Australia and New Zealand Inc. (MSSANZ) |
Pages | 855-861 |
Number of pages | 7 |
ISBN (Electronic) | 9780987214300 |
Publication status | Published - 2023 |
Event | The 25th International Congress on Modelling and Simulation, Darwin, Australia; Duration: 9 Jul 2023 → 14 Jul 2023; Conference number: 25th; https://mssanz.org.au/modsim2023/ |
Conference
Conference | The 25th International Congress on Modelling and Simulation |
---|---|
Abbreviated title | MODSIM2023 |
Country/Territory | Australia |
City | Darwin |
Period | 9/07/23 → 14/07/23 |
Internet address | https://mssanz.org.au/modsim2023/ |
Bibliographical note
Version archived for private and non-commercial use with the permission of the author/s and according to publisher conditions. For further rights please contact the publisher.
Keywords
- Nanopore data
- DNA sequencing
- segmentation algorithm
- Markov chain Monte Carlo
- changepoint detection
- change-point detection