An improved hybrid algorithm for multiple change-point detection in array CGH data

G. Y. Sofronov, T. V. Polushina, M. W. Jayawardana

    Research output: Chapter in Book/Report/Conference proceeding › Conference proceeding contribution › peer-review

    Abstract

    The human genome is highly structured. Usually, the structure forms regions having patterns of a specific property. The analysis of biological sequences often involves measurements of gene expression levels. When these observations are ordered by their location on the genome, the values form clouds with different observed means, supposedly reflecting different mean levels. The statistical analysis of these sequences aims at finding chromosomal regions with "abnormal" (increased or decreased) mean levels. Identifying genomic regions associated with systematic aberrations provides insights into the initiation and progression of a disease, and improves diagnosis, prognosis and therapy strategies.
    In this paper, we present a further extension of our work, in which we proposed a two-stage hybrid algorithm to identify structural patterns in genomic sequences. At the first stage of the algorithm, an efficient sequential change-point detection procedure (for example, the Shiryaev-Roberts procedure or the cumulative sum control chart (CUSUM) procedure) is applied. The obtained change-point locations are then used to initialize the Cross-Entropy (CE) algorithm, an evolutionary stochastic optimization method that estimates both the number of change-points and their corresponding locations. The first stage of the algorithm is very sensitive to the choice of thresholds, so identifying optimal thresholds increases the accuracy of the results and further improves the efficiency of the algorithm. In this study, we propose an improved hybrid algorithm for change-point detection, which uses optimal thresholds for the sequential change-point detection procedure and the CE method to obtain more precise estimates. To illustrate the usefulness of the algorithm, we compare the proposed hybrid algorithms on both artificially generated data and real aCGH experimental data. Our results show that the proposed methodologies are effective in detecting multiple change-points in biological sequences.
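As a rough illustration of the first stage only, the following is a minimal sketch of a two-sided CUSUM detector for shifts in the mean, restarted after each alarm. It is not the authors' implementation: the function name `cusum_change_points`, the parameters `drift` and `threshold`, their default values, and the use of a running mean as the baseline are all assumptions made for this example. The CE refinement stage is not shown.

```python
def cusum_change_points(x, drift=0.5, threshold=5.0):
    """Two-sided CUSUM for mean shifts, restarted after every alarm.

    Hypothetical sketch: `drift` (allowance) and `threshold` are
    illustrative tuning parameters, not values from the paper.
    Returns the indices at which an alarm is raised.
    """
    alarms = []
    if len(x) == 0:
        return alarms

    mean = float(x[0])   # running estimate of the current segment mean
    n = 1                # number of points in the current segment
    g_pos = 0.0          # statistic for upward shifts
    g_neg = 0.0          # statistic for downward shifts

    for t in range(1, len(x)):
        resid = float(x[t]) - mean
        g_pos = max(0.0, g_pos + resid - drift)
        g_neg = max(0.0, g_neg - resid - drift)

        if g_pos > threshold or g_neg > threshold:
            # Alarm: record the index and restart from this point.
            alarms.append(t)
            mean, n = float(x[t]), 1
            g_pos = g_neg = 0.0
        else:
            # No alarm: fold the point into the running segment mean.
            n += 1
            mean += (float(x[t]) - mean) / n

    return alarms


# Noise-free toy profile with mean shifts at positions 50 and 100.
profile = [0.0] * 50 + [2.0] * 50 + [0.0] * 50
print(cusum_change_points(profile))
```

Each alarm is raised a few positions after the true shift, because the statistic needs several observations to accumulate past the threshold. A lower `threshold` reacts sooner but raises more false alarms on noisy data, which is the sensitivity to threshold selection that the abstract describes. In a hybrid scheme of the kind outlined above, the alarm indices would serve only as starting values for the second-stage optimizer.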
    Original language: English
    Title of host publication: 22nd International Congress on Modelling and Simulation
    Editors: G. Syme, D. Hatton MacDonald, B. Fulton, J. Piantadosi
    Place of Publication: mssanz.org.au
    Publisher: Modelling & Simulation Society Australia & New Zealand
    Pages: 508-514
    Number of pages: 7
    ISBN (Electronic): 9780987214379
    Publication status: Published - 2017
    Event: International Congress on Modelling and Simulation (22nd : 2017) - Hobart, Australia
    Duration: 3 Dec 2017 to 8 Dec 2017

    Conference

    Conference: International Congress on Modelling and Simulation (22nd : 2017)
    Country/Territory: Australia
    City: Hobart
    Period: 3/12/17 to 8/12/17

    Keywords

    • Change-point detection
    • aCGH microarray data
    • CNVs
    • DNA copy number
    • combinatorial optimization
    • Cross-Entropy method
