GAMLSS and extended Cross-Entropy method to detect multiple change-points in DNA read count data

M. W. J. R. Priyadarshana, Georgy Sofronov

    Research output: Chapter in Book/Report/Conference proceeding (Conference proceeding contribution)

    Abstract

    We model DNA read count data obtained through next generation sequencing (NGS) technologies as a multiple change-point process. This means that the data are divided into different segments based on the number of change-points. Each segment of the process is modeled using the zero-inflated negative binomial (ZINB) distribution, as well as the negative binomial (NB) distribution, in the generalized additive models for location, scale and shape (GAMLSS) framework. We observe that the ZINB and NB based models fit the data better than the competing Poisson model, because the observed read counts are highly overdispersed as well as zero-inflated. Moreover, we consider incorporating auxiliary information through the GAMLSS framework to further improve the change-point modelling process. The extended Cross-Entropy (CE) method, which uses a four-parameter beta distribution, is used to estimate the number of change-points as well as their corresponding genome locations. Furthermore, because the procedures are highly computationally intensive, a parallel implementation significantly improves the total running time. We apply the proposed methodology to find change-points in DNA read count data obtained through Illumina TruSeq exome capture of patients with celiac disease. Our results suggest that the proposed GAMLSS based CE method is an effective methodology to detect change-points in genome-wide data.
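    The following is a minimal sketch of the idea behind a CE search over change-point locations, not the authors' implementation. It makes several simplifying assumptions: it looks for a single change-point only, scores segments with a plain NB likelihood whose parameters are set by the method of moments (no ZINB component, no GAMLSS fit, no auxiliary information), and refits the beta sampling distribution on a fixed support by moment matching. The function names nb_loglik and ce_single_changepoint and all tuning parameters are hypothetical.

```python
import math
import random


def nb_loglik(counts):
    """NB log-likelihood of one segment; mean and size set by method of moments."""
    n = len(counts)
    mean = sum(counts) / n
    if mean <= 0:
        return 0.0  # all-zero segment: treated as degenerate at zero
    var = sum((c - mean) ** 2 for c in counts) / max(n - 1, 1)
    # Size r solves var = mean + mean^2 / r; use a large r (near-Poisson)
    # when the segment is not overdispersed.
    r = mean * mean / (var - mean) if var > mean else 1e6
    log_p = math.log(r / (r + mean))
    log_1mp = math.log(mean / (r + mean))
    return sum(
        math.lgamma(c + r) - math.lgamma(r) - math.lgamma(c + 1)
        + r * log_p + c * log_1mp
        for c in counts
    )


def ce_single_changepoint(counts, n_samples=200, elite_frac=0.1, n_iter=30, seed=0):
    """Cross-Entropy search for one change-point index k (segments [:k], [k:])."""
    rng = random.Random(seed)
    lo, hi = 2, len(counts) - 2  # support of the scaled (four-parameter) beta
    alpha, beta = 1.0, 1.0       # start from a uniform sampling distribution
    best_score, best_k = -math.inf, lo
    for _ in range(n_iter):
        draws = []
        for _ in range(n_samples):
            u = rng.betavariate(alpha, beta)
            k = int(round(lo + (hi - lo) * u))
            score = nb_loglik(counts[:k]) + nb_loglik(counts[k:])
            draws.append((score, k, u))
        draws.sort(reverse=True)
        elite = draws[: max(2, int(elite_frac * n_samples))]
        if elite[0][0] > best_score:
            best_score, best_k = elite[0][0], elite[0][1]
        # Refit the beta shape parameters to the elite draws by moment matching.
        us = [u for _, _, u in elite]
        m = sum(us) / len(us)
        v = max(sum((u - m) ** 2 for u in us) / len(us), 1e-6)
        common = m * (1 - m) / v - 1
        if common <= 0:
            break  # sampling distribution has collapsed; stop early
        alpha, beta = m * common, (1 - m) * common
    return best_k
```

    Calling ce_single_changepoint(counts) on a list of non-negative integer read counts returns the best split index found. Estimating the number of change-points, as the abstract describes, would need an outer model-selection step that this sketch omits.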
    Original language: English
    Title of host publication: Proceedings of the 28th International Workshop on Statistical Modelling
    Editors: V. M. R. Muggeo, V. Capursi, G. Boscaino, G. Lovison
    Place of Publication: Palermo, Italy
    Publisher: Università di Palermo
    Pages: 453-457
    Number of pages: 5
    Volume: 1
    ISBN (Print): 9788896251478
    Publication status: Published - 2013
    Event: International Workshop on Statistical Modelling (28th : 2013) - Palermo, Italy
    Duration: 8 Jul 2013 to 12 Jul 2013

    Workshop

    Workshop: International Workshop on Statistical Modelling (28th : 2013)
    City: Palermo, Italy
    Period: 8/07/13 to 12/07/13

    Keywords

    • GAMLSS
    • Cross-Entropy Method
    • Change-Point Modelling
    • Combinatorial Optimization

    Cite this

    Priyadarshana, M. W. J. R., & Sofronov, G. (2013). GAMLSS and extended Cross-Entropy method to detect multiple change-points in DNA read count data. In V. M. R. Muggeo, V. Capursi, G. Boscaino, & G. Lovison (Eds.), Proceedings of the 28th International Workshop on Statistical Modelling (Vol. 1, pp. 453-457). Palermo, Italy: Università di Palermo.