Domain-specific introduction to machine learning terminology, pitfalls and opportunities in CRISPR-based gene editing

Aidan R. O'Brien, Gaetan Burgio, Denis C. Bauer*

*Corresponding author for this work

    Research output: Contribution to journal, Review article, peer-reviewed


    Abstract

    The use of machine learning (ML) has become prevalent in the genome engineering space, with applications ranging from predicting target site efficiency to forecasting the outcome of repair events. However, jargon and ML-specific accuracy measures have made it hard to assess the validity of individual approaches, potentially leading to misinterpretation of ML results. This review aims to close that gap by discussing ML approaches and pitfalls in the context of CRISPR gene-editing applications. Specifically, we address common considerations, such as algorithm choice, as well as problems, such as overestimated accuracy and limited data interoperability, by providing tangible examples from the genome-engineering domain. Equipping researchers with the knowledge to use ML effectively, both to design better gene-editing experiments and to predict experimental outcomes, will help advance the field more rapidly.

    Original language: English
    Pages (from-to): 308-314
    Number of pages: 7
    Journal: Briefings in Bioinformatics
    Volume: 22
    Issue number: 1
    Publication status: Published - 18 Jan 2021

    Bibliographical note

    Copyright the Author(s) 2020. Version archived for private and non-commercial use with the permission of the author/s and according to publisher conditions. For further rights please contact the publisher.

    Keywords

    • CRISPR
    • data mining
    • feature selection
    • genome engineering
    • machine learning
