OmixLitMiner

a bioinformatics tool for prioritizing biological leads from ‘omics data using literature retrieval and data mining

Pascal Steffen, Jemma Wu, Shubhang Hariharan, Hannah Voss, Vijay Raghunath, Mark P. Molloy*, Hartmut Schlüter

*Corresponding author for this work

Research output: Contribution to journal (Article)

Abstract

Proteomics and genomics discovery experiments generate increasingly large result tables, requiring more researcher time to convert the biological data into new knowledge. Literature review is an important step in this process and can be tedious for large-scale experiments. Making an informed and strategic decision about which biomolecule targets to pursue in follow-up experiments thus remains a considerable challenge. To streamline and formalise literature retrieval and analysis for discovery-based ‘omics data, and to support decisions about follow-up experiments, we present OmixLitMiner, a package written in the programming language R. The tool automates the retrieval of literature from PubMed based on UniProt protein identifiers, gene names and their synonyms, combined with a user-defined contextual keyword search (i.e., gene ontology based). The search strategy can be set to either strict or more lenient literature retrieval, and the outputs are assigned to three categories describing how well characterized a regulated gene or protein is. The category helps researchers decide which genes or proteins merit follow-up experiments to gain new knowledge, and which already known biomarkers can be excluded. We demonstrate the tool’s usefulness in this retrospective study assessing three cancer proteomics publications and one cancer genomics publication. Using the tool, we were able to corroborate most of the decisions in these papers as well as detect additional biomolecule leads that may be valuable for future research.
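
The retrieval and categorisation workflow described in the abstract can be illustrated with a minimal sketch. It is written in Python for illustration only and is not the OmixLitMiner R code. The function names, the mapping of "strict" to title-only search and "lenient" to title/abstract search, and the rule used to assign the three categories are all assumptions made for this example; the abstract does not specify them. Only the PubMed field tags `[Title]` and `[Title/Abstract]` are standard PubMed search syntax.

```python
from typing import Iterable


def build_pubmed_query(names: Iterable[str], keywords: Iterable[str],
                       strict: bool = True) -> str:
    """Combine gene/protein names (including synonyms) with context keywords.

    Assumption: "strict" restricts matches to article titles, while
    "lenient" also accepts matches in the abstract.
    """
    field = "[Title]" if strict else "[Title/Abstract]"
    name_part = " OR ".join(f'"{name}"{field}' for name in names)
    keyword_part = " OR ".join(f'"{kw}"{field}' for kw in keywords)
    return f"({name_part}) AND ({keyword_part})"


def assign_category(n_reviews: int, n_other: int) -> int:
    """Map literature counts to one of three categories.

    Hypothetical rule for illustration only:
      1 = well characterised (at least one review article found)
      2 = partly characterised (no reviews, but other publications found)
      3 = uncharacterised in this context (no publications found)
    """
    if n_reviews > 0:
        return 1
    if n_other > 0:
        return 2
    return 3


# Example: a gene name with one synonym and a single context keyword.
query = build_pubmed_query(["TP53", "p53"], ["cancer"], strict=True)
print(query)
print(assign_category(n_reviews=0, n_other=4))
```

In such a scheme, candidates in the least-characterised category would be the most likely to yield new knowledge in follow-up experiments, while those in the best-characterised category correspond to already known biomarkers.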

Original language: English
Article number: 1374
Pages (from-to): 1-11
Number of pages: 11
Journal: International Journal of Molecular Sciences
Volume: 21
Issue number: 4
Publication status: Published - Feb 2020

Bibliographical note

Copyright the Author(s) 2020. Version archived for private and non-commercial use with the permission of the author/s and according to publisher conditions. For further rights please contact the publisher.

Keywords

  • Bioinformatics
  • Data mining
  • Genomics
  • Literature retrieval
  • Mass spectrometry
  • Proteomics
