PeptideMind - applying machine learning algorithms to assess replicate quality in shotgun proteomic data

D. C. L. Handler, P. A. Haynes*

*Corresponding author for this work

Research output: Contribution to journal > Article > peer-review

Abstract

Assessment of replicate quality is an important step in any shotgun proteomics experiment. One fundamental question in proteomics data analysis is whether any specific replicates in a set of analyses are biasing the downstream comparative quantitation. In this paper, we present an experimental method to address this concern. PeptideMind uses a series of clustering machine learning algorithms to identify outliers when comparing proteomics data from two states with six replicates each. The program is a JVM-native application written in the Kotlin language, with Python sub-process calls to scikit-learn. By permuting the six replicates provided for each state into four hundred non-redundant pairwise comparisons of replicate triplets, PeptideMind determines whether any one replicate is biasing the downstream quantitation of the states. In addition, PeptideMind generates visual representations of the spread of the significance measures, giving researchers a rapid, effective way to monitor the quality of the proteins identified as differentially expressed between sample states.
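The figure of four hundred comparisons is consistent with pairing every three-replicate subset of one state with every three-replicate subset of the other: C(6,3) = 20 triplets per state, and 20 × 20 = 400 pairs. The sketch below illustrates that counting with the Python standard library. It is a reading of the abstract, not PeptideMind's own code; the function name `triplet_comparisons` and the replicate labels `A1`..`A6` and `B1`..`B6` are hypothetical.

```python
from itertools import combinations, product


def triplet_comparisons(state_a, state_b, size=3):
    """Pair every `size`-replicate subset of state A with every such subset of state B.

    Assumption: this is how the 400 comparisons in the abstract arise;
    the abstract does not spell out the enumeration.
    """
    triplets_a = list(combinations(state_a, size))
    triplets_b = list(combinations(state_b, size))
    return list(product(triplets_a, triplets_b))


# Hypothetical replicate labels for the two sample states.
state_a = [f"A{i}" for i in range(1, 7)]
state_b = [f"B{i}" for i in range(1, 7)]

comparisons = triplet_comparisons(state_a, state_b)
print(len(comparisons))  # 400

# Comparisons whose state-A triplet contains replicate A1.
with_a1 = [pair for pair in comparisons if "A1" in pair[0]]
print(len(with_a1))  # 200
```

Under this reading, each replicate appears in exactly half of the comparisons (C(5,2) = 10 of the 20 triplets for its state), so the comparisons that include a given replicate can be set against the ones that exclude it when checking whether that replicate skews the quantitation.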

Original language: English
Article number: 100644
Number of pages: 6
Journal: SoftwareX
Volume: 13
Publication status: Published - Jan 2021

Bibliographical note

Copyright the Author(s) 2020. Version archived for private and non-commercial use with the permission of the author/s and according to publisher conditions. For further rights please contact the publisher.

Keywords

  • Classification
  • Data quality
  • Data validation
  • False discovery
  • Kotlin
  • Label-free shotgun proteomics
  • Machine learning
  • Protein quantitation
  • Spectral counting
  • Statistics
