Statistics in proteomics: a meta-analysis of 100 proteomics papers published in 2019

David C. L. Handler, Paul A. Haynes*

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-review


We randomly selected 100 journal articles published in five proteomics journals in 2019 and manually examined each of them against a set of 13 criteria concerning the statistical analyses used, all of which were based on items mentioned in the journals' instructions to authors. The criteria included questions such as whether a pilot study was conducted and whether false discovery rate calculation was employed at either the quantitation or identification stage. These data were then transformed to binary inputs, analyzed via machine learning algorithms, and classified accordingly, with the aim of determining whether clusters of data existed for specific journals or whether certain statistical measures correlated with each other. We applied a variety of classification methods, including principal component analysis decomposition, agglomerative clustering, and multinomial and Bernoulli naïve Bayes classification, and found that none of these could readily determine journal identity given the extracted statistical features. Logistic regression was useful in identifying strong correlations between statistical features, such as false discovery rate criteria and multiple testing correction methods, but was similarly ineffective at determining correlations between statistical features and specific journals. This meta-analysis highlights that a very wide variety of approaches is being used in the statistical analysis of proteomics data, many of which do not conform to published journal guidelines, and that, contrary to implicit assumptions in the field, there are no clear correlations between statistical methods and specific journals.
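As a rough illustration of the kind of analysis described above, the following minimal sketch fits a Bernoulli naïve Bayes classifier to binary criterion vectors and asks it to predict journal identity. It is not the authors' code. The journal names, the function names `fit_bernoulli_nb` and `predict`, the smoothing parameter `alpha`, and the 80/20 split are all assumptions made for illustration, and the data are synthetic random bits rather than the study's data.

```python
import math
import random


def fit_bernoulli_nb(X, y, alpha=1.0):
    """Estimate log priors and smoothed P(feature = 1 | class) for each class."""
    n_features = len(X[0])
    model = {}
    for c in sorted(set(y)):
        rows = [x for x, label in zip(X, y) if label == c]
        log_prior = math.log(len(rows) / len(X))
        # Laplace smoothing keeps every probability strictly between 0 and 1.
        theta = [
            (sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
            for j in range(n_features)
        ]
        model[c] = (log_prior, theta)
    return model


def predict(model, x):
    """Return the class with the highest posterior log-probability for x."""
    best_class, best_score = None, -math.inf
    for c, (log_prior, theta) in model.items():
        score = log_prior + sum(
            math.log(t) if v else math.log(1.0 - t) for v, t in zip(x, theta)
        )
        if score > best_score:
            best_class, best_score = c, score
    return best_class


# Synthetic stand-in data (hypothetical): 100 papers, 5 journals, 13 yes/no criteria.
rng = random.Random(0)
journals = ["J1", "J2", "J3", "J4", "J5"]  # placeholder journal labels
labels = [journals[i % 5] for i in range(100)]
features = [[rng.randint(0, 1) for _ in range(13)] for _ in range(100)]

# Simple hold-out split: first 80 papers for training, last 20 for testing.
model = fit_bernoulli_nb(features[:80], labels[:80])
hits = sum(predict(model, x) == t for x, t in zip(features[80:], labels[80:]))
accuracy = hits / 20
print(f"held-out accuracy: {accuracy:.2f} (chance level is 0.20)")
```

Because the synthetic features are generated independently of the journal labels, any accuracy the sketch reports reflects chance alone. It only demonstrates the mechanics of encoding criteria as binary inputs and testing whether they carry information about journal identity.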

Original language: English
Pages (from-to): 1337-1343
Number of pages: 7
Journal: Journal of the American Society for Mass Spectrometry
Issue number: 7
Publication status: Published - 1 Jul 2020

Bibliographical note

Correction to article:

