Detecting and avoiding likely false-positive findings - a practical guide

Wolfgang Forstmeier*, Eric-Jan Wagenmakers, Timothy H. Parker

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

270 Citations (Scopus)
47 Downloads (Pure)

Abstract

Recently there has been a growing concern that many published research findings do not hold up in attempts to replicate them. We argue that this problem may originate from a culture of 'you can publish if you found a significant effect'. This culture creates a systematic bias against the null hypothesis which renders meta-analyses questionable and may even lead to a situation where hypotheses become difficult to falsify. In order to pinpoint the sources of error and possible solutions, we review current scientific practices with regard to their effect on the probability of drawing a false-positive conclusion. We explain why the proportion of published false-positive findings is expected to increase with (i) decreasing sample size, (ii) increasing pursuit of novelty, (iii) various forms of multiple testing and researcher flexibility, and (iv) incorrect P-values, especially due to unaccounted pseudoreplication, i.e. the non-independence of data points (clustered data). We provide examples showing how statistical pitfalls and psychological traps lead to conclusions that are biased and unreliable, and we show how these mistakes can be avoided. Ultimately, we hope to contribute to a culture of 'you can publish if your study is rigorous'. To this end, we highlight promising strategies towards making science more objective. Specifically, we enthusiastically encourage scientists to preregister their studies (including a priori hypotheses and complete analysis plans), to blind observers to treatment groups during data collection and analysis, and unconditionally to report all results. Also, we advocate reallocating some efforts away from seeking novelty and discovery and towards replicating important research findings of one's own and of others for the benefit of the scientific community as a whole. 
We believe these efforts will be aided by a shift in evaluation criteria away from the current system which values metrics of 'impact' almost exclusively and towards a system which explicitly values indices of scientific rigour.
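
The dependence on sample size, novelty and multiple testing described in the abstract can be illustrated with a short calculation. The sketch below is not taken from the article. It assumes a common simplification in which a fraction `prior` of tested hypotheses is true, each test has statistical power `power` and significance threshold `alpha`, and multiple tests are independent. The function names and the example values are hypothetical.

```python
def false_positive_share(prior, power, alpha=0.05):
    """Expected share of significant results that are false positives.

    Assumed inputs (illustrative, not from the article):
      prior - fraction of tested hypotheses that are actually true
      power - probability that a test detects a true effect
      alpha - significance threshold (Type I error rate per test)
    """
    false_positives = alpha * (1.0 - prior)  # true nulls that reach significance
    true_positives = power * prior           # real effects that reach significance
    return false_positives / (false_positives + true_positives)


def familywise_error(alpha, n_tests):
    """Chance of at least one false positive in n independent tests of true nulls."""
    return 1.0 - (1.0 - alpha) ** n_tests


if __name__ == "__main__":
    # Plausible hypothesis, well-powered study.
    print(round(false_positive_share(prior=0.5, power=0.8), 3))
    # Novel (unlikely) hypothesis, small sample and hence low power.
    print(round(false_positive_share(prior=0.1, power=0.2), 3))
    # Twenty independent tests at alpha = 0.05 with no true effect anywhere.
    print(round(familywise_error(alpha=0.05, n_tests=20), 3))
```

Under these assumptions, lowering either `power` (smaller samples) or `prior` (more speculative hypotheses) raises the expected share of false positives among significant results, and the chance of at least one spurious result grows quickly with the number of tests. Pseudoreplication is not modelled here; it would act by making the real `alpha` larger than the nominal one.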

Original language: English
Pages (from-to): 1941-1968
Number of pages: 28
Journal: Biological Reviews
Volume: 92
Issue number: 4
DOIs
Publication status: Published - Nov 2017
Externally published: Yes

Bibliographical note

Copyright the Author(s) 2016. Version archived for private and non-commercial use with the permission of the author/s and according to publisher conditions. For further rights please contact the publisher.

Keywords

  • confirmation bias
  • HARKing
  • hindsight bias
  • overfitting
  • P-hacking
  • power
  • preregistration
  • replication
  • researcher degrees of freedom
  • Type I error
