Manipulating the Alpha Level Cannot Cure Significance Testing
When evaluating the strength of the evidence, we should consider auxiliary assumptions, the strength of the experimental design, and implications for applications. To boil all this down to a binary decision based on a p-value threshold is not acceptable.
John Ioannidis discusses the potential effects on clinical research of a 2017 proposal to lower the default p-value threshold for statistical significance from .05 to .005 as a means of reducing false-positive findings.
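To see why a stricter threshold reduces false positives, consider the positive predictive value of a claimed finding: the fraction of "significant" results that reflect real effects. A minimal sketch in Python, using illustrative (assumed) values for statistical power and the prior probability that a tested effect is real:

```python
def ppv(alpha, power, prior):
    """Positive predictive value: the fraction of significant findings
    that are true effects, given the alpha level, the statistical power,
    and the prior probability that a tested effect is real."""
    true_pos = power * prior          # rate of correctly detected effects
    false_pos = alpha * (1 - prior)   # rate of false alarms among nulls
    return true_pos / (true_pos + false_pos)

# Illustrative assumptions: 80% power, 10% of tested effects are real.
print(round(ppv(0.05, 0.8, 0.1), 2))   # alpha = .05  -> 0.64
print(round(ppv(0.005, 0.8, 0.1), 2))  # alpha = .005 -> 0.95
```

Under these assumed numbers, lowering alpha from .05 to .005 raises the share of significant findings that are genuine from roughly two thirds to about 95%, which is the arithmetic behind the proposal, though the gain depends heavily on power and the prior.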
Given that science is the key driver of human progress, improving the efficiency of scientific investigation and yielding more credible and more useful research results can translate to major benefits.
In a profession that rewards productivity in the form of papers and grants, sitting down to deeply read journal articles can feel like wasted time. A professor logs every paper she reads over multiple years to gain insight into her personal research practices.
Poor research design and data analysis encourage false-positive findings. Such poor methods persist despite perennial calls for improvement, suggesting that they result from something more than just misunderstanding.
Reproducibility: Archive computer code with raw data
Software tools such as knitr and R Markdown allow the description and code of a statistical analysis to be combined into a single document, providing a pipeline from the raw data to the final results and figures. Re-running the scripts updates the outputs, and version-control tools such as Git and GitHub track every change along the way.
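The same raw-data-to-results idea can be sketched in plain Python: a single script that regenerates every derived number from the raw data, so the reported results can never drift from the code that produced them. File names and the analysis itself are hypothetical:

```python
import csv
import statistics

def analyze(raw_path="raw_data.csv", out_path="results.txt"):
    """Re-runnable pipeline: read the raw data, compute the summary,
    and write the result file. Running this script again after any
    change to the data or code refreshes every output."""
    with open(raw_path) as f:
        values = [float(row["measurement"]) for row in csv.DictReader(f)]
    summary = f"n={len(values)}, mean={statistics.mean(values):.3f}"
    with open(out_path, "w") as f:
        f.write(summary + "\n")
    return summary
```

Committing this script alongside `raw_data.csv` in a Git repository gives the archival pairing of code and raw data that the rule calls for.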
Ten Simple Rules for Effective Statistical Practice
A list of ten rules written with researchers in mind: researchers who have some knowledge of statistics, possibly with one or more statisticians available in their building, or possibly with a healthy do-it-yourself attitude and a handful of statistical packages on their laptops.