Permutation P-values should never be zero: calculating exact P-values when permutations are randomly drawn
- PMID: 21044043
- DOI: 10.2202/1544-6115.1585
Abstract
Permutation tests are among the most commonly used statistical tools in modern genomic research. They attach p-values to a test statistic by randomly permuting the sample or gene labels. Yet permutation p-values published in the genomic literature are often computed incorrectly, being understated by about 1/m, where m is the number of permutations. The same is often true in the more general situation in which Monte Carlo simulation is used to assign p-values. Although the understatement is usually small in absolute terms, the implications can be serious in a multiple testing context. The understatement arises from the intuitive but mistaken idea of using permutation to estimate the tail probability of the test statistic. We argue instead that permutation should be viewed as generating an exact discrete null distribution. The relevant literature, some of which is likely to have been relatively inaccessible to the genomic community, is reviewed and summarized. A computation strategy is developed for exact p-values when permutations are randomly drawn. The strategy is valid for any number of permutations and samples. Some simple recommendations are made for the implementation of permutation tests in practice.
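The abstract does not give a formula, so the following is only a minimal sketch of the understatement it describes. It assumes a two-sided test on the difference in group means and compares the tail-probability estimate b/m with the commonly used correction (b + 1)/(m + 1), where b is the number of random permutations at least as extreme as the observed statistic. The function name `permutation_pvalues`, the test statistic, and the toy data are hypothetical choices for illustration. The paper's exact calculation for randomly drawn permutations is not reproduced here.

```python
import random
from statistics import mean


def permutation_pvalues(x, y, m=999, seed=0):
    """Two-sided permutation test on the difference in group means.

    Returns (b, naive, corrected), where b counts the random permutations
    whose statistic is at least as extreme as the observed one.
    Illustrative sketch only; not the exact method developed in the paper.
    """
    rng = random.Random(seed)
    observed = abs(mean(x) - mean(y))
    pooled = list(x) + list(y)
    n_x = len(x)
    b = 0
    for _ in range(m):
        rng.shuffle(pooled)  # random relabelling of the samples
        stat = abs(mean(pooled[:n_x]) - mean(pooled[n_x:]))
        if stat >= observed:
            b += 1
    naive = b / m                  # tail-probability estimate; can be exactly 0
    corrected = (b + 1) / (m + 1)  # counts the observed labelling; never 0
    return b, naive, corrected


# Toy data (made up for illustration): two clearly separated groups.
x = [10.1, 11.3, 12.0, 10.8, 11.6, 12.4]
y = [1.2, 0.8, 1.9, 1.1, 0.5, 1.4]
b, naive, corrected = permutation_pvalues(x, y, m=999, seed=1)
print(b, naive, corrected)
```

Under this correction the smallest attainable p-value is 1/(m + 1), so with m = 999 no result can fall below 0.001. The gap between the two values is (m - b)/(m(m + 1)), which is close to 1/m when b is small, in line with the understatement the abstract describes.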
Similar articles
- Moment based gene set tests. BMC Bioinformatics. 2015 Apr 28;16:132. doi: 10.1186/s12859-015-0571-7. PMID: 25928861. Free PMC article.
- Estimation of false discovery rate using sequential permutation p-values. Biometrics. 2013 Mar;69(1):1-7. doi: 10.1111/j.1541-0420.2012.01825.x. Epub 2013 Feb 4. PMID: 23379645.
- Fast approximation of small p-values in permutation tests by partitioning the permutations. Biometrics. 2018 Mar;74(1):196-206. doi: 10.1111/biom.12731. Epub 2017 May 18. PMID: 29542118.
- Advantages of permutation (randomization) tests in clinical and experimental pharmacology and physiology. Clin Exp Pharmacol Physiol. 1994 Sep;21(9):673-86. doi: 10.1111/j.1440-1681.1994.tb02570.x. PMID: 7820947. Review.
- Functional genomics and proteomics in the clinical neurosciences: data mining and bioinformatics. Prog Brain Res. 2006;158:83-108. doi: 10.1016/S0079-6123(06)58004-5. PMID: 17027692. Review.
Cited by
- Individual Burden of Illness Index in Bipolar Disorder Remission: A Cross-Sectional Study. Consort Psychiatr. 2024 Jul 6;5(2):17-30. doi: 10.17816/CP15471. eCollection 2024. PMID: 39072003. Free PMC article.
- Domain general frontoparietal regions show modality-dependent coding of auditory and visual rules. bioRxiv [Preprint]. 2024 Mar 7:2024.03.04.583318. doi: 10.1101/2024.03.04.583318. PMID: 38903119. Free PMC article. Preprint.
- A simple and flexible test of sample exchangeability with applications to statistical genomics. Ann Appl Stat. 2024 Mar;18(1):858-881. doi: 10.1214/23-aoas1817. Epub 2024 Jan 31. PMID: 38784669. Free PMC article.
- Assessment of Gene Set Enrichment Analysis using curated RNA-seq-based benchmarks. PLoS One. 2024 May 16;19(5):e0302696. doi: 10.1371/journal.pone.0302696. eCollection 2024. PMID: 38753612. Free PMC article.
- Demographic bias in misdiagnosis by computational pathology models. Nat Med. 2024 Apr;30(4):1174-1190. doi: 10.1038/s41591-024-02885-z. Epub 2024 Apr 19. PMID: 38641744.