Moment based gene set tests
- PMID: 25928861
- PMCID: PMC4419444
- DOI: 10.1186/s12859-015-0571-7
Abstract
Background: Permutation-based gene set tests are standard approaches for testing relationships between collections of related genes and an outcome of interest in high throughput expression analyses. Using M random permutations, one can attain p-values as small as 1/(M+1). When many gene sets are tested, we need smaller p-values, hence larger M, to achieve significance while accounting for the number of simultaneous tests being made. As a result, the number of permutations to be done rises along with the cost per permutation. To reduce this cost, we seek parametric approximations to the permutation distributions for gene set tests.
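The 1/(M+1) floor and its interaction with multiple testing can be sketched in a few lines. This is an illustrative sketch, not code from the paper: the function names `permutation_p_value` and `min_permutations`, the two-sided comparison, and the use of a Bonferroni threshold are all assumptions made for the example.

```python
import math
import random
from fractions import Fraction


def permutation_p_value(x, y, stat, n_perm=999, seed=0):
    """Monte Carlo permutation p-value (two-sided) for stat(x, y).

    `stat` is any function of the data and the (possibly permuted) outcome.
    """
    rng = random.Random(seed)
    observed = abs(stat(x, y))
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)  # relabel the outcome at random
        if abs(stat(x, y_perm)) >= observed:
            hits += 1
    # The observed labelling counts as one arrangement, so p >= 1/(M+1).
    return (hits + 1) / (n_perm + 1)


def min_permutations(n_tests, alpha="0.05"):
    """Smallest M with 1/(M+1) <= alpha/n_tests (assumed Bonferroni cutoff).

    alpha is passed as decimal text so the arithmetic stays exact.
    """
    return math.ceil(n_tests / Fraction(alpha)) - 1
```

Under these assumptions, testing 1000 gene sets at a family-wise level of 0.05 needs at least `min_permutations(1000)` permutations per gene set before any single p-value can fall below the threshold, which is the cost the moment-based approximation aims to avoid.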
Results: We study two gene set methods based on sums and sums of squared correlations. The statistics we study are among the best performers in the extensive simulation of 261 gene set methods by Ackermann and Strimmer in 2009. Our approach calculates exact relevant moments of these statistics and uses them to fit parametric distributions. The computational cost of our algorithm for the linear case is on the order of doing |G| permutations, where |G| is the number of genes in set G. For the quadratic statistics, the cost is on the order of |G|^2 permutations, which can still be orders of magnitude faster than plain permutation sampling. We applied the permutation approximation method to three public Parkinson's Disease expression datasets and discovered enriched gene sets not previously discussed. We found that the moment-based gene set enrichment p-values closely approximate the permutation method p-values at a tiny fraction of their cost. They also gave nearly identical rankings to the gene sets being compared.
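To make the idea of fitting a distribution to exact permutation moments concrete, here is a minimal sketch for a linear statistic. It is not the npGSEA implementation: the statistic T = sum_i a_i y_i (with a_i the summed expression of the set's genes in sample i), the function names, and the choice of a normal reference distribution are assumptions for illustration. The mean and variance formulas are the standard exact moments of a linear statistic under uniformly random permutation of y.

```python
import math


def linear_stat_moments(expr_rows, y):
    """Observed value and exact permutation mean/variance of T = sum_i a_i * y_i.

    expr_rows: one list of per-sample expression values per gene in the set.
    y: outcome per sample (same length as each row).
    """
    n = len(y)
    # Collapse the gene set to one score per sample.
    a = [sum(row[i] for row in expr_rows) for i in range(n)]
    a_bar = sum(a) / n
    y_bar = sum(y) / n
    s_a = sum((v - a_bar) ** 2 for v in a)
    s_y = sum((v - y_bar) ** 2 for v in y)
    observed = sum(ai * yi for ai, yi in zip(a, y))
    mean = n * a_bar * y_bar          # E[T] over all n! permutations of y
    var = s_a * s_y / (n - 1)         # Var[T] over all n! permutations of y
    return observed, mean, var


def normal_approx_p_value(expr_rows, y):
    """Two-sided p-value from a normal fit to the exact permutation moments."""
    observed, mean, var = linear_stat_moments(expr_rows, y)
    z = (observed - mean) / math.sqrt(var)
    return math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
```

No permutations are drawn here: the moments come from closed-form sums over the data, so the resulting p-value is not bounded below by 1/(M+1). The sketch assumes neither the per-sample scores nor the outcome are constant, since the variance would then be zero.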
Conclusions: We have developed a moment-based approximation to the permutation distribution of linear and quadratic gene set test statistics. This allows approximate testing to be done orders of magnitude faster than one could do by sampling permutations. We have implemented our method as a publicly available Bioconductor package, npGSEA (www.bioconductor.org).