
Deriving and comparing the distribution for the number of false positives in single step methods to control \(k\)-FWER. (English) Zbl 1228.62089

Summary: In a multiple testing setting, the investigator is faced with choosing a method for controlling the Type I error rate. We derive and compare the exact distribution for the number of false positives under a commonly used distribution. The results from this work can be extended to derive the distribution of other error control quantities, while the conclusions from our simulations can be used to power future studies.
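
As a rough illustration of the kind of quantity the summary refers to, the sketch below assumes a setting the summary does not specify: independent tests whose true-null \(p\)-values are uniform on [0, 1], and a single-step procedure that rejects every hypothesis with a \(p\)-value at or below one fixed cutoff. Under those assumptions the number of false positives \(V\) is Binomial(m0, cutoff), and the \(k\)-FWER is \(P(V \ge k)\). The cutoff \(k\alpha/m\) is the generalized Bonferroni choice of Lehmann and Romano [30]. The function names and the values of m, m0, k and alpha are hypothetical and are not taken from the paper.

```python
from math import comb


def false_positive_pmf(m0, cutoff):
    """PMF of V, the number of false positives, assuming V ~ Binomial(m0, cutoff).

    Assumption (not from the paper): the m0 true-null p-values are
    independent and uniform on [0, 1], and a hypothesis is rejected
    whenever its p-value is <= cutoff.
    """
    return [
        comb(m0, v) * cutoff**v * (1 - cutoff) ** (m0 - v)
        for v in range(m0 + 1)
    ]


def k_fwer(m0, cutoff, k):
    """k-FWER = P(V >= k) under the same binomial assumption."""
    return sum(false_positive_pmf(m0, cutoff)[k:])


def generalized_bonferroni_cutoff(m, k, alpha):
    """Single-step cutoff k * alpha / m (generalized Bonferroni)."""
    return k * alpha / m


# Hypothetical configuration: 100 tests, 80 true nulls, control 2-FWER at 5%.
m, m0, k, alpha = 100, 80, 2, 0.05
cutoff = generalized_bonferroni_cutoff(m, k, alpha)
pmf = false_positive_pmf(m0, cutoff)
achieved = k_fwer(m0, cutoff, k)
print(f"cutoff = {cutoff:.4g}, achieved k-FWER = {achieved:.4g}")
```

In this independent setting the achieved \(k\)-FWER can fall well below the nominal level, which shows how conservative a given single-step cutoff may be.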

MSC:

62J15 Paired and multiple comparisons; multiple testing
62E15 Exact distribution theory in statistics
65C60 Computational problems in statistics (MSC2010)

References:

[1] Abdi, H., The Bonferroni and Šidák corrections for multiple comparisons, in: The Encyclopedia of Measurement and Statistics (2007)
[2] Allison, D.; Gadbury, G.; Heo, M.; Fernández, J.; Lee, C.; Prolla, T.; Weindruch, R., A mixture model approach for the analysis of microarray gene expression data, Computational Statistics and Data Analysis, 39, 1, 1-20 (2002) · Zbl 1119.62371
[3] Bapat, R.; Beg, M., Order statistics for nonidentically distributed variables and permanents, Sankhyā: The Indian Journal of Statistics, Series A, 51, 1, 79-93 (1989) · Zbl 0672.62060
[4] Benjamini, Y.; Hochberg, Y., Controlling the false discovery rate: a practical and powerful approach to multiple testing, Journal of the Royal Statistical Society. Series B (Statistical Methodology), 57, 1, 289-300 (1995) · Zbl 0809.62014
[5] Benjamini, Y.; Yekutieli, D., The control of the false discovery rate in multiple testing under dependency, The Annals of Statistics, 29, 4, 1165-1188 (2001) · Zbl 1041.62061
[6] Blenkiron, C.; Goldstein, L.; Thorne, N.; Spiteri, I.; Chin, S.; Dunning, M.; Barbosa-Morais, N.; Teschendorff, A.; Green, A.; Ellis, I., MicroRNA expression profiling of human breast cancer identifies new markers of tumor subtype, Genome Biology, 8, 10, R214 (2007)
[7] Cai, G.; Sarkar, S., Modified Simes’ critical values under positive dependence, Journal of Statistical Planning and Inference, 136, 12, 4129-4146 (2006) · Zbl 1099.62075
[8] Casella, G.; Berger, R., Statistical Inference (2002)
[9] Chen, J.; van der Laan, M.; Smith, M.; Hubbard, A., A comparison of methods to control type I errors in microarray studies, Statistical Applications in Genetics and Molecular Biology, 6, 1 (2007), Article 28 · Zbl 1166.62334
[10] Dudoit, S.; van der Laan, M.; Pollard, K., Multiple testing. Part I. Single-step procedures for control of general type I error rates, Statistical Applications in Genetics and Molecular Biology, 3, 1 (2004), Article 13 · Zbl 1166.62338
[11] Edgar, R.; Domrachev, M.; Lash, A., Gene Expression Omnibus: NCBI gene expression and hybridization array data repository, Nucleic Acids Research, 30, 1, 207-210 (2002)
[12] Efron, B., Correlation and large-scale simultaneous significance testing, Journal of the American Statistical Association, 102, 477, 93-103 (2007) · Zbl 1284.62340
[13] Efron, B.; Tibshirani, R., An Introduction to the Bootstrap (1997), Chapman & Hall
[14] Efron, B.; Tibshirani, R., On testing the significance of sets of genes, Stanford Technical Report (2006) · Zbl 1129.62102
[15] Efron, B.; Tibshirani, R.; Storey, J.; Tusher, V., Empirical Bayes analysis of a microarray experiment, Journal of the American Statistical Association, 96, 456, 1151-1160 (2001) · Zbl 1073.62511
[16] Ge, Y.; Dudoit, S.; Speed, T., Resampling-based multiple testing for microarray data analysis, TEST, 12, 1, 1-77 (2003) · Zbl 1056.62117
[17] Gentleman, R.; Carey, V.; Huber, W.; Dudoit, S.; Irizarry, R., Bioinformatics and Computational Biology Solutions Using \(R\) and Bioconductor (2005), Springer Verlag · Zbl 1142.62100
[18] Gilbert, H.; Pollard, K.; van der Laan, M.; Dudoit, S., Resampling-based multiple hypothesis testing with applications to genomics: new developments in the R/Bioconductor package multtest, Working Paper 249, UC Berkeley Division of Biostatistics Working Paper Series (2009)
[19] Glueck, D.; Karimpour-Fard, A.; Mandel, J.; Hunter, L.; Muller, K., Fast computation by block permanents of cumulative distribution functions of order statistics from several populations, Communications in Statistics—Theory and Methods, 37, 18, 2815-2824 (2008) · Zbl 1292.62027
[20] Glueck, D.; Muller, K.; Karimpour-Fard, A.; Hunter, L., Expected power for the false discovery rate with independence, Communications in Statistics—Theory and Methods, 37, 12, 1855-1866 (2008) · Zbl 1140.62059
[21] Gold, D.; Miecznikowski, J., The realized false discovery rate, Technical Report 1004, University at Buffalo, Department of Biostatistics (2010)
[22] Gold, D.; Miecznikowski, J.; Liu, S., Error control variability in pathway-based microarray analysis, Bioinformatics, 25, 17, 2216-2221 (2009)
[23] Guo, W., A note on adaptive Bonferroni and Holm procedures under dependence, Biometrika, 96, 4, 1012-1018 (2009) · Zbl 1186.62097
[24] Guo, W.; Romano, J., A generalized Sidak-Holm procedure and control of generalized error rates under independence, Statistical Applications in Genetics and Molecular Biology, 6, 1 (2007), Article 3 · Zbl 1166.62316
[25] Hochberg, Y.; Tamhane, A., Multiple Comparison Procedures (1987), Wiley, New York · Zbl 0731.62125
[26] Holm, S., A simple sequentially rejective multiple test procedure, Scandinavian Journal of Statistics, 6, 2, 65-70 (1979) · Zbl 0402.62058
[27] Lazarou, J.; Pomeranz, B.; Corey, P., Incidence of adverse drug reactions in hospitalized patients: a meta-analysis of prospective studies, Journal of the American Medical Association, 279, 15, 1200-1205 (1998)
[28] Leek, J.; Storey, J., Capturing heterogeneity in gene expression studies by surrogate variable analysis, PLoS Genetics, 3, 9, e161 (2007)
[29] Lehmann, E., Testing Statistical Hypotheses (1997), Springer Verlag · Zbl 0862.62020
[30] Lehmann, E.; Romano, J., Generalizations of the familywise error rate, The Annals of Statistics, 33, 3, 1138-1154 (2005) · Zbl 1072.62060
[31] Mecham, B.; Nelson, P.; Storey, J., Supervised normalization of microarrays, Bioinformatics, 26, 10, 1308-1315 (2010)
[32] Miller, R., Simultaneous Statistical Inference (1981), Springer, New York · Zbl 0463.62002
[33] Pollard, K. S.; Gilbert, H. N.; Ge, Y.; Taylor, S.; Dudoit, S., Multtest: resampling-based multiple hypothesis testing, R package version 2.3.2 (2010)
[34] Pounds, S.; Morris, S., Estimating the occurrence of false positives and false negatives in microarray studies by approximating and partitioning the empirical distribution of \(p\)-values, Bioinformatics, 19, 10, 1236-1242 (2003)
[35] Romano, J.; Shaikh, A.; Wolf, M., Control of the false discovery rate under dependence using the bootstrap and subsampling, TEST, 17, 3, 417-442 (2008) · Zbl 1367.62233
[36] Romano, J.; Shaikh, A.; Wolf, M., Rejoinder on: control of the false discovery rate under dependence using the bootstrap and subsampling, TEST, 17, 3, 461-471 (2008) · Zbl 1367.62234
[37] Roquain, E.; Villers, F., Exact calculations for false discovery proportion with application to least favorable configurations, The Annals of Statistics, 39, 1, 584-612 (2011) · Zbl 1209.62164
[38] Rubin, D.; Dudoit, S.; van der Laan, M., A method to increase the power of multiple testing procedures through sample splitting, Statistical Applications in Genetics and Molecular Biology, 5, 1 (2006), Article 19 · Zbl 1166.62318
[39] Sarkar, S., Some probability inequalities for ordered MTP2 random variables: a proof of the Simes conjecture, The Annals of Statistics, 26, 2, 494-504 (1998) · Zbl 0929.62065
[40] Sarkar, S., Some results on false discovery rate in stepwise multiple testing procedures, The Annals of Statistics, 30, 1, 239-257 (2002) · Zbl 1101.62349
[41] Sarkar, S.; Guo, W., On a generalized false discovery rate, The Annals of Statistics, 37, 3, 1545-1565 (2009) · Zbl 1161.62041
[42] Šidák, Z., Rectangular confidence regions for the means of multivariate normal distributions, Journal of the American Statistical Association, 62, 318, 626-633 (1967) · Zbl 0158.17705
[43] Storey, J., A direct approach to false discovery rates, Journal of the Royal Statistical Society, Series B (Statistical Methodology), 64, 3, 479-498 (2002) · Zbl 1090.62073
[44] Storey, J.; Tibshirani, R., Statistical significance for genomewide studies, Proceedings of the National Academy of Sciences, 100, 16, 9440-9445 (2003) · Zbl 1130.62385
[45] Subramanian, A.; Tamayo, P.; Mootha, V.; Mukherjee, S.; Ebert, B.; Gillette, M.; Paulovich, A.; Pomeroy, S.; Golub, T.; Lander, E., Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles, Proceedings of the National Academy of Sciences of the United States of America, 102, 43, 15545-15550 (2005)
[46] van der Laan, M.; Dudoit, S.; Pollard, K., Multiple testing. Part II. Step-down procedures for control of the family-wise error rate, Statistical Applications in Genetics and Molecular Biology, 3, 1 (2004), Article 14 · Zbl 1166.62378
[47] Wasserman, L., All of Statistics: A Concise Course in Statistical Inference (2004), Springer Verlag · Zbl 1053.62005
[48] Westfall, P.; Young, S., Resampling-Based Multiple Testing: Examples and Methods for \(p\)-value Adjustment (1993), Wiley-Interscience