Fisher’s combined probability test for high-dimensional covariance matrices. (English) Zbl 07820400

Summary: Testing large covariance matrices is of fundamental importance in statistical analysis with high-dimensional data. In the past decade, three types of test statistics have been studied in the literature: quadratic form statistics, maximum form statistics, and their weighted combination. It is known that quadratic form statistics suffer from low power against sparse alternatives, while maximum form statistics suffer from low power against dense alternatives. The weighted combination methods were introduced to enhance the power of quadratic form statistics or maximum form statistics when the weights are appropriately chosen. In this article, we provide a new perspective to exploit the full potential of quadratic form statistics and maximum form statistics for testing high-dimensional covariance matrices. We propose a scale-invariant power-enhanced test based on Fisher’s method to combine the \(p\)-values of quadratic form statistics and maximum form statistics. After carefully studying the asymptotic joint distribution of quadratic form statistics and maximum form statistics, we first prove that the proposed combination method retains the correct asymptotic size under the Gaussian assumption. We then derive a new Lyapunov-type bound for the joint distribution and prove the correct asymptotic size of the proposed method without requiring the Gaussian assumption. Moreover, we show that the proposed method boosts the asymptotic power against more general alternatives. Finally, we demonstrate the finite-sample performance in simulation studies and a real application. Supplementary materials for this article are available online.
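
For orientation, the generic form of Fisher's method referred to in the summary can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `fisher_combine` and the example p-values are hypothetical, and the sketch assumes the input p-values may be treated as independent under the null hypothesis, in which case the statistic \(-2\sum_i \log p_i\) over \(k\) p-values follows a chi-square distribution with \(2k\) degrees of freedom. The paper's own construction of the quadratic form and maximum form p-values is not reproduced here.

```python
import math


def fisher_combine(p_values):
    """Combine p-values with Fisher's method (hypothetical helper).

    Assumes the p-values are independent under the null, so that
    T = -2 * sum(log p_i) is chi-square with 2k degrees of freedom.
    Returns (T, combined p-value).
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    for p in p_values:
        if not 0.0 < p <= 1.0:
            raise ValueError("each p-value must lie in (0, 1]")

    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # this is T / 2

    # For even degrees of freedom 2k the chi-square survival function
    # has the closed form exp(-T/2) * sum_{i<k} (T/2)^i / i!.
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term

    return 2.0 * half_stat, math.exp(-half_stat) * total


if __name__ == "__main__":
    # Illustrative inputs only: a p-value from a quadratic form test
    # and a p-value from a maximum form test.
    stat, p_combined = fisher_combine([0.04, 0.60])
    print(f"Fisher statistic = {stat:.4f}, combined p-value = {p_combined:.4f}")
```

With two p-values the combined p-value reduces to \(p_1 p_2 (1 - \log(p_1 p_2))\), so a single very small p-value from either test is enough to make the combination small, which is the intuition behind combining a test that is sensitive to dense alternatives with one that is sensitive to sparse alternatives.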

MSC:

62-XX Statistics

Software:

ACAT; HDtest
