\(F\)-test and \(z\)-test for high-dimensional regression models with a factor structure. (English) Zbl 07602438

Summary: The classic \(F\)-test and \(z\)-test can fail for high-dimensional regression models. This paper addresses this problem, especially for the case where the covariates contain a latent factor structure. We first use a new technique, the cross-section averages (CSA) of covariates, to estimate the latent factors. We then develop two \(F\)-type tests, namely, the Wald test and the \(F\)-test, to assess the overall significance of the covariates. If the covariates are found to be jointly significant, we then carry out a CSA-based \(z\)-test to test the significance of the covariates sequentially, one at a time. Compared with existing approaches in the literature, which often use principal component analysis (PCA) to estimate the latent factors, the new tests depend neither on accurate estimation of the unknown degrees of freedom nor on the acquisition of unknown eigenvalues. They can therefore reduce the uncertainty arising from the estimation of unknown quantities. We show the power and model selection consistency of these tests and propose a follow-up ratio-type test to further control the model size. Numerical simulations and a real data analysis show the competitive performance of these CSA-based tests.
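
The first step of the summary, using cross-section averages of the covariates as a proxy for the latent factors, can be illustrated with a minimal sketch. It is not the paper's procedure: the single-factor design, the unweighted average, the nonzero-mean loadings, and the helper names `csa_factor_proxy` and `project_out` are all assumptions made for illustration, and the \(F\)-type and \(z\)-tests themselves are not reproduced here.

```python
import numpy as np


def csa_factor_proxy(X):
    """Unweighted cross-section average of the covariates (one value per observation)."""
    return X.mean(axis=1, keepdims=True)


def project_out(A, H):
    """Residuals of each column of A after least-squares projection on [1, H]."""
    Z = np.column_stack([np.ones(len(H)), H])
    coef, *_ = np.linalg.lstsq(Z, A, rcond=None)
    return A - Z @ coef


# Simulated data (assumption): one latent factor, loadings with nonzero mean.
rng = np.random.default_rng(0)
n, p = 200, 500
f = rng.standard_normal((n, 1))                      # latent factor, unobserved in practice
loadings = 1.0 + 0.5 * rng.standard_normal((p, 1))   # nonzero-mean loadings
X = f @ loadings.T + rng.standard_normal((n, p))     # covariates with a factor structure

f_hat = csa_factor_proxy(X)                          # CSA proxy for the factor
corr = abs(np.corrcoef(f_hat.ravel(), f.ravel())[0, 1])
X_adj = project_out(X, f_hat)                        # factor-adjusted covariates

print(f"|corr(CSA proxy, true factor)| = {corr:.3f}")
```

Unlike a PCA-based estimate, this proxy requires no eigen-decomposition. In this toy design it works because the loadings do not average out to zero across covariates; with mean-zero loadings the plain average would carry little information about the factor.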

MSC:

62-XX Statistics
62F03 Parametric hypothesis testing

Software:

FAMT

References:

[1] Zhong, PS; Chen, SX., Tests for high-dimensional regression coefficients with factorial designs, J Am Stat Assoc, 106, 260-274 (2011) · Zbl 1396.62110
[2] Lan, W.; Wang, H.; Tsai, CL., Testing covariates in high-dimensional regression, Ann Inst Stat Math, 66, 279-301 (2014) · Zbl 1334.62113
[3] Goeman, JJ; Van De Geer, SA; Van Houwelingen, HC., Testing against a high-dimensional alternative in the generalized linear model: asymptotic type I error control, Biometrika, 98, 381-390 (2011) · Zbl 1215.62068
[4] Goeman, JJ; Van De Geer, SA; Van Houwelingen, HC., Testing against a high dimensional alternative, J R Stat Soc Ser B, 68, 477-493 (2006) · Zbl 1110.62002
[5] Cho, H.; Fryzlewicz, P., High dimensional variable selection via tilting, J R Stat Soc Ser B, 74, 593-622 (2012) · Zbl 1411.62183
[6] Lan, W.; Zhong, PS; Li, R., Testing a single regression coefficient in high dimensional linear models, J Econometrics, 195, 154-168 (2016) · Zbl 1443.62198
[7] Lan, W.; Ding, Y.; Fang, Z., Testing covariates in high dimension linear regression with latent factors, J Multivar Anal, 144, 25-37 (2016) · Zbl 1328.62097
[8] Wang, H., Factor profiled sure independence screening, Biometrika, 99, 15-28 (2012) · Zbl 1234.62108
[9] Fan, J.; Lv, J., Sure independence screening for ultrahigh dimensional feature space, J R Stat Soc Ser B, 70, 849-911 (2008) · Zbl 1411.62187
[10] Lan, W.; Du, L., A factor-adjusted multiple testing procedure with application to mutual fund selection, J Bus Econ Stat, 37, 147-157 (2019)
[11] Friguet, C.; Kloareg, M.; Causeur, D., A factor model approach to multiple testing under dependence, J Am Stat Assoc, 104, 1406-1415 (2009) · Zbl 1205.62071
[12] Fan, J.; Ke, Y.; Wang, K., Factor-adjusted regularized model selection, J Econometrics, 216, 71-85 (2020) · Zbl 1456.62114
[13] Ahn, SC; Horenstein, AR., Eigenvalue ratio test for the number of factors, Econometrica, 81, 1203-1227 (2013) · Zbl 1274.62403
[14] Alessi, L.; Barigozzi, M.; Capasso, M., Improved penalization for determining the number of factors in approximate factor models, Stat Probab Lett, 80, 1806-1813 (2010) · Zbl 1202.62081
[15] Bai, J.; Ng, S., Determining the number of factors in approximate factor models, Econometrica, 70, 191-221 (2002) · Zbl 1103.91399
[16] Chen, M.; Tan, X.; Wu, J., Time varying factor models with possibly strongly correlated noises, J Appl Stat, 48, 887-906 (2021) · Zbl 1521.62282
[17] Wu, J., Robust determination for the number of common factors in the approximate factor models, Econom Lett, 144, 102-106 (2016) · Zbl 1398.62378
[18] Pesaran, MH., Estimation and inference in large heterogeneous panels with a multifactor error structure, Econometrica, 74, 967-1012 (2006) · Zbl 1152.91718
[19] Chen, M.; Yan, J., Unbiased CCE estimator for interactive fixed effects panels, Econom Lett, 175, 1-4 (2019) · Zbl 1410.62080
[20] Chudik, A.; Pesaran, MH., Common correlated effects estimation of heterogeneous dynamic panel data models with weakly exogenous regressors, J Econometrics, 188, 393-420 (2015) · Zbl 1337.62354
[21] Karabiyik, H.; Reese, S.; Westerlund, J., On the role of the rank condition in CCE estimation of factor-augmented panel regressions, J Econometrics, 197, 60-64 (2017) · Zbl 1443.62474
[22] Westerlund, J.; Urbain, JP., Cross-sectional averages versus principal components, J Econometrics, 185, 372-377 (2015) · Zbl 1331.62488
[23] Zhang, CH; Zhang, SS., Confidence intervals for low dimensional parameters in high dimensional linear models, J R Stat Soc Ser B, 76, 217-242 (2014) · Zbl 1411.62196
[24] Bailey, N.; Holly, S.; Pesaran, MH., A two-stage approach to spatio-temporal analysis with strong and weak cross-sectional dependence, J Appl Econometrics, 31, 249-280 (2016)
[25] Chen, M., A self-reliant projected information criterion for the number of factors, Commun Stat Theory Methods, 49, 2466-2484 (2020) · Zbl 1511.62131
[26] Chen, M., Tests for the explanatory power of latent factors, Statist Papers, 62, 2825-2856 (2021) · Zbl 1483.62148
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.