Testing covariates in high dimension linear regression with latent factors. (English) Zbl 1328.62097

Summary: We propose an \(F\)-test and a \(z\)-test (or \(t\)-test) for testing, respectively, the global significance and the individual effect of each single predictor in a high-dimensional regression model when the explanatory variables follow a latent factor structure (H. Wang [Biometrika 99, No. 1, 15–28 (2012; Zbl 1234.62108)]). Under the null hypothesis, together with fairly mild conditions on the explanatory variables and latent factors, we show that the proposed \(F\)-test and \(t\)-test statistics are asymptotically distributed as a weighted chi-square and a standard normal distribution, respectively. This leads to quite different test statistics and inference procedures compared with those of P.-S. Zhong and S. X. Chen [J. Am. Stat. Assoc. 106, No. 493, 260–274 (2011; Zbl 1396.62110)] for weakly dependent explanatory variables. Moreover, based on the \(p\)-value of each predictor, the method of J. D. Storey et al. [J. R. Stat. Soc., Ser. B, Stat. Methodol. 66, No. 1, 187–205 (2004; Zbl 1061.62110)] can be used to implement the multiple testing procedure, and consistent model selection is achieved provided the threshold value is selected appropriately. All the results are further supported by extensive Monte Carlo simulation studies. The practical utility of the two proposed tests is illustrated via a real data example on index fund tracking in the Chinese stock market.
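
The multiple testing step mentioned in the summary can be illustrated with a minimal sketch of a Storey-type adaptive thresholding rule applied to a list of per-predictor \(p\)-values. This is not the authors' implementation: the function names `storey_pi0` and `storey_reject`, the tuning value `lam = 0.5`, the capping of the null-proportion estimate at 1, and the example \(p\)-values are all assumptions made for illustration.

```python
from typing import List, Sequence


def storey_pi0(pvalues: Sequence[float], lam: float = 0.5) -> float:
    """Conservative estimate of the proportion of true nulls.

    Uses (#{p_i > lam} + 1) / (m * (1 - lam)), capped at 1 in this sketch.
    """
    m = len(pvalues)
    w = sum(1 for p in pvalues if p > lam)
    return min(1.0, (w + 1) / (m * (1.0 - lam)))


def storey_reject(
    pvalues: Sequence[float], alpha: float = 0.05, lam: float = 0.5
) -> List[int]:
    """Return the sorted indices of predictors declared significant.

    Step-up rule: find the largest rank k with p_(k) <= lam and
    pi0 * m * p_(k) / k <= alpha, then reject the k smallest p-values.
    """
    m = len(pvalues)
    if m == 0:
        return []
    pi0 = storey_pi0(pvalues, lam)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k_star = 0
    for rank, idx in enumerate(order, start=1):
        p = pvalues[idx]
        if p > lam:
            break  # thresholds above lam are not considered
        if pi0 * m * p / rank <= alpha:
            k_star = rank
    return sorted(order[:k_star])


if __name__ == "__main__":
    # Hypothetical p-values, one per predictor (illustrative only).
    example = [0.0001, 0.0004, 0.002, 0.6, 0.7, 0.8, 0.9, 0.35, 0.5, 0.95]
    print(storey_reject(example, alpha=0.05, lam=0.5))
```

The returned indices play the role of the selected model; a smaller `alpha` gives a stricter threshold and hence a sparser selection.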

MSC:

62F03 Parametric hypothesis testing

Software:

pcalg
Full Text: DOI

References:

[1] Bai, J., Inferential theory for factor models of large dimensions, Econometrica, 71, 135-171 (2003) · Zbl 1136.62354
[2] Bai, J.; Ng, S., Determining the number of factors in approximate factor models, Econometrica, 70, 191-221 (2002) · Zbl 1103.91399
[3] Benjamini, Y.; Hochberg, Y., Controlling the false discovery rate: A practical and powerful approach to multiple testing, J. R. Stat. Soc. Ser. B, 57, 289-300 (1995) · Zbl 0809.62014
[4] Fama, E. F.; French, K. R., Common risk factors in the return on stocks and bonds, J. Financ. Econ., 33, 3-56 (1993) · Zbl 1131.91335
[5] Fan, J.; Han, X.; Gu, W., Estimating false discovery proportion under arbitrary covariance dependence, J. Amer. Statist. Assoc., 107, 1019-1035 (2012) · Zbl 1395.62219
[6] Fan, J.; Liao, Y.; Mincheva, M., High dimensional covariance matrix estimation in approximate factor models, Ann. Statist., 39, 3320-3356 (2011) · Zbl 1246.62151
[7] Fan, J.; Lv, J., Sure independence screening for ultra-high dimensional feature space (with discussion), J. R. Stat. Soc. Ser. B, 70, 849-911 (2008) · Zbl 1411.62187
[8] Fan, J.; Song, R., Sure independence screening in generalized linear models with NP-dimensionality, Ann. Statist., 38, 3567-3604 (2010) · Zbl 1206.68157
[9] Goeman, J.; van Houwelingen, H. C.; Finos, L., Testing against a high dimensional alternative in the generalized linear model: asymptotic type I error control, Biometrika, 98, 381-390 (2011) · Zbl 1215.62068
[10] Härdle, W.; Liang, H.; Gao, J., Partially Linear Models (2000), Springer: Springer Heidelberg · Zbl 0968.62006
[11] Kalisch, M.; Bühlmann, P., Estimating high-dimensional directed acyclic graphs with the PC-algorithm, J. Mach. Learn. Res., 8, 613-636 (2007) · Zbl 1222.68229
[12] Lan, W.; Wang, H.; Tsai, C. L., Testing covariates in high dimensional regression, Ann. Inst. Statist. Math., 66, 279-301 (2014) · Zbl 1334.62113
[14] Lyons, R., Strong laws of large numbers for weakly correlated random variables, Michigan Math. J., 35, 353-359 (1988) · Zbl 0684.60025
[15] Meinshausen, N.; Meier, L.; Bühlmann, P., \(P\)-values for high-dimensional regression, J. Amer. Statist. Assoc., 104, 1671-1681 (2009) · Zbl 1205.62089
[16] Sharpe, W. F., Capital asset prices: A theory of market equilibrium under conditions of risk, J. Finance, 19, 425-442 (1964)
[17] Storey, J. D.; Taylor, J. E.; Siegmund, D., Strong control, conservative point estimation and simultaneous conservative consistency of false discovery rates: a unified approach, J. R. Stat. Soc. Ser. B, 66, 187-205 (2004) · Zbl 1061.62110
[18] Wang, H., Forward regression for ultra-high dimensional variable screening, J. Amer. Statist. Assoc., 104, 1512-1524 (2009) · Zbl 1205.62103
[19] Wang, H., Factor profiled independence screening, Biometrika, 99, 15-28 (2012) · Zbl 1234.62108
[20] Wasserman, L.; Roeder, K., High dimensional variable selection, Ann. Statist., 37, 2178-2201 (2009) · Zbl 1173.62054
[21] Zhang, C. H.; Zhang, S. S., Confidence intervals for low dimensional parameters in high dimensional linear models, J. R. Stat. Soc. Ser. B, 76, 217-242 (2014) · Zbl 1411.62196
[22] Zhong, P. S.; Chen, S. X., Tests for high dimensional regression coefficients with factorial designs, J. Amer. Statist. Assoc., 106, 260-274 (2011) · Zbl 1396.62110
[23] Zhong, P. S.; Chen, S. X.; Xu, M., Tests alternative to higher criticism for high dimensional means under sparsity and column-wise dependence, Ann. Statist., 41, 2820-2851 (2013) · Zbl 1294.62128