High-dimensional latent panel quantile regression with an application to asset pricing. (English) Zbl 1539.62096

Summary: We propose a generalization of the linear panel quantile regression model that accommodates both a sparse and a dense part. The sparse part means that, although the number of available covariates is large, potentially only a much smaller number of them have a nonzero impact on each conditional quantile of the response variable; the dense part is represented by a low-rank matrix that can be approximated by latent factors and their loadings. Such a structure poses problems for traditional sparse estimators, such as \({\ell_1}\)-penalized quantile regression, and for traditional latent factor estimators, such as PCA. We propose a new estimation procedure, based on the ADMM algorithm, that combines the quantile loss function with \({\ell_1}\) and nuclear norm regularization. We show, under general conditions, that our estimator consistently estimates both the nonzero coefficients of the covariates and the latent low-rank matrix. This is done in a challenging setting that allows for temporal dependence, heavy-tailed distributions and the presence of latent factors.
Our proposed model has a “Characteristics + Latent Factors” Quantile Asset Pricing Model interpretation: we apply our model and estimator to a large-dimensional panel of financial data and find that (i) characteristics have sparser predictive power once latent factors are controlled for and (ii) the factors and coefficients at upper and lower quantiles differ from those at the median.
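
For orientation, the ingredients named in the summary (quantile loss, \({\ell_1}\) penalty, nuclear norm penalty) can be sketched in Python with NumPy. This is only an illustrative sketch, not the authors' implementation: the function names, the array shapes, the penalty weights `lam1` and `lam2`, and the averaging of the loss over the panel are all assumptions made here. The three proximal maps are the standard closed-form updates that an ADMM-type scheme for such an objective would typically alternate between.

```python
import numpy as np

def check_loss(u, tau):
    """Quantile (check) loss rho_tau(u) = u * (tau - 1{u < 0}), elementwise."""
    u = np.asarray(u, dtype=float)
    return u * (tau - (u < 0))

def soft_threshold(x, lam):
    """Proximal map of lam * ||x||_1: shrink each entry toward zero by lam."""
    x = np.asarray(x, dtype=float)
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def svt(M, lam):
    """Proximal map of lam * ||M||_* (singular value thresholding)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - lam, 0.0)) @ Vt

def prox_check_loss(v, tau, rho):
    """Elementwise argmin over u of rho_tau(u) + (rho / 2) * (u - v)**2."""
    v = np.asarray(v, dtype=float)
    return np.maximum(v - tau / rho, 0.0) + np.minimum(v + (1.0 - tau) / rho, 0.0)

def objective(Y, X, beta, L, tau, lam1, lam2):
    """Penalized objective (assumed form, for illustration only).

    Y: (n, T) responses; X: (n, T, p) covariates; beta: (p,) sparse part;
    L: (n, T) low-rank part. lam1, lam2 are hypothetical tuning parameters.
    """
    resid = Y - X @ beta - L
    return (check_loss(resid, tau).mean()
            + lam1 * np.abs(beta).sum()
            + lam2 * np.linalg.norm(L, "nuc"))
```

In such a scheme, `soft_threshold` would act on the coefficient vector, `svt` on the low-rank matrix, and `prox_check_loss` on the residuals; how the paper's algorithm splits and orders these updates is not stated in the summary.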

MSC:

62G08 Nonparametric regression and quantile regression
62H12 Estimation in multivariate analysis
62J05 Linear regression; mixed models
62P05 Applications of statistics to actuarial sciences and financial mathematics
