Fixed effects testing in high-dimensional linear mixed models. (English) Zbl 1452.62491

Summary: Many scientific and engineering challenges – ranging from pharmacokinetic drug dosage allocation and personalized medicine to marketing mix (4Ps) recommendations – require an understanding of the unobserved heterogeneity in order to develop the best decision-making processes. In this article, we develop a hypothesis test and the corresponding \(p\)-value for testing the significance of the homogeneous structure in linear mixed models. A robust moment-matching construction is used to create a test that adapts to the size of the model sparsity. When unobserved heterogeneity at a cluster level is constant, we show that our test is both consistent and unbiased even when the dimension of the model is extremely high. Our theoretical results rely on a new family of adaptive sparse estimators of the fixed effects that do not require consistent estimation of the random effects. Moreover, our inference results do not require consistent model selection. We show that moment matching can be extended to nonlinear mixed effects models and to generalized linear mixed effects models. In numerical and real data experiments, we find that the developed method is extremely accurate, that it adapts to the size of the underlying model, and that it is decidedly powerful in the presence of irrelevant covariates.
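
As a rough illustration of the setting in the summary, the sketch below simulates a low-dimensional linear mixed model in which the unobserved heterogeneity is constant within each cluster (a random intercept) and recovers the fixed effects without ever estimating the random effects. It uses classical within-cluster centering followed by least squares, which is not the moment-matching construction of the paper and does not address the high-dimensional sparse regime. The function names `simulate_lmm` and `within_cluster_ols`, the sample sizes, and the variance values are all assumptions made for this example.

```python
import numpy as np


def simulate_lmm(n_clusters, cluster_size, beta, sigma_b, sigma_eps, rng):
    """Simulate y_ij = x_ij' beta + b_i + eps_ij with a cluster-level intercept b_i."""
    p = beta.shape[0]
    X = rng.normal(size=(n_clusters, cluster_size, p))
    # One random effect per cluster, constant across observations in that cluster.
    b = rng.normal(scale=sigma_b, size=(n_clusters, 1))
    eps = rng.normal(scale=sigma_eps, size=(n_clusters, cluster_size))
    y = X @ beta + b + eps
    return X, y


def within_cluster_ols(X, y):
    """Estimate beta after subtracting cluster means, which cancels b_i exactly."""
    Xc = X - X.mean(axis=1, keepdims=True)
    yc = y - y.mean(axis=1, keepdims=True)
    p = X.shape[-1]
    beta_hat, *_ = np.linalg.lstsq(Xc.reshape(-1, p), yc.reshape(-1), rcond=None)
    return beta_hat


rng = np.random.default_rng(0)
beta_true = np.array([1.0, 0.0, -2.0])  # hypothetical fixed effects
X, y = simulate_lmm(200, 5, beta_true, sigma_b=2.0, sigma_eps=1.0, rng=rng)
beta_hat = within_cluster_ols(X, y)
print(np.round(beta_hat, 2))
```

Centering works here only because the random effect enters additively and is constant within a cluster. With random slopes or more covariates than observations, this simple device is no longer sufficient, which is the regime the article targets.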

MSC:

62J05 Linear regression; mixed models
62F03 Parametric hypothesis testing
62H15 Hypothesis testing in multivariate analysis

Software:

MMS
