
Variable selection in multivariate regression models with measurement error in covariates. (English) Zbl 07846373

Summary: Multivariate regression models are widely used to analyze data with multi-dimensional response variables. Their use is, however, impeded by measurement error and spurious variables. Although data with both features are common in applications, little work has addressed them jointly. In this article, we consider variable selection under multivariate regression models with covariates subject to measurement error. To gain flexibility, we allow the dimensions of the covariate and response variables to be either fixed or diverging as the sample size increases. We propose a new regularized method that handles variable selection and measurement error effects simultaneously for error-contaminated data. The proposed penalized bias-corrected least squares method allows the penalty function to be chosen from a class of functions with different features. Importantly, the method does not require full distributional assumptions for the associated variables, which broadens its applicability. We rigorously establish theoretical results and describe a computationally efficient procedure for the proposed method. Numerical studies confirm the satisfactory finite-sample performance of the proposed method and demonstrate the deleterious effects of ignoring measurement error in inferential procedures.
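
Illustration (not from the paper): the summary does not spell out the estimator, so the following is only a minimal sketch of the general bias-correction idea. It assumes a classical additive error model, W = X + U, with a known error covariance `Sigma_u`, and it omits the penalty term entirely. The function names, the simulated design, and the parameter values are all hypothetical choices made for this example.

```python
import numpy as np

def simulate(n=5000, p=4, q=2, sigma_u=0.7, seed=0):
    """Simulate a multivariate regression whose covariates are observed with error.

    Assumption for this sketch: classical additive error W = X + U with
    U ~ N(0, sigma_u^2 I), independent of X and of the regression noise.
    """
    rng = np.random.default_rng(seed)
    B = np.zeros((p, q))          # sparse coefficient matrix: rows 2 and 3 are spurious
    B[0] = [2.0, -1.0]
    B[1] = [0.0, 1.5]
    X = rng.normal(size=(n, p))   # true (unobserved) covariates
    Y = X @ B + 0.5 * rng.normal(size=(n, q))
    Sigma_u = sigma_u**2 * np.eye(p)
    W = X + rng.normal(scale=sigma_u, size=(n, p))  # error-contaminated covariates
    return W, Y, B, Sigma_u

def naive_ls(W, Y):
    """Ordinary least squares that ignores the measurement error in W."""
    return np.linalg.solve(W.T @ W, W.T @ Y)

def corrected_ls(W, Y, Sigma_u):
    """Bias-corrected least squares: subtract the error covariance from W'W/n."""
    n = W.shape[0]
    return np.linalg.solve(W.T @ W / n - Sigma_u, W.T @ Y / n)

W, Y, B_true, Sigma_u = simulate()
err_naive = np.abs(naive_ls(W, Y) - B_true).max()
err_corrected = np.abs(corrected_ls(W, Y, Sigma_u) - B_true).max()
print(f"max abs error, naive:     {err_naive:.3f}")
print(f"max abs error, corrected: {err_corrected:.3f}")
```

In this setup the naive fit shrinks the nonzero coefficients toward zero, while subtracting `Sigma_u` removes that attenuation. A penalized version would add a sparsity-inducing penalty to the corrected objective; that step is left out here because the summary does not specify it.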

MSC:

62Hxx Multivariate analysis
62H12 Estimation in multivariate analysis
62F12 Asymptotic properties of parametric estimators
