Empirical priors for prediction in sparse high-dimensional linear regression. (English) Zbl 1519.62014

Summary: In this paper, we adopt the familiar sparse, high-dimensional linear regression model and focus on the important but often overlooked task of prediction. In particular, we consider a new empirical Bayes framework that incorporates the data into the prior in two ways: first, to center the prior for the non-zero regression coefficients, and second, to provide additional regularization. We show that, in certain settings, the asymptotic concentration of the proposed empirical Bayes posterior predictive distribution is very fast, and we establish a Bernstein-von Mises theorem which ensures that the derived empirical Bayes prediction intervals achieve the target frequentist coverage probability. The empirical prior has a convenient conjugate form, so posterior computations are relatively simple and fast. Finally, our numerical results demonstrate the proposed method's strong finite-sample performance in terms of prediction accuracy, uncertainty quantification, and computation time compared to existing Bayesian methods.
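To make the conjugate structure concrete, below is a minimal sketch, not the authors' ebreg implementation, of the single-model update that an empirical prior of this kind builds on: for a fixed candidate support S, the coefficient prior is centered at the least-squares fit, the likelihood is tempered as a form of regularization, and the resulting posterior (hence the predictive distribution at a new covariate vector) is Gaussian in closed form. The data dimensions, the tempering constant alpha, and the prior-precision constant gamma are illustrative assumptions, as is the exact prior covariance; the actual method also averages over candidate supports rather than fixing one, and is implemented in the authors' ebreg R package.

```python
# Minimal illustrative sketch under the assumptions stated above; not the authors' ebreg code.
import numpy as np

rng = np.random.default_rng(0)

# Simulated sparse regression data; sizes chosen only for illustration.
n, p, s = 100, 200, 5
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:s] = 2.0
sigma = 1.0
y = X @ beta_true + sigma * rng.standard_normal(n)

# Fix one candidate support S; the full method samples supports from their posterior.
S = np.arange(s)
XS = X[:, S]
alpha, gamma = 0.99, 0.005   # assumed tempering and prior-precision constants

G = XS.T @ XS
beta_hat = np.linalg.solve(G, XS.T @ y)   # least-squares fit that centers the empirical prior

# With a N(beta_hat, sigma^2/gamma * G^{-1}) prior and the likelihood raised to the power
# alpha, the conjugate posterior for beta_S is N(beta_hat, sigma^2/(alpha + gamma) * G^{-1}).
post_cov = (sigma**2 / (alpha + gamma)) * np.linalg.inv(G)

# Gaussian posterior predictive for a new response at covariate vector x_new.
x_new = rng.standard_normal(p)
pred_mean = x_new[S] @ beta_hat
pred_var = sigma**2 + x_new[S] @ post_cov @ x_new[S]

# Approximate 95% prediction interval.
half_width = 1.96 * np.sqrt(pred_var)
print(f"95% prediction interval: ({pred_mean - half_width:.2f}, {pred_mean + half_width:.2f})")
```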

MSC:

62J05 Linear regression; mixed models
62F15 Bayesian inference
