Quantile graphical models: a Bayesian approach. (English) Zbl 1498.62114

Summary: Graphical models are ubiquitous tools for describing the interdependence between variables measured simultaneously, such as large-scale gene or protein expression data. Gaussian graphical models (GGMs) are well-established tools for probabilistic exploration of dependence structures using precision matrices, and they are generated under a multivariate normal joint distribution. However, they suffer from several shortcomings because they are based on Gaussian distribution assumptions. In this article, we propose a Bayesian quantile-based approach for sparse estimation of graphs. We demonstrate that the resulting graph estimation is robust to outliers and applicable under general distributional assumptions. Furthermore, we develop efficient variational Bayes approximations to scale the methods to large data sets. Our methods are applied to a novel cancer proteomics dataset wherein multiple proteomic antibodies are simultaneously assessed on tumor samples using reverse-phase protein array (RPPA) technology.
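
The robustness to outliers claimed in the summary can be illustrated with the check (pinball) loss that underlies standard quantile regression. The following is a minimal sketch under stated assumptions, not the authors' method: the summary does not specify the loss used, and the function names `check_loss` and `best_constant` and the toy data are hypothetical, introduced only for illustration.

```python
def check_loss(u, tau):
    """Check (pinball) loss: rho_tau(u) = u * (tau - 1[u < 0])."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def best_constant(ys, tau):
    """Return the observed value that minimizes the total check loss.

    Assumption for this sketch: we search only over the observed values,
    which is enough because a sample quantile is always attained at one.
    """
    return min(ys, key=lambda q: sum(check_loss(y - q, tau) for y in ys))


# Toy data (hypothetical) with one gross outlier.
data = [1.0, 2.0, 3.0, 4.0, 100.0]

median_fit = best_constant(data, 0.5)  # tau = 0.5 targets the median
mean_fit = sum(data) / len(data)       # least-squares fit, for comparison

print(median_fit, mean_fit)
```

On this toy sample the median-type fit stays near the bulk of the data while the mean is pulled toward the outlier, which is the intuition behind using quantile-based criteria when Gaussian assumptions are in doubt.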

MSC:

62H22 Probabilistic graphical models
68T05 Learning and adaptive systems in artificial intelligence

Software:

HdBCS

References:

[1] Rehan Akbani, Patrick Kwok Shing Ng, Henrica MJ Werner, Maria Shahmoradgoli, Fan Zhang, Zhenlin Ju, Wenbin Liu, Ji-Yeon Yang, Kosuke Yoshihara, Jun Li, et al. A pancancer proteomic perspective on the cancer genome atlas. Nature communications, 5(1):1-15, 2014.
[2] Joshua Angrist, Victor Chernozhukov, and Iván Fernández-Val. Quantile regression under misspecification, with an application to the US wage structure. Econometrica, 74(2):539-563, 2006. · Zbl 1145.62399
[3] Aliye Atay-Kayis and Hélène Massam. A monte carlo method for computing the marginal likelihood in nondecomposable gaussian graphical models. Biometrika, 92(2):317-335, 2005. · Zbl 1094.62028
[4] John Barnard, Robert McCulloch, and Xiao-Li Meng. Modeling covariance matrices in terms of standard deviations and correlations, with application to shrinkage. Statistica Sinica, pages 1281-1311, 2000. · Zbl 0980.62045
[5] Matthew James Beal et al. Variational algorithms for approximate Bayesian inference. University of London, London, 2003.
[6] Alexandre Belloni, Mingli Chen, and Victor Chernozhukov. Quantile graphical models: Prediction and conditional independence with applications to financial risk management. Technical report, 2016.
[7] José M Bernardo. Expected information as expected utility. The Annals of Statistics, pages 686-690, 1979. · Zbl 0407.62002
[8] Stephen P Brooks, Paolo Giudici, and Gareth O Roberts. Efficient construction of reversible jump markov chain monte carlo proposal distributions. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 65(1):3-39, 2003. · Zbl 1063.62120
[9] Victor Chernozhukov and Han Hong. An mcmc approach to classical estimation. Journal of Econometrics, 115(2):293-346, 2003. · Zbl 1043.62022
[10] Arthur P Dempster. Covariance selection. Biometrics, pages 157-175, 1972.
[11] Adrian Dobra, Chris Hans, Beatrix Jones, Joseph R Nevins, Guang Yao, and Mike West. Sparse graphical models for exploring gene expression data. Journal of Multivariate Analysis, 90(1):196-212, 2004. · Zbl 1047.62104
[12] Michael Finegold and Mathias Drton. Robust graphical modeling of gene networks using classical and alternative t-distributions. The Annals of Applied Statistics, pages 1057-1080, 2011. · Zbl 1232.62083
[13] Jerome Friedman, Trevor Hastie, and Robert Tibshirani. Sparse inverse covariance estimation with the graphical lasso. Biostatistics, 9(3):432-441, 2008. · Zbl 1143.62076
[14] Nir Friedman. Inferring cellular networks using probabilistic graphical models. Science, 303(5659):799-805, 2004.
[15] Edward I George and Robert E McCulloch. Variable selection via gibbs sampling. Journal of the American Statistical Association, 88(423):881-889, 1993.
[16] Subhashis Ghosal, Jayanta K Ghosh, Aad W Van Der Vaart, et al. Convergence rates of posterior distributions. Annals of Statistics, 28(2):500-531, 2000. · Zbl 1105.62315
[17] P Giudici. Learning in graphical gaussian models. Bayesian Statistics, 5:621-628, 1996.
[18] Paolo Giudici and PJ Green. Decomposable graphical gaussian model determination. Biometrika, 86(4):785-801, 1999. · Zbl 0940.62019
[19] RA Hilger, ME Scheulen, and D Strumberg. The ras-raf-mek-erk pathway in the treatment of cancer. Oncology Research and Treatment, 25(6):511-518, 2002.
[20] Wenxin Jiang. Bayesian variable selection for high dimensional generalized linear models: convergence rates of the fitted densities. Technical Report, Dept. Statistics, Northwestern Univ, 05-02, 2005.
[21] Wenxin Jiang. Bayesian variable selection for high dimensional generalized linear models: convergence rates of the fitted densities. The Annals of Statistics, 35(4):1487-1511, 2007. · Zbl 1123.62026
[22] Bas JK Kleijn, Aad W van der Vaart, et al. Misspecification in infinite-dimensional bayesian statistics. The Annals of Statistics, 34(2):837-877, 2006. · Zbl 1095.62031
[23] R Koenker and G Bassett Jr. Regression quantiles. Econometrica: Journal of the Econometric Society, 46(1), 1978.
[24] Roger Koenker. Quantile regression for longitudinal data. Journal of Multivariate Analysis, 91(1):74-89, 2004. · Zbl 1051.62059
[25] Samuel Kotz and Saralees Nadarajah. Multivariate t-distributions and their applications. Cambridge University Press, 2004. · Zbl 1100.62059
[26] Hideo Kozumi and Genya Kobayashi. Gibbs sampling methods for bayesian quantile regression. Journal of statistical computation and simulation, 81(11):1565-1578, 2011. · Zbl 1431.62018
[27] Lynn Kuo and Bani Mallick. Variable selection for regression models. Sankhyā: The Indian Journal of Statistics, Series B, pages 65-81, 1998. · Zbl 0972.62016
[28] Steffen L Lauritzen. Graphical models, volume 17. Clarendon Press, 1996. · Zbl 0907.62001
[29] Qing Li, Ruibin Xi, Nan Lin, et al. Bayesian regularized quantile regression. Bayesian Analysis, 5(3):533-556, 2010. · Zbl 1330.62143
[30] John C Liechty, Merrill W Liechty, and Peter Müller. Bayesian correlation estimation. Biometrika, 91(1):1-14, 2004. · Zbl 1132.62314
[31] Han Liu, Fang Han, Ming Yuan, John Lafferty, Larry Wasserman, et al. High-dimensional semiparametric gaussian copula graphical models. The Annals of Statistics, 40(4):2293-2326, 2012. · Zbl 1297.62073
[32] Bani K Mallick, David Gold, and Veerabhadran Baladandayuthapani. Bayesian analysis of gene expression data, volume 131. Wiley Online Library, 2009.
[33] Nicolai Meinshausen and Peter Bühlmann. High-dimensional graphs and variable selection with the lasso. The Annals of Statistics, 34(3):1436-1462, 2006. · Zbl 1113.62082
[34] Sarah E Neville, John T Ormerod, and MP Wand. Mean field variational bayes for continuous sparse signal shrinkage: pitfalls and remedies. Electronic Journal of Statistics, 8(1):1113-1151, 2014. · Zbl 1298.62050
[35] Jie Peng, Pei Wang, Nengfeng Zhou, and Ji Zhu. Partial correlation estimation by joint sparse regression models. Journal of the American Statistical Association, 104(486):735-746, 2009. · Zbl 1388.62046
[36] Alberto Roverato. Cholesky decomposition of a hyper inverse wishart matrix. Biometrika, 87(1):99-112, 2000. · Zbl 0974.62047
[37] Jeong Seon Ryu, Azra Memon, and Seul-Ki Lee. Ercc1 and personalized medicine in lung cancer. Annals of translational medicine, 2(4), 2014.
[38] Juliane Schäfer and Korbinian Strimmer. A shrinkage approach to large-scale covariance matrix estimation and implications for functional genomics. Statistical applications in genetics and molecular biology, 4(1), 2005.
[39] James G Scott and James O Berger. Bayes and empirical-bayes multiplicity adjustment in the variable-selection problem. The Annals of Statistics, pages 2587-2619, 2010. · Zbl 1200.62020
[40] James G Scott and Carlos M Carvalho. Feature-inclusion stochastic search for gaussian graphical models. Journal of Computational and Graphical Statistics, 17(4):790-808, 2008.
[41] Eran Segal, Michael Shapira, Aviv Regev, Dana Pe’er, David Botstein, Daphne Koller, and Nir Friedman. Module networks: identifying regulatory modules and their condition-specific regulators from gene expression data. Nature genetics, 34(2):166-176, 2003.
[42] Karthik Sriram, RV Ramamoorthi, and Pulak Ghosh. Posterior consistency of bayesian quantile regression based on the misspecified asymmetric laplace density. Bayesian Analysis, 8(2):479-504, 2013. · Zbl 1329.62308
[43] Matthew P Wand, John T Ormerod, Simone A Padoan, Rudolf Frühwirth, et al. Mean field variational bayes for elaborate distributions. Bayesian Analysis, 6(4):847-900, 2011. · Zbl 1330.62158
[44] Frederick Wong, Christopher K Carter, and Robert Kohn. Efficient estimation of covariance selection models. Biometrika, 90(4):809-830, 2003. · Zbl 1436.62346
[45] Yunwen Yang, Huixia Judy Wang, and Xuming He. Posterior inference in bayesian quantile regression with asymmetric laplace likelihood. International Statistical Review, 84(3):327-344, 2016. · Zbl 07763523
[46] Yasushi Yatabe, Takashi Takahashi, and Tetsuya Mitsudomi. Epidermal growth factor receptor gene amplification is acquired in association with tumor progression of egfr-mutated lung cancer. Cancer research, 68(7):2106-2111, 2008.
[47] Ming Yuan and Yi Lin. Model selection and estimation in the gaussian graphical model. Biometrika, 94(1):19-35, 2007. · Zbl 1142.62408
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.