Bayesian multi-task variable selection with an application to differential DAG analysis. (English) Zbl 07862304

Summary: We study the Bayesian multi-task variable selection problem, where the goal is to select activated variables for multiple related datasets simultaneously. We propose a new variational Bayes algorithm which generalizes and improves the recently developed “sum of single effects” model of Wang et al. Motivated by differential gene network analysis in biology, we further extend our method to joint structure learning of multiple directed acyclic graphical models, a problem known to be computationally highly challenging. We propose a novel order MCMC sampler where our multi-task variable selection algorithm is used to quickly evaluate the posterior probability of each ordering. Both simulation studies and real gene expression data analysis are conducted to show the efficiency of our method. Finally, we also prove a posterior consistency result for multi-task variable selection, which provides a theoretical guarantee for the proposed algorithms. Supplementary materials for this article are available online.

MSC:

62-XX Statistics

References:

[1] Agrawal, R., Uhler, C., and Broderick, T. (2018), “Minimal I-MAP MCMC for Scalable Structure Discovery in Causal DAG Models,” in International Conference on Machine Learning, pp. 89-98.
[2] Blei, D. M., Kucukelbir, A., and McAuliffe, J. D. (2017), “Variational Inference: A Review for Statisticians,” Journal of the American Statistical Association, 112, 859-877.
[3] Bonilla, E. V., Chai, K., and Williams, C. (2007), “Multi-Task Gaussian Process Prediction,” in Advances in Neural Information Processing Systems (Vol. 20).
[4] Carbonetto, P., and Stephens, M. (2012), “Scalable Variational Inference for Bayesian Variable Selection in Regression, and its Accuracy in Genetic Association Studies,” Bayesian Analysis, 7, 73-108. · Zbl 1330.62089
[5] Castelletti, F., La Rocca, L., Peluso, S., Stingo, F. C., and Consonni, G. (2020), “Bayesian Learning of Multiple Directed Networks from Observational Data,” Statistics in Medicine, 39, 4745-4766.
[6] Chen, X., Sun, H., Ellington, C., Xing, E., and Song, L. (2021), “Multi-Task Learning of Order-Consistent Causal Graphs,” in Advances in Neural Information Processing Systems (Vol. 34).
[7] Chickering, D. M. (2002), “Optimal Structure Identification with Greedy Search,” Journal of Machine Learning Research, 3, 507-554. · Zbl 1084.68519
[8] Danaher, P., Wang, P., and Witten, D. M. (2014), “The Joint Graphical Lasso for Inverse Covariance Estimation Across Multiple Classes,” Journal of the Royal Statistical Society, Series B, 76, 373-397. · Zbl 07555455
[9] Ellis, B., and Wong, W. H. (2008), “Learning Causal Bayesian Network Structures from Experimental Data,” Journal of the American Statistical Association, 103, 778-789. · Zbl 1471.62056
[10] Fiers, M. W. E. J., Minnoye, L., Aibar, S., Bravo González-Blas, C., Kalender Atak, Z., and Aerts, S. (2018), “Mapping Gene Regulatory Networks from Single-Cell Omics Data,” Briefings in Functional Genomics, 17, 246-254.
[11] Fronk, E.-M., and Giudici, P. (2004), “Markov Chain Monte Carlo Model Selection for DAG Models,” Statistical Methods and Applications, 13, 259-273. · Zbl 1175.62005
[12] George, E. I., and McCulloch, R. E. (1993), “Variable Selection via Gibbs Sampling,” Journal of the American Statistical Association, 88, 881-889.
[13] Ghoshal, A., Bello, K., and Honorio, J. (2019), “Direct Learning with Guarantees of the Difference DAG between Structural Equation Models,” arXiv preprint arXiv:1906.12024.
[14] Gonçalves, A., Von Zuben, F. J., and Banerjee, A. (2016), “Multi-Task Sparse Structure Learning with Gaussian Copula Models,” The Journal of Machine Learning Research, 17, 1205-1234. · Zbl 1360.68675
[15] Guo, S., Zoeter, O., and Archambeau, C. (2011), “Sparse Bayesian Multi-Task Learning,” in Advances in Neural Information Processing Systems (Vol. 24).
[16] Harris, N., and Drton, M. (2013), “PC Algorithm for Nonparanormal Graphical Models,” Journal of Machine Learning Research, 14, 3365-3383. · Zbl 1318.62197
[17] Hernández-Lobato, D., Hernández-Lobato, J. M., and Ghahramani, Z. (2015), “A Probabilistic Model for Dirty Multi-Task Feature Selection,” in International Conference on Machine Learning, pp. 1073-1082, PMLR.
[18] Huang, X., Wang, J., and Liang, F. (2016), “A Variational Algorithm for Bayesian Variable Selection,” arXiv preprint arXiv:1602.07640.
[19] Ishwaran, H., and Rao, J. S. (2005), “Spike and Slab Variable Selection: Frequentist and Bayesian Strategies,” The Annals of Statistics, 33, 730-773. · Zbl 1068.62079
[20] Jalali, A., Sanghavi, S., Ruan, C., and Ravikumar, P. (2010), “A Dirty Model for Multi-Task Learning,” in Advances in Neural Information Processing Systems (Vol. 23).
[21] Jeong, S., and Ghosal, S. (2021), “Unified Bayesian Theory of Sparse Linear Regression with Nuisance Parameters,” Electronic Journal of Statistics, 15, 3040-3111. · Zbl 1471.62284
[22] Jiang, W., and Tanner, M. A. (2008), “Gibbs Posterior for Variable Selection in High-Dimensional Classification and Data Mining,” The Annals of Statistics, 36, 2207-2231. · Zbl 1274.62227
[23] Johnson, V. E., and Rossell, D. (2012), “Bayesian Model Selection in High-Dimensional Settings,” Journal of the American Statistical Association, 107, 649-660. · Zbl 1261.62024
[24] Kalisch, M., Mächler, M., Colombo, D., Maathuis, M. H., and Bühlmann, P. (2012), “Causal Inference Using Graphical Models with the R Package pcalg,” Journal of Statistical Software, 47, 1-26.
[25] Kanehisa, M., Goto, S., Sato, Y., Furumichi, M., and Tanabe, M. (2012), “KEGG for Integration and Interpretation of Large-Scale Molecular Data Sets,” Nucleic Acids Research, 40, D109-D114.
[26] Koller, D., and Friedman, N. (2009), Probabilistic Graphical Models: Principles and Techniques, Cambridge, MA: MIT Press. · Zbl 1183.68483
[27] Kuipers, J., and Moffa, G. (2017), “Partition MCMC for Inference on Acyclic Digraphs,” Journal of the American Statistical Association, 112, 282-299.
[28] Kuipers, J., Suter, P., and Moffa, G. (2022), “Efficient Sampling and Structure Learning of Bayesian Networks,” Journal of Computational and Graphical Statistics, 31, 639-650. · Zbl 07633197
[29] Lee, K., and Cao, X. (2022), “Bayesian Joint Inference for Multiple Directed Acyclic Graphs,” Journal of Multivariate Analysis, 191, 105003. · Zbl 1520.62053
[30] Li, Y., Liu, D., Li, T., and Zhu, Y. (2020), “Bayesian Differential Analysis of Gene Regulatory Networks Exploiting Genetic Perturbations,” BMC Bioinformatics, 21, 1-13.
[31] Liu, J., Sun, W., and Liu, Y. (2019), “Joint Skeleton Estimation of Multiple Directed Acyclic Graphs for Heterogeneous Population,” Biometrics, 75, 36-47. · Zbl 1436.62592
[32] Lounici, K., Pontil, M., Tsybakov, A. B., and Van De Geer, S. (2009), “Taking Advantage of Sparsity in Multi-Task Learning,” arXiv preprint arXiv:0903.1468.
[33] Lounici, K., Pontil, M., Van De Geer, S., and Tsybakov, A. B. (2011), “Oracle Inequalities and Optimal Inference Under Group Sparsity,” The Annals of Statistics, 39, 2164-2204. · Zbl 1306.62156
[34] Meinshausen, N., and Bühlmann, P. (2010), “Stability Selection,” Journal of the Royal Statistical Society, Series B, 72, 417-473. · Zbl 1411.62142
[35] Nandy, P., Hauser, A., and Maathuis, M. H. (2018), “High-Dimensional Consistency in Score-based and Hybrid Structure Learning,” The Annals of Statistics, 46, 3151-3183. · Zbl 1411.62144
[36] Narisetty, N. N., and He, X. (2014), “Bayesian Variable Selection with Shrinking and Diffusing Priors,” The Annals of Statistics, 42, 789-817. · Zbl 1302.62158
[37] Niu, X., Sun, Y., and Sun, J. (2018), “Latent Group Structured Multi-Task Learning,” in 2018 52nd Asilomar Conference on Signals, Systems, and Computers, pp. 850-854, IEEE.
[38] Ormerod, J. T., You, C., and Müller, S. (2017), “A Variational Bayes Approach to Variable Selection,” Electronic Journal of Statistics, 11, 3549-3594. · Zbl 1384.62240
[39] Oyen, D., and Lane, T. (2012), “Leveraging Domain Knowledge in Multitask Bayesian Network Structure Learning,” in Twenty-Sixth AAAI Conference on Artificial Intelligence.
[40] Peterson, C., Stingo, F. C., and Vannucci, M. (2015), “Bayesian Inference of Multiple Gaussian Graphical Models,” Journal of the American Statistical Association, 110, 159-174. · Zbl 1373.62106
[41] Peterson, C. B., and Stingo, F. C. (2021), “Bayesian Estimation of Single and Multiple Graphs,” in Handbook of Bayesian Variable Selection, eds. Tadesse, M. G. and Vannucci, M., pp. 327-348, New York: Chapman and Hall/CRC. · Zbl 07533207
[42] Peterson, C. B., Osborne, N., Stingo, F. C., Bourgeat, P., Doecke, J. D., and Vannucci, M. (2020), “Bayesian Modeling of Multiple Structural Connectivity Networks During the Progression of Alzheimer’s Disease,” Biometrics, 76, 1120-1132. · Zbl 1520.62306
[43] Ray, K., and Szabó, B. (2021), “Variational Bayes for High-Dimensional Linear Regression with Sparse Priors,” Journal of the American Statistical Association, 117, 1270-1281. · Zbl 1506.62341
[44] Ray, K., and Szabó, B. (2022), “Variational Bayes for High-Dimensional Linear Regression with Sparse Priors,” Journal of the American Statistical Association, 117, 1270-1281. · Zbl 1506.62341
[45] Shaddox, E., Peterson, C. B., Stingo, F. C., Hanania, N. A., Cruickshank-Quinn, C., Kechris, K., Bowler, R., and Vannucci, M. (2020), “Bayesian Inference of Networks Across Multiple Sample Groups and Data Types,” Biostatistics, 21, 561-576.
[46] Spirtes, P., Glymour, C. N., Scheines, R., and Heckerman, D. (2000), Causation, Prediction, and Search, Cambridge, MA: MIT Press.
[47] Tothill, R. W., Tinker, A. V., George, J., Brown, R., Fox, S. B., Lade, S., Johnson, D. S., Trivett, M. K., Etemadmoghadam, D., Locandro, B., et al. (2008), “Novel Molecular Subtypes of Serous and Endometrioid Ovarian Cancer Linked to Clinical Outcome,” Clinical Cancer Research, 14, 5198-5208.
[48] Van de Sande, B., Flerin, C., Davie, K., De Waegeneer, M., Hulselmans, G., Aibar, S., Seurinck, R., Saelens, W., Cannoodt, R., Rouchon, Q., Verbeiren, T., De Maeyer, D., Reumers, J., Saeys, Y., and Aerts, S. (2020), “A Scalable SCENIC Workflow for Single-Cell Gene Regulatory Network Analysis,” Nature Protocols, 15, 2247-2276.
[49] Wang, G., Sarkar, A., Carbonetto, P., and Stephens, M. (2020), “A Simple New Approach to Variable Selection in Regression, with Application to Genetic Fine Mapping,” Journal of the Royal Statistical Society, Series B, 82, 1273-1300. · Zbl 07554792
[50] Wang, Y., Segarra, S., and Uhler, C. (2020b), “High-Dimensional Joint Estimation of Multiple Directed Gaussian Graphical Models,” Electronic Journal of Statistics, 14, 2439-2483. · Zbl 1445.62046
[51] Yajima, M., Telesca, D., Ji, Y., and Müller, P. (2015), “Detecting Differential Patterns of Interaction in Molecular Pathways,” Biostatistics, 16, 240-251.
[52] Yang, Y., Wainwright, M. J., and Jordan, M. I. (2016), “On the Computational Complexity of High-Dimensional Bayesian Variable Selection,” The Annals of Statistics, 44, 2497-2532. · Zbl 1359.62088
[53] Zhang, A., Zhang, G., Calhoun, V. D., and Wang, Y.-P. (2020), “Causal Brain Network in Schizophrenia by a Two-Step Bayesian Network Analysis,” in Medical Imaging 2020: Imaging Informatics for Healthcare, Research, and Applications (Vol. 11318), pp. 316-321, SPIE.
[54] Zhang, Y., and Yang, Q. (2021), “A Survey on Multi-Task Learning,” IEEE Transactions on Knowledge and Data Engineering, 34, 5586-5609.