×

The dual PC algorithm and the role of Gaussianity for structure learning of Bayesian networks. (English) Zbl 07734025

Summary: Learning the graphical structure of Bayesian networks is key to describing data-generating mechanisms in many complex applications but poses considerable computational challenges. Observational data can only identify the equivalence class of the directed acyclic graph underlying a Bayesian network model, and a variety of methods exist to tackle the problem. Under certain assumptions, the popular PC algorithm can consistently recover the correct equivalence class by reverse-engineering the conditional independence (CI) relationships holding in the variable distribution. The dual PC algorithm is a novel scheme to carry out the CI tests within the PC algorithm by leveraging the inverse relationship between covariance and precision matrices. By exploiting block matrix inversions we can also perform tests on partial correlations of complementary (or dual) conditioning sets. The multiple CI tests of the dual PC algorithm proceed by first considering marginal and full-order CI relationships and progressively moving to central-order ones. Simulation studies show that the dual PC algorithm outperforms the classic PC algorithm both in terms of run time and in recovering the underlying network structure, even in the presence of deviations from Gaussianity. Additionally, we show that the dual PC algorithm applies for Gaussian copula models, and demonstrate its performance in that setting.

MSC:

68T37 Reasoning under uncertainty in the context of artificial intelligence

Software:

TETRAD; pcalg

References:

[1] Andersson, S. A.; Madigan, D.; Perlman, M. D., A characterization of Markov equivalence classes for acyclic digraphs, Ann. Stat., 25, 505-541 (1997) · Zbl 0876.60095
[2] Angelopoulos, N.; Chatzipli, A.; Nangalia, J.; Maura, F.; Campbell, P. J., Bayesian networks elucidate complex genomic landscapes in cancer, Commun. Biol., 5, 1-12 (2022)
[3] Baba, K.; Shibata, R.; Sibuya, M., Partial correlation and conditional correlation as measures of conditional independence, Aust. N. Z. J. Stat., 46, 657-664 (2004) · Zbl 1061.62086
[4] Banf, M.; Rhee, S. Y., Computational inference of gene regulatory networks: approaches, limitations and opportunities, Biochim. Biophys. Acta, Gene Regul. Mech., 1860, 41-52 (2017)
[5] Bird, J. C.; Evans, R.; Waite, F.; Loe, B. S.; Freeman, D., Adolescent paranoia: prevalence, structure, and causal mechanisms, Schizophr. Bull., 45, 5, 1134-1142 (2018)
[6] Chakraborty, S.; Shojaie, A., Nonparametric causal structure learning in high dimensions (2021)
[7] Chaturvedi, I.; Ragusa, E.; Gastaldo, P.; Zunino, R.; Cambria, E., Bayesian network based extreme learning machine for subjectivity detection, J. Franklin Inst., 355, 4, 1780-1797 (2018), Special issue on recent advances in machine learning for signal analysis and processing · Zbl 1395.68271
[8] Chickering, D. M., Optimal structure identification with greedy search, Journal of Machine Learning Research, 3, 507-554 (2003) · Zbl 1084.68519
[9] Chickering, D. M.; Heckerman, D.; Meek, C., Large-sample learning of Bayesian networks is NP-hard, Journal of Machine Learning Research, 5, 1287-1330 (2004) · Zbl 1222.68169
[10] Colombo, D.; Maathuis, M. H., Order-independent constraint-based causal structure learning, Journal of Machine Learning Research, 15, 116, 3921-3962 (2014) · Zbl 1312.68165
[11] Constantinou, A. C.; Liu, Y.; Chobtham, K.; Guo, Z.; Kitson, N. K., Large-scale empirical validation of Bayesian network structure learning algorithms with noisy data, Int. J. Approx. Reason., 131, 151-188 (2021)
[12] Cox, D. R.; Small, N., Testing multivariate normality, Biometrika, 65, 263-272 (1978) · Zbl 0386.62041
[13] Cui, R.; Groot, P.; Heskes, T., Copula PC algorithm for causal discovery from mixed data, (Frasconi, P.; Landwehr, N.; Manco, G.; Vreeken, J., Machine Learning and Knowledge Discovery in Databases (2016), Springer International Publishing: Springer International Publishing Cham), 377-392
[14] de Campos, L. M.; Romero, A. E., Bayesian network models for hierarchical text classification from a thesaurus, Int. J. Approx. Reason., 50, 7, 932-944 (2009), Special section on graphical models and information retrieval
[15] Le Duy, T.; Hoang, T.; Li, J.; Liu, L.; Liu, H., A fast PC algorithm for high dimensional causal discovery with multi-core PCs (2015)
[16] Elwert, F., Graphical Causal Models, 245-273 (2013), Springer Netherlands: Springer Netherlands Dordrecht
[17] Friedman, N., Inferring cellular networks using probabilistic graphical models, Science, 303, 5659, 799-805 (2004)
[18] Friedman, N.; Linial, M.; Nachman, I.; Pe’er, D., Using Bayesian networks to analyze expression data, J. Comput. Biol., 7, 3-4, 601-620 (2000), PMID: 11108481
[19] Glymour, C.; Zhang, K.; Spirtes, P., Review of causal discovery methods based on graphical models, Front. Genet., 10, 524 (2019)
[20] Hawkins, D., Using U statistics to derive the asymptotic distribution of Fisher’s Z statistic, Am. Stat., 43, 235-237 (1989)
[21] Kalisch, M.; Bühlmann, P., Estimating high-dimensional directed acyclic graphs with the PC-algorithm, Journal of Machine Learning Research, 8, 613-636 (2007) · Zbl 1222.68229
[22] Kalisch, M.; Bühlmann, P., Robustification of the PC-algorithm for directed acyclic graphs, J. Comput. Graph. Stat., 17, 4, 773-789 (2008)
[23] Kalisch, M.; Mächler, M.; Colombo, D.; Maathuis, M.; Bühlmann, P., Causal inference using graphical models with the R package pcalg, J. Stat. Softw., 47, 11, 1-26 (2012)
[24] Khatri, C.; Rao, C. R., Characterizations of multivariate normality. I. Through independence of some statistics, J. Multivar. Anal., 6, 81-94 (1976) · Zbl 0367.62059
[25] Koller, D.; Friedman, N., Probabilistic Graphical Models: Principles and Techniques - Adaptive Computation and Machine Learning (2009), The MIT Press · Zbl 1183.68483
[26] Kuipers, J.; Suter, P.; Moffa, G., Efficient sampling and structure learning of Bayesian networks, J. Comput. Graph. Stat., 31, 639-650 (2022) · Zbl 07633197
[27] Kuipers, J.; Thurnherr, T.; Moffa, G.; Suter, P.; Behr, J.; Goosen, R.; Christofori, G.; Beerenwinkel, N., Mutational interactions define novel cancer subgroups, Nat. Commun., 9 (2018)
[28] Lauritzen, S. L., Graphical Models (1996), Oxford University Press · Zbl 0907.62001
[29] Liu, H.; Han, F.; Yuan, M.; Lafferty, J.; Wasserman, L., High-dimensional semiparametric Gaussian copula graphical models, Ann. Stat., 40, 2293-2326 (2012) · Zbl 1297.62073
[30] Liu, H.; Lafferty, J.; Wasserman, L., The nonparanormal: semiparametric estimation of high dimensional undirected graphs, J. Mach. Learn. Res., 10, 2295-2328 (2009) · Zbl 1235.62035
[31] Maathuis, M. H.; Kalisch, M.; Bühlmann, P., Estimating high-dimensional intervention effects from observational data, Ann. Stat., 37, 6A, 3133-3164 (2009) · Zbl 1191.62118
[32] Meek, C., Causal inference and causal explanation with background knowledge, (Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence. Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence, UAI’95 (1995), Morgan Kaufmann Publishers Inc.: Morgan Kaufmann Publishers Inc. San Francisco, CA, USA), 403-410
[33] Moffa, G.; Catone, G.; Kuipers, J.; Kuipers, E.; Freeman, D.; Marwaha, S.; Lennox, B. R.; Broome, M. R.; Bebbington, P., Using directed acyclic graphs in epidemiological research in psychosis: an analysis of the role of bullying in psychosis, Schizophr. Bull., 43, 6, 1273-1279 (2017)
[34] Moffa, G.; Kuipers, J.; Carrà, G.; Crocamo, C.; Kuipers, E.; Angermeyer, M.; Brugha, T.; Toumi, M.; Bebbington, P., Longitudinal symptomatic interactions in long-standing schizophrenia: a novel five-point analysis based on directed acyclic graphs, Psychol. Med., 1-8 (2021)
[35] Musella, F., A PC algorithm variation for ordinal variables, Comput. Stat., 28, 6, 2749-2759 (2013) · Zbl 1306.65113
[36] Neil, M.; Fenton, N.; Osman, M.; McLachlan, S., Bayesian network analysis of Covid-19 data reveals higher infection prevalence rates and lower fatality rates than widely reported, J. Risk Res., 23, 7-8, 866-879 (2020)
[37] Nelson, L. S., A dictionary of statistical terms, 5th ed., J. Qual. Technol., 23, 2, 167-168 (1991)
[38] Ojha, R.; Ghadge, A.; Tiwari, M. K.; Bititci, U. S., Bayesian network modelling for supply chain risk propagation, Int. J. Prod. Res., 56, 17, 5795-5819 (2018)
[39] Pearl, J., Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference (1988), Morgan Kaufmann Publishers Inc.: Morgan Kaufmann Publishers Inc. San Francisco, CA, USA
[40] Pearl, J., Causality: Models, Reasoning and Inference (2009), Cambridge University Press: Cambridge University Press USA · Zbl 1188.68291
[41] Pearl, J.; Geiger, D.; Verma, T., Conditional independence and its representations, Kybernetika, 25, 7, 33-44 (1989)
[42] Pearl, J.; Russel, S., Bayesian networks (2000), UCLA Cognitive Systems Laboratory, Technical report (R-277)
[43] Rios, F. L.; Moffa, G.; Kuipers, J., Benchpress: a scalable and platform-independent workflow for benchmarking structure learning algorithms for graphical models (2021)
[44] Robinson, R. W., Counting unlabeled acyclic digraphs, (Little, C. H.C., Combinatorial Mathematics V (1977), Springer Berlin Heidelberg), 28-43 · Zbl 0376.05031
[45] Silverstein, C.; Brin, S.; Motwani, R.; Ullman, J., Scalable techniques for mining causal structures, Data Min. Knowl. Discov., 4, 163-192 (2000)
[46] Sondhi, A.; Shojaie, A., The reduced PC-algorithm: improved causal structure learning in large random networks, J. Mach. Learn. Res., 20, 164, 1-31 (2019) · Zbl 1446.62256
[47] Spirtes, P.; Glymour, C.; Scheines, R., Causation, Prediction, and Search, vol. 81 (1993), Springer New York: Springer New York NY · Zbl 0806.62001
[48] Tsamardinos, I.; Brown, L.; Aliferis, C., The max-min hill-climbing Bayesian network structure learning algorithm, Mach. Learn., 65, 31-78 (2006) · Zbl 1470.68192
[49] Verma, T.; Pearl, J., Causal networks: semantics and expressiveness, (Proceedings of the Fourth Annual Conference on Uncertainty in Artificial Intelligence. Proceedings of the Fourth Annual Conference on Uncertainty in Artificial Intelligence, UAI ’88 (1988), North-Holland Publishing Co.: North-Holland Publishing Co. NLD), 69-78
[50] Verma, T.; Pearl, J., Equivalence and synthesis of causal models, (Proceedings of the Sixth Annual Conference on Uncertainty in Artificial Intelligence. Proceedings of the Sixth Annual Conference on Uncertainty in Artificial Intelligence, UAI ’90 (1990), Elsevier Science Inc.: Elsevier Science Inc. USA), 255-270
[51] Viinikka, J.; Hyttinen, A.; Pensar, J.; Koivisto, M., Towards scalable Bayesian learning of causal DAGs, (Larochelle, H.; Ranzato, M.; Hadsell, R.; Balcan, M. F.; Lin, H., Advances in Neural Information Processing Systems, vol. 33 (2020), Curran Associates, Inc.), 6584-6594
[52] Zhang, X.; Zhao, X.-M.; He, K.; Lu, L.; Cao, Y.; Liu, J.; Hao, J.-K.; Liu, Z.-P.; Chen, L., Inferring gene regulatory networks from gene expression data by path consistency algorithm based on conditional mutual information, Bioinformatics, 28, 1, 98-104 (2012)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.