×

A local method for identifying causal relations under Markov equivalence. (English) Zbl 07505977

Summary: Causality is important for designing interpretable and robust methods in artificial intelligence research. We propose a local approach to identify whether a variable is a cause of a given target under the framework of causal graphical models of directed acyclic graphs (DAGs). In general, the causal relation between two variables may not be identifiable from observational data as many causal DAGs encoding different causal relations are Markov equivalent. In this paper, we first introduce a sufficient and necessary graphical condition to check the existence of a causal path from a variable to a target in every Markov equivalent DAG. Next, we provide local criteria for identifying whether a variable is a cause/non-cause of a target based only on the local structure instead of the entire graph. Finally, we propose a local learning algorithm for this causal query via learning the local structure of the variable and some additional statistical independence tests related to the target. Simulation studies show that our local algorithm is efficient and effective, compared with other state-of-art methods.

MSC:

68Txx Artificial intelligence

References:

[1] Ali, A. R.; Richardson, T. S.; Spirtes, P.; Zhang, J., Towards characterizing Markov equivalence classes for directed acyclic graphs with latent variables, (Proceedings of the Twenty-First Conference on Uncertainty in Artificial Intelligence (2005), AUAI Press), 10-17
[2] Andersson, S. A.; Madigan, D.; Perlman, M. D., A characterization of Markov equivalence classes for acyclic digraphs, Ann. Stat., 25, 505-541 (1997) · Zbl 0876.60095
[3] Bengio, Y.; Deleu, T.; Rahaman, N.; Ke, N. R.; Lachapelle, S.; Bilaniuk, O.; Goyal, A.; Pal, C., A meta-transfer objective for learning to disentangle causal mechanisms, (International Conference on Learning Representations (2020))
[4] Bernstein, M.; Tetali, P., On sampling graphical Markov models (2017), arXiv e-prints
[5] Blair, J. R.S.; Peyton, B., An introduction to chordal graphs and clique trees, (Graph Theory and Sparse Matrix Computation (1993), Springer New York: Springer New York New York, NY), 1-29 · Zbl 0803.68081
[6] Chickering, D. M., Learning equivalence classes of Bayesian-network structures, J. Mach. Learn. Res., 2, 445-498 (2002) · Zbl 1007.68179
[7] Chickering, D. M., Optimal structure identification with greedy search, J. Mach. Learn. Res., 3, 507-554 (2002) · Zbl 1084.68519
[8] Claassen, T.; Heskes, T., A logical characterization of constraint-based causal discovery, (Proceedings of the Twenty-Seventh Conference on Uncertainty in Artificial Intelligence (2011), AUAI Press), 135-144
[9] Cohen, J., A coefficient of agreement for nominal scales, Educ. Psychol. Meas., 20, 37-46 (1960)
[10] Colombo, D.; Maathuis, M. H., Order-independent constraint-based causal structure learning, J. Mach. Learn. Res., 15, 3921-3962 (2014) · Zbl 1312.68165
[11] Cooper, G. F., A simple constraint-based algorithm for efficiently mining observational databases for causal relationships, Data Min. Knowl. Discov., 1, 203-224 (1997)
[12] DeLong, E. R.; DeLong, D. M.; Clarke-Pearson, D. L., Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach, Biometrics, 44, 837-845 (1988) · Zbl 0715.62207
[13] Entner, D.; Hoyer, P.; Spirtes, P., Data-driven covariate selection for nonparametric estimation of causal effects, (Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics. Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics, PMLR (2013)), 256-264
[14] Fang, Z.; He, Y., IDA with background knowledge, (Proceedings of the Thirty-Sixth Conference on Uncertainty in Artificial Intelligence. Proceedings of the Thirty-Sixth Conference on Uncertainty in Artificial Intelligence, PMLR (2020))
[15] Friedman, N.; Goldszmidt, M.; Wyner, A., Data analysis with Bayesian networks: a bootstrap approach, (Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence (1999), Morgan Kaufmann Publishers Inc.), 196-205
[16] Frydenberg, M., The chain graph Markov property, Scand. J. Stat., 17, 333-353 (1990) · Zbl 0713.60013
[17] Fu, S.; Desmarais, M. C., Markov blanket based feature selection: a review of past decade, (Proceedings of the World Congress on Engineering (2010), Newswood Ltd.), 321-328
[18] Gao, T.; Ji, Q., Local causal discovery of direct causes and effects, (Advances in Neural Information Processing Systems, vol. 28 (2015), Curran Associates, Inc.), 2512-2520
[19] Guo, F. R.; Perković, E., Minimal enumeration of all possible total effects in a Markov equivalence class (2020), arXiv e-prints
[20] Hauser, A.; Bühlmann, P., Characterization and greedy learning of interventional Markov equivalence classes of directed acyclic graphs, J. Mach. Learn. Res., 13, 2409-2464 (2012) · Zbl 1433.68346
[21] He, Y.; Geng, Z., Active learning of causal networks with intervention experiments and optimal designs, J. Mach. Learn. Res., 9, 2523-2547 (2008) · Zbl 1225.68184
[22] He, Y.; Jia, J.; Yu, B., Counting and exploring sizes of Markov equivalence classes of directed acyclic graphs, J. Mach. Learn. Res., 16, 2589-2609 (2015) · Zbl 1351.68211
[23] Kalisch, M.; Mächler, M.; Colombo, D.; Maathuis, M. H.; Bühlmann, P., Causal inference using graphical models with the R package pcalg, J. Stat. Softw., 47, 1-26 (2012)
[24] Koivisto, M.; Sood, K., Exact Bayesian structure discovery in Bayesian networks, J. Mach. Learn. Res., 5, 549-573 (2004) · Zbl 1222.68234
[25] Kusner, M. J.; Loftus, J.; Russell, C.; Silva, R., Counterfactual fairness, (Advances in Neural Information Processing Systems, vol. 30 (2017), Curran Associates, Inc.), 4066-4076
[26] Lauritzen, S. L.; Richardson, T. S., Chain graph models and their causal interpretations, J. R. Stat. Soc., Ser. B, Stat. Methodol., 64, 321-348 (2002) · Zbl 1090.62103
[27] Lauritzen, S. L.; Spiegelhalter, D. J., Local computations with probabilities on graphical structures and their application to expert systems, J. R. Stat. Soc., Ser. B, Stat. Methodol., 50, 157-224 (1988) · Zbl 0684.68106
[28] Liu, Y.; Cai, Z.; Liu, C.; Geng, Z., Local learning approaches for finding effects of a specified cause and their causal paths, ACM Trans. Intell. Syst. Technol., 10 (2019)
[29] Liu, Y.; Fang, Z.; He, Y.; Geng, Z., Collapsible IDA: collapsing parental sets for locally estimating possible causal effects, (Proceedings of the Thirty-Sixth Conference on Uncertainty in Artificial Intelligence. Proceedings of the Thirty-Sixth Conference on Uncertainty in Artificial Intelligence, PMLR (2020))
[30] Liu, Y.; Fang, Z.; He, Y.; Geng, Z.; Liu, C., Local causal network learning for finding pairs of total and direct effects, J. Mach. Learn. Res., 21, 1-37 (2020) · Zbl 1525.68119
[31] Maathuis, M. H.; Colombo, D., A generalized back-door criterion, Ann. Stat., 43, 1060-1088 (2015) · Zbl 1320.62157
[32] Maathuis, M. H.; Colombo, D.; Kalisch, M.; Bühlmann, P., Predicting causal effects in large-scale systems from observational data, Nat. Methods, 7, 247-248 (2010)
[33] Maathuis, M. H.; Kalisch, M.; Bühlmann, P., Estimating high-dimensional intervention effects from observational data, Ann. Stat., 37, 3133-3164 (2009) · Zbl 1191.62118
[34] Magliacane, S.; Claassen, T.; Mooij, J. M., Ancestral causal inference, (Lee, D.; Sugiyama, M.; Luxburg, U.; Guyon, I.; Garnett, R., Advances in Neural Information Processing Systems (2016), Curran Associates, Inc.), 4466-4474
[35] Mani, S.; Spirtes, P.; Cooper, G. F., A theoretical study of y structures for causal discovery, (Proceedings of the Twenty-Second Conference on Uncertainty in Artificial Intelligence (2006), AUAI Press), 314-323
[36] Meek, C., Causal inference and causal explanation with background knowledge, (Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence (1995), Morgan Kaufmann Publishers Inc.), 403-410
[37] Miller, T., Explanation in artificial intelligence: insights from the social sciences, Artif. Intell., 267, 1-38 (2019) · Zbl 1478.68274
[38] Mooij, J.; Claassen, T., Constraint-based causal discovery using partial ancestral graphs in the presence of cycles, (Proceedings of the Thirty-Sixth Conference on Uncertainty in Artificial Intelligence. Proceedings of the Thirty-Sixth Conference on Uncertainty in Artificial Intelligence, PMLR (2020))
[39] Nandy, P.; Maathuis, M. H.; Richardson, T. S., Estimating the effect of joint interventions from observational data in sparse high-dimensional settings, Ann. Stat., 45, 647-674 (2017) · Zbl 1426.62286
[40] Pearl, J., Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference (1988), Morgan Kaufmann Publishers Inc.: Morgan Kaufmann Publishers Inc. San Francisco, CA, USA
[41] Pearl, J., Causality (2009), Cambridge University Press · Zbl 1188.68291
[42] Pearl, J.; Geiger, D.; Verma, T., Conditional independence and its representations, Kybernetika, 25, 33-44 (1989)
[43] Perković, E., Identifying causal effects in maximally oriented partially directed acyclic graphs, (Proceedings of the Thirty-Sixth Conference on Uncertainty in Artificial Intelligence. Proceedings of the Thirty-Sixth Conference on Uncertainty in Artificial Intelligence, PMLR (2020))
[44] Perković, E.; Kalisch, M.; Maathuis, M. H., Interpreting and using CPDAGs with background knowledge, (Proceedings of the Thirty-Third Conference on Uncertainty in Artificial Intelligence (2017), AUAI Press)
[45] Peters, J.; Bühlmann, P., Identifiability of Gaussian structural equation models with equal error variances, Biometrika, 101, 219-228 (2013) · Zbl 1285.62005
[46] Peters, J.; Mooij, J. M.; Janzing, D.; Schölkopf, B., Causal discovery with continuous additive noise models, J. Mach. Learn. Res., 15, 2009-2053 (2014) · Zbl 1318.68151
[47] Richardson, T.; Spirtes, P., Ancestral graph Markov models, Ann. Stat., 30, 962-1030 (2002) · Zbl 1033.60008
[48] Rose, D. J.; Tarjan, R. E., Algorithmic Aspects of Vertex Elimination on Directed Graphs (1975), Technical Report, Stanford, CA, USA · Zbl 0382.05049
[49] Roumpelaki, A.; Borboudakis, G.; Triantafillou, S.; Tsamardinos, I., Marginal causal consistency in constraint-based causal learning, (Proceedings of the UAI 2016 Workshop on Causation: Foundation to Application (2016)), 39-47
[50] Sadeghi, K., Faithfulness of probability distributions and graphs, J. Mach. Learn. Res., 18, 1-29 (2017) · Zbl 1444.62079
[51] Shi, C.; Li, L., Testing mediation effects using logic of boolean matrices, J. Am. Stat. Assoc., 1-14 (2021)
[52] Shimizu, S.; Hoyer, P. O.; Hyvärinen, A.; Kerminen, A., A linear non-Gaussian acyclic model for causal discovery, J. Mach. Learn. Res., 7, 2003-2030 (2006) · Zbl 1222.68304
[53] Shimizu, S.; Inazumi, T.; Sogawa, Y.; Hyvärinen, A.; Kawahara, Y.; Washio, T.; Hoyer, P. O.; Bollen, K., Directlingam: a direct method for learning a linear non-Gaussian structural equation model, J. Mach. Learn. Res., 12, 1225-1248 (2011) · Zbl 1280.68195
[54] Singh, A.; Moore, A., Finding optimal Bayesian networks by dynamic programming (2005), Technical Report
[55] Spirtes, P.; Glymour, C., An algorithm for fast recovery of sparse causal graphs, Soc. Sci. Comput. Rev., 9, 62-72 (1991)
[56] Spirtes, P.; Glymour, C. N.; Scheines, R., Causation, Prediction, and Search (2000), MIT Press
[57] Tsamardinos, I.; Aliferis, C. F., Towards principled feature selection: relevancy, filters and wrappers, (Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics (2003), Morgan Kaufmann Publishers)
[58] Tsamardinos, I.; Aliferis, C. F.; Statnikov, A. R., Algorithms for large scale Markov blanket discovery, (Proceedings of the Sixteenth International Florida Artificial Intelligence Research Society Conference (2003), AAAI Press), 376-381
[59] Wang, C.; Zhou, Y.; Zhao, Q.; Geng, Z., Discovering and orienting the edges connected to a target variable in a DAG via a sequential local learning approach, Comput. Stat. Data Anal., 77, 252-266 (2014) · Zbl 1506.62184
[60] Witte, J.; Henckel, L.; Maathuis, M. H.; Didelez, V., On efficient adjustment in causal graphs, J. Mach. Learn. Res., 21, 1-45 (2020) · Zbl 1532.68086
[61] Wu, Y.; Zhang, L.; Wu, X., Counterfactual fairness: unidentification, bound and algorithm, (Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence. Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, IJCAI-19 (2019), International Joint Conferences on Artificial Intelligence Organization), 1438-1444
[62] Xiang, J.; Kim, S., A* lasso for learning a sparse Bayesian network structure for continuous variables, (Advances in Neural Information Processing Systems, vol. 26 (2013), Curran Associates, Inc.), 2418-2426
[63] Yuan, C.; Malone, B.; Wu, X., Learning optimal Bayesian networks using A* search, (Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence. Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence, IJCAI-11 (2011), International Joint Conferences on Artificial Intelligence Organization), 2186-2191
[64] Zhang, J., Causal Inference and Reasoning in Causally Insufficient Systems (2006), Carnegie Mellon University, Ph.D. thesis
[65] Zhang, J., On the completeness of orientation rules for causal discovery in the presence of latent confounders and selection bias, Artif. Intell., 172, 1873-1896 (2008) · Zbl 1184.68434
[66] Zhang, K.; Gong, M.; Stojanov, P.; Huang, B.; Liu, Q.; Glymour, C., Domain adaptation as a problem of inference on graphical models, (Advances in Neural Information Processing Systems, vol. 33 (2020), Curran Associates, Inc.), 4965-4976
[67] Zhang, K.; Hyvärinen, A., On the identifiability of the post-nonlinear causal model, (Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence (2009), AUAI Press)
[68] Zheng, X.; Aragam, B.; Ravikumar, P. K.; Xing, E. P., DAGs with no tears: continuous optimization for structure learning, (Advances in Neural Information Processing Systems, vol. 31 (2018), Curran Associates, Inc.), 9472-9483
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.