×

Two optimal strategies for active learning of causal models from interventional data. (English) Zbl 1390.68530

Summary: From observational data alone, a causal DAG is only identifiable up to Markov equivalence. Interventional data generally improves identifiability; however, the gain of an intervention strongly depends on the intervention target, that is, the intervened variables. We present active learning (that is, optimal experimental design) strategies calculating optimal interventions for two different learning goals. The first one is a greedy approach using single-vertex interventions that maximizes the number of edges that can be oriented after each intervention. The second one yields in polynomial time a minimum set of targets of arbitrary size that guarantees full identifiability. This second approach proves a conjecture of F. Eberhardt [“Almost optimal intervention sets for causal discovery”, in: Proceedings of the 24th conference on uncertainty in artificial intelligence, UAI 2008. Corvallis, OR: AUAI Press. 161–168 (2008), https://dslpitt.org/papers/08/p161-eberhardt.pdf] indicating the number of unbounded intervention targets which is sufficient and in the worst case necessary for full identifiability. In a simulation study, we compare our two active learning approaches to random interventions and an existing approach, and analyze the influence of estimation errors on the overall performance of active learning.

MSC:

68T05 Learning and adaptive systems in artificial intelligence
68T37 Reasoning under uncertainty in the context of artificial intelligence

Software:

pcalg; TETRAD

References:

[1] Eberhardt, F., Almost optimal intervention sets for causal discovery, (Proceedings of the 24th Conference on Uncertainty in Artificial Intelligence (UAI 2008) (2008)), 161-168
[2] Peters, J.; Mooij, J.; Janzing, D.; Schölkopf, B., Identifiability of causal graphs using functional models, (Proceedings of the 27th Conference on Uncertainty in Artificial Intelligence (UAI 2011) (2011)), 589-598
[3] Verma, T.; Pearl, J., On the equivalence of causal models, (Proceedings of the 6th Conference on Uncertainty in Artificial Intelligence (UAI 1990) (1990)), 220-227
[4] Andersson, S. A.; Madigan, D.; Perlman, M. D., A characterization of Markov equivalence classes for acyclic digraphs, Ann. Stat., 25, 505-541 (1997) · Zbl 0876.60095
[5] Spirtes, P.; Glymour, C. N.; Scheines, R., Causation, Prediction, and Search (2000), MIT Press
[6] Hauser, A.; Bühlmann, P., Characterization and greedy learning of interventional Markov equivalence classes of directed acyclic graphs, J. Mach. Learn. Res., 13, 2409-2464 (2012) · Zbl 1433.68346
[7] Hauser, A.; Bühlmann, P., Two optimal strategies for active learning of causal models from interventions, (Proceedings of the 6th European Workshop on Probabilistic Graphical Models (PGM 2012) (2012)), 123-130
[8] Settles, B., Active Learning (2012), Morgan & Claypool Publishers · Zbl 1270.68006
[9] He, Y.-B.; Geng, Z., Active learning of causal networks with intervention experiments and optimal designs, J. Mach. Learn. Res., 9, 2523-2547 (2008) · Zbl 1225.68184
[10] Meganck, S.; Leray, P.; Manderick, B., Learning causal Bayesian networks from observations and experiments: a decision theoretic approach, (Modeling Decisions for Artificial Intelligence (2006), Springer), 58-69 · Zbl 1235.68176
[11] Tong, S.; Koller, D., Active learning for structure in Bayesian networks, (17th International Joint Conference on Artificial Intelligence (IJCAI 2001) (2011)), 863-869
[12] Masegosa, A. R.; Moral, S., An interactive approach for Bayesian network learning using domain/expert knowledge, Int. J. Approx. Reason., 54, 1168-1181 (2013)
[13] Korb, K. B.; Hope, L. R.; Nicholson, A. E.; Axnick, K., Varieties of causal intervention, (Pacific Rim International Conference on Artificial Intelligence (PRICAI 2004) (2004), Springer: Springer New York), 322-331
[14] Pearl, J., Causal diagrams for empirical research, Biometrika, 82, 669-688 (1995) · Zbl 0860.62045
[15] Rose, D. J., Triangulated graphs and the elimination process, J. Math. Anal. Appl., 32, 597-609 (1970) · Zbl 0216.02602
[16] Rose, D. J.; Tarjan, R. E.; Lueker, G. S., Algorithmic aspects of vertex elimination on graphs, SIAM J. Comput., 5, 266-283 (1976) · Zbl 0353.65019
[17] Chvátal, V., Perfectly ordered graphs, Ann. Discrete Math., 21, 63-68 (1984) · Zbl 0559.05055
[18] Eberhardt, F., Causation and intervention (2007), Carnegie Mellon University, PhD thesis
[19] Cai, M.-C., On separating systems of graphs, Discrete Math., 49, 15-20 (1984) · Zbl 0498.05052
[20] Chickering, D. M., Optimal structure identification with greedy search, J. Mach. Learn. Res., 3, 507-554 (2002) · Zbl 1084.68519
[21] Brown, L. E.; Tsamardinos, I.; Aliferis, C. F., A comparison of novel and state-of-the-art polynomial bayesian network learning algorithms, (Proceedings of the 20th National Conference on Artificial Intelligence (AAAI 2005) (2005)), 739-745
[22] Kalisch, M.; Bühlmann, P., Estimating high-dimensional directed acyclic graphs with the PC-algorithm, J. Mach. Learn. Res., 8, 613-636 (2007) · Zbl 1222.68229
[23] Kaplan, E.; Meier, P., Nonparametric estimation from incomplete observations, J. Am. Stat. Assoc., 457-481 (1958) · Zbl 0089.14801
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.