On the (complete) reasons behind decisions. (English) Zbl 07727820

Summary: Recent work has shown that the input-output behavior of some common machine learning classifiers can be captured in symbolic form, allowing one to reason about the behavior of these classifiers using symbolic techniques. This includes explaining decisions, measuring robustness, and proving formal properties of machine learning classifiers by reasoning about the corresponding symbolic classifiers. In this work, we present a theory for unveiling the reasons behind the decisions made by Boolean classifiers and study some of its theoretical and practical implications. At the core of our theory is the notion of a complete reason, which can be viewed as a necessary and sufficient condition for why a decision was made. We show how the complete reason can be used for computing notions such as sufficient reasons (also known as PI-explanations and abductive explanations), how it can be used for determining decision and classifier bias, and how it can be used for evaluating counterfactual statements such as “a decision will stick even if \(\dots\) because \(\dots\).” We present a linear-time algorithm for computing the complete reason behind a decision, assuming the classifier is represented by a Boolean circuit of appropriate form. We then show how the computed complete reason can be used to answer many queries about a decision in linear or polynomial time. We conclude with a case study that illustrates the various notions and techniques we introduced.
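To make these notions concrete, here is a minimal brute-force sketch of computing sufficient reasons (PI-explanations) for a toy Boolean classifier. The classifier, variable names, and exhaustive enumeration are illustrative assumptions, not the paper's method: the paper gives a linear-time algorithm over Boolean circuits of appropriate form, and the complete reason can be viewed as the disjunction of all sufficient reasons.

```python
from itertools import combinations, product

# Toy Boolean classifier (illustrative, not from the paper): admit an
# applicant iff they have good grades AND work experience, OR a strong
# recommendation letter.
def classifier(g, w, r):
    return (g and w) or r

VARS = ["g", "w", "r"]

def decision(assignment):
    return classifier(assignment["g"], assignment["w"], assignment["r"])

def sufficient_reasons(instance):
    """Enumerate the sufficient reasons for a decision: the minimal subsets
    of the instance's characteristics that entail the same decision no
    matter how the remaining features are set."""
    target = decision(instance)
    reasons = []
    for k in range(1, len(VARS) + 1):
        for subset in combinations(VARS, k):
            # keep only minimal subsets: skip supersets of found reasons
            if any(set(reason) <= set(subset) for reason in reasons):
                continue
            free = [v for v in VARS if v not in subset]
            fixed = {v: instance[v] for v in subset}
            if all(decision({**fixed, **dict(zip(free, bits))}) == target
                   for bits in product([False, True], repeat=len(free))):
                reasons.append(subset)
    return reasons

instance = {"g": True, "w": True, "r": False}
print(sufficient_reasons(instance))  # -> [('g', 'w')]
```

For this instance the decision (admission) is explained by good grades together with work experience; the absent recommendation plays no role, which is exactly the kind of information a sufficient reason conveys.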

MSC:

03-XX Mathematical logic and foundations
68-XX Computer science

Software:

JBool; Reluplex
