×

Robustifying sum-product networks. (English) Zbl 1453.62533

Summary: Sum-product networks are a relatively new and increasingly popular family of probabilistic graphical models that allow for marginal inference with polynomial effort. They have been shown to achieve state-of-the-art performance in several tasks involving density estimation. Sum-product networks are typically learned from data; as such, inferences produced with them are prone to be unreliable and overconfident when data is scarce. In this work, we develop the credal sum-product networks, a generalization of sum-product networks that uses set-valued parameters. We present algorithms and complexity results for common inference tasks with this class of models. We also present an approach for assessing the reliability of classifications made with sum-product networks. We apply this approach on benchmark classification tasks as well as a new application in predicting the age of stars. Our experiments show that the use of credal sum-product networks allow us to distinguish between reliable and unreliable classifications with higher accuracy than standard approaches based on (precise) probability values.

MSC:

62H22 Probabilistic graphical models
62H30 Classification and discrimination; cluster analysis (statistical aspects)
62G35 Nonparametric robustness
62P35 Applications of statistics to physics
85A15 Galactic and stellar structure

Software:

UCI-ml

References:

[1] Adel, T.; Balduzzi, D.; Ghodsi, A., Learning the structure of sum-product networks via an SVD-based algorithm, (Proceedings of the Thirty-First Conference on Uncertainty in Artificial Intelligence (2015)), 32-41
[2] Amer, M. R.; Todorovic, S., Sum product networks for activity recognition, IEEE Trans. Pattern Anal. Mach. Intell., 38, 4, 800-813 (2016)
[3] Augustin, T.; Coolen, F. P.A.; De Cooman, G.; Troffaes, M. C.M., Introduction to Imprecise Probabilities (2014), John Wiley and Sons · Zbl 1290.62003
[4] Aurière, M., Stellar Polarimetry with NARVAL, EAS Publ. Ser., vol. 9, 105 (2003)
[5] Basri, G.; Borucki, W. J.; Koch, D., The Kepler mission: a wide-field transit search for terrestrial planets, New Astron. Rev., 49, 7-9, 478-485 (2005)
[6] Boutilier, C.; Friedman, N.; Goldszmidt, M.; Koller, D., Context-specific independence in Bayesian networks, (Proceedings of the Twelfth International Conference on Uncertainty in Artificial Intelligence (1996)), 115-123
[7] Bruntt, H.; Basu, S.; Smalley, B.; Chaplin, W. J.; Verner, G. A.; Bedding, T. R., Accurate fundamental parameters and detailed abundance patterns from spectroscopy of 93 solar-type Kepler targets, Mon. Not. R. Astron. Soc., 423, 1, 122-131 (2012)
[8] Chaplin, W. J.; Basu, S.; Huber, D.; Serenelli, A.; Casagrande, L.; Silva Aguirre, V., Asteroseismic fundamental properties of solar-type stars observed by the NASA Kepler mission, Astrophys. J. Suppl. Ser., 210, 1, 22 (2014)
[9] Chavira, M.; Darwiche, A., Compiling Bayesian networks with local structure, (Proceedings of the Nineteenth International Joint Conference on Artificial Intelligence (2005)), 1306-1312
[10] Cheng, W.-C.; Kok, S.; Pham, H. V.; Chieu, H. L.; Chai, K. M.A., Language modeling with sum-product networks, (Proceedings of the 15th Annual Conference of the International Speech Communication Association (2014)), 2098-2102
[11] Conaty, D.; Mauá, D. D.; de Campos, C. P., Approximation complexity of maximum a posteriori inference in sum-product networks, (Proceedings of the Thirty-Third Conference on Uncertainty in Artificial Intelligence (2017)), 322-331
[12] Cozman, F. G., Credal networks, Artif. Intell., 120, 2, 199-233 (2000) · Zbl 0945.68163
[13] Cozman, F. G., Graphical models for imprecise probabilities, Int. J. Approx. Reason., 39, 2-3, 167-184 (2005) · Zbl 1099.68111
[14] Dalmao, F. B.S.; Pitassi, T., Value elimination: Bayesian inference via backtracking search, (Proceedings of the Uncertainty in Artificial Intelligence (2003)), 20-28
[15] Darwiche, A., A differential approach to inference in Bayesian networks, (Proceedings of the 16th Conference in Uncertainty in Artificial Intelligence (2000)), 123-132
[16] Darwiche, A., A differential approach to inference in Bayesian networks, J. ACM, 50, 3, 280-305 (2003) · Zbl 1325.68226
[17] Darwiche, A., Modeling and Reasoning with Bayesian Networks (2009), Cambridge University Press · Zbl 1231.68003
[18] Darwiche, A.; Provan, G. M., Query DAGs: a practical paradigm for implementing belief-network inference, (Proceedings of the Twelfth Annual Conference on Uncertainty in Artificial Intelligence (1996)), 203-210 · Zbl 0894.68134
[19] de Bock, J.; Antonucci, A.; de Campos, C. P., Global sensitivity analysis for MAP inference in graphical models, (Neural Information Processing Systems (2014)), 2690-2698
[20] de Campos, C. P.; Cozman, F. G., Inference in credal networks through integer programming, (International Symposium on Imprecise Probability: Theories and Applications (2007)), 145-154
[21] Dechter, R., Bucket elimination: a unifying framework for reasoning, Artif. Intell., 113, 1-2, 41-85 (1999) · Zbl 0939.68847
[22] Dennis, A.; Ventura, D., Learning the architecture of sum-product networks using clustering on variables, (Advances in Neural Information Processing Systems, vol. 25 (2012)), 2042-2050
[23] Dennis, A.; Ventura, D., Greedy structure search for sum-product networks, (Proceedings of the 24th International Joint Conference on Artificial Intelligence (2015)), 932-938
[24] Donati, J.-F., ESPaDOnS: an echelle spectropolarimetric device for the observation of stars at CFHT, (Astronomical Society of the Pacific Conference Proceedings. Astronomical Society of the Pacific Conference Proceedings, San Francisco (2003)), 41
[25] Drenick, R. F., Multilinear programming: duality theories, J. Optim. Theory Appl., 81, 2, 421-422 (1994) · Zbl 0810.90116
[26] Gens, R.; Domingos, P., Discriminative learning of sum-product networks, (Advances in Neural Information Processing Systems, vol. 25 (2012)), 3239-3247
[27] Gens, R.; Domingos, P., Learning the structure of sum-product networks, (Proceedings of the Thirtieth International Conference on Machine Learning (2013)), 873-880
[28] Heckerman, D., A tractable inference algorithm for diagnosing multiple diseases, (Proceedings of the Fifth Conference on Uncertainty in Artificial Intelligence (1989)), 163-171 · Zbl 0721.68086
[29] Huntley, N.; Hable, R.; Troffaes, M. C.M., Decision Making, 190-206 (2014), John Wiley and Sons, chap. 8 · Zbl 1298.91075
[30] Koller, D.; Friedman, N., Probabilistic Graphical Models (2009), The MIT Press · Zbl 1183.68483
[31] Korte, B.; Vygen, J., Combinatorial Optimization: Theory and Algorithms, Algorithms and Combinatorics (2012), Springer · Zbl 1237.90001
[32] Kschischang, F. R.; Frey, B. J.; Loeliger, H.-A., Factor graphs and the sum-product algorithm, IEEE Trans. Inf. Theory, 47, 2, 498-519 (2001) · Zbl 0998.68234
[33] Larkin, D.; Dechter, R., Bayesian inference in the presence of determinism, (Proceedings of the Tenth International Workshop on Artificial Intelligence and Statistics (2003))
[34] Lauritzen, S.; Spiegelhalter, D., Local computation with probabilities on graphical structures and their application to expert systems, J. R. Stat. Soc., Ser. B, 50, 2, 157-224 (1988) · Zbl 0684.68106
[35] Lee, S.-W.; Watkins, C.; Zhang, B.-T., Non-parametric Bayesian sum-product networks, (ICML Workshop on Learning Tractable Probabilistic Models, vol. 32 (2014))
[36] Levi, I., The Enterprise of Knowledge (1980), MIT Press: MIT Press London
[37] Lichman, M., UCI machine learning repository (2013)
[38] Llerena, J. V.; Mauá, D. D., On using sum-product networks for multi-label classification, (Proceedings of the 6th Brazilian Conference on Intelligent Systems (2017)), 25-30
[39] Mauá, D. D.; Cozman, F. G.; Conaty, D.; de Campos, C. P., Credal sum-product networks, (Proceedings of the Tenth International Symposium on Imprecise Probability: Theories and Applications (2017)), 205-216
[40] Mauá, D. D.; de Campos, C. P.; Benavoli, A.; Antonucci, A., Probabilistic inference in credal networks: new complexity results, J. Artif. Intell. Res., 50, 603-637 (2014) · Zbl 1366.68329
[41] Mauá, D. D.; de Campos, C. P.; Zaffalon, M., Updating credal networks is approximable in polynomial time, Int. J. Approx. Reason., 53, 8, 1183-1199 (2012) · Zbl 1266.68178
[42] Mauá, D. D.; de Campos, C. P.; Zaffalon, M., On the complexity of solving polytree-shaped limited memory influence diagrams with binary variables, Artif. Intell., 205, 30-38 (2013) · Zbl 1334.68202
[43] Nath, A.; Domingos, P., Learning tractable probabilistic models for fault localization, (Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence (2016)), 1294-1301
[44] Peharz, R.; Geiger, B. C.; Pernkopf, F., Greedy part-wise learning of sum-product networks, (Machine Learning and Knowledge Discovery in Databases. Machine Learning and Knowledge Discovery in Databases, Lect. Notes Artif. Intell., vol. 8189 (2013)), 612-627
[45] Peharz, R.; Gens, R.; Domingos, P., Learning selective sum-product networks, (ICML Workshop on Learning Tractable Probabilistic Models, vol. 32 (2014))
[46] Peharz, R.; Gens, R.; Pernkopf, F.; Domingos, P., On the latent variable interpretation in sum-product networks, IEEE Trans. Pattern Anal. Mach. Intell., 1-14 (2016)
[47] Peharz, R.; Tschiatschek, S.; Pernkopf, F.; Domingos, P., On theoretical properties of sum-product networks, (Proceedings of the 18th International Conference on Artificial Intelligence and Statistics (2015)), 744-752
[48] Poon, H.; Domingos, P., Sum-product networks: a new deep architecture, (Proceedings of the 27th Conference on Uncertainty in Artificial Intelligence (2011)), 337-346
[49] Pronobis, A.; Rao, R. P.N., Learning deep generative spatial models for mobile robots, (Proceedings of the 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (2017)), 755-762
[50] Pronobis, A.; Riccio, F.; Rao, R. P.N., Deep spatial affordance hierarchy: spatial knowledge representation for planning in large-scale environments, (ICAPS 2017 Workshop on Planning and Robotics (2017))
[51] Rahman, T.; Gogate, V., Merging strategies for sum-product networks: from trees to graphs, (Proceedings of the 32nd Conference on Uncertainty in Artificial Intelligence (2016)), 617-626
[52] Rathke, F.; Desana, M.; Schnörr, C., Locally adaptive probabilistic models for global segmentation of pathological OCT scans, (Proceedings of the International Conference on Medical Image Computing and Computer Assisted Intervention (2017)), 177-184
[53] Rooshenas, A.; Lowd, D., Learning sum-product networks with direct and indirect variable interactions, (Proceedings of the 31st International Conference on Machine Learning (2014)), 710-718
[54] Roth, D., On the hardness of approximate reasoning, Artif. Intell., 82, 1, 273-302 (1996) · Zbl 1506.68143
[55] Sang, T.; Beame, P.; Kautz, H., Performing Bayesian inference by weighted model counting, (Proceedings of the 20th AAAI Conference (2005)), 475-481
[56] Sanner, S.; McAllester, D. A., Affine algebraic decision diagrams and their application to structured probabilistic inference, (Proceedings of the 19th International Joint Conference on Artificial Intelligence (2005)), 1384-1390
[57] Sguerra, B. M.; Cozman, F. G., Image classification using sum-product networks for autonomous flight of micro aerial vehicles, (Proceedings of the Fifth Brazilian Conference on Intelligent Systems (2016)), 139-144
[58] Sousa, S. G.; Santos, N. C.; Israelian, G.; Mayor, M.; Monteiro, M. J.P. F.G., A new code for automatic determination of equivalent widths: automatic routine for line equivalent widths in stellar spectra (ARES), Astron. Astrophys., 469, 783-791 (2007)
[59] Vergari, A.; Di Mauro, N.; Esposito, F., Simplifying, regularizing and strengthening sum-product network structure learning, (Proceedings of the Joint European Conference on Machine Learning and Knowledge Discovery in Databases (2015)), 343-358
[60] Vomlel, J., Exploiting functional dependence in Bayesian network inference, (Proceedings of the 18th Conference in Uncertainty in Artificial Intelligence (2002)), 528-535
[61] Vomlel, J.; Tichavský, P., Probabilistic inference with noisy-threshold models based on a CP tensor decomposition, Int. J. Approx. Reason., 55, 4, 1072-1092 (2014) · Zbl 1390.68681
[62] Wainwright, M. J.; Jordan, M. I., Graphical models, exponential families, and variational inference, Found. Trends Mach. Learn., 1, 1-305 (2008) · Zbl 1193.62107
[63] Walley, P., Statistical Reasoning with Imprecise Probabilities (1991), Chapman and Hall: Chapman and Hall London · Zbl 0732.62004
[64] Yedidia, J. S.; Freeman, W. T.; Weiss, Y., Constructing free-energy approximations and generalized belief propagation algorithms, IEEE Trans. Inf. Theory, 51, 7, 2282-2312 (2005) · Zbl 1283.94023
[65] Zaffalon, M., The naive credal classifier, J. Stat. Plan. Inference, 105, 1, 5-21 (2002) · Zbl 0992.62057
[66] Zaffalon, M.; Corani, G.; Mauá, D. D., Evaluating credal classifiers by utility-discounted predictive accuracy, Int. J. Approx. Reason., 53, 8, 1282-1301 (2012) · Zbl 1266.62042
[67] Zhang, N. L.; Poole, D., Exploiting causal independence in Bayesian network inference, J. Artif. Intell. Res., 5, 301-328 (1996) · Zbl 0900.68384
[68] Zhao, H.; Adel, T.; Gordon, G.; Amos, B., Collapsed variational inference for sum-product networks, (Proceedings of the 33rd International Conference on Machine Learning. Proceedings of the 33rd International Conference on Machine Learning, Proc. Mach. Learn. Res., vol. 48 (2016)), 1310-1318
[69] Zhao, H.; Melibari, M.; Poupart, P., On the relationship between sum-product networks and Bayesian networks, (Proceedings of the 32nd International Conference on Machine Learning (2015)), 116-124
[70] Zheng, K.; Pronobis, A.; Rao, R. P.N., Learning graph-structured sum-product networks for probabilistic semantic maps, (Proceedings of the 32nd AAAI Conference on Artificial Intelligence (2018)), 4547-4555
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.