A probabilistic approximate logic for neuro-symbolic learning and reasoning. (English) Zbl 1523.68095

Summary: As witnessed by recent advances in deep learning technologies, neural network models of very high complexity have been successfully applied in many data-rich domains. Challenges remain, however, if the amount of training data is severely limited, which is often the case due to the cost of acquiring such data or due to interest in systems that are constantly evolving, thereby imposing natural limits on how much data can be collected. The core hypothesis explored in this paper is that data (to some degree) can be substituted by domain knowledge, not only addressing the limited data problem but also offering potential improvements in data-rich settings. For the representation of suitable domain theories, we propose Probabilistic Approximate Logic (PALO) to deal with the natural uncertainty associated with such representations and also to serve as a foundation for a new class of neuro-symbolic architectures, in which both neural and symbolic computations can be peacefully and synergistically integrated. Utilizing TensorFlow and Maude as neural and symbolic frameworks, respectively, we discuss our prototypical implementation of PALO in what we call the Logical Imagination Engine (LIME). By means of a small toy example, we convey a glimpse of its capabilities, but we also briefly discuss some real-world applications and how it may serve as a prototypical framework to explore a broader range of neuro-symbolic strategies in the future.

MSC:

68T27 Logic in artificial intelligence
03B48 Probability and inductive logic
68T07 Artificial neural networks and deep learning
68T37 Reasoning under uncertainty in the context of artificial intelligence

References:

[1] Abadi, M.; Barham, P.; Chen, J.; Chen, Z.; Davis, A.; Dean, J.; Devin, M.; Ghemawat, S.; Irving, G.; Isard, M.; Kudlur, M.; Levenberg, J.; Monga, R.; Moore, S.; Murray, D. G.; Steiner, B.; Tucker, P.; Vasudevan, V.; Warden, P.; Wicke, M.; Yu, Y.; Zheng, X., TensorFlow: a system for large-scale machine learning, (Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation, OSDI’16 (2016), USENIX Association: USENIX Association Berkeley, CA, USA), 265-283
[2] Bacchus, F., Representing and Reasoning with Probabilistic Knowledge: A Logical Approach to Probabilities (1990), MIT Press: MIT Press Cambridge, MA, USA
[3] Bach, S. H.; Broecheler, M.; Huang, B.; Getoor, L., Hinge-loss Markov random fields and probabilistic soft logic, J. Mach. Learn. Res., 18, 1, 3846-3912 (Jan. 2017) · Zbl 1435.68252
[4] Badreddine, S.; d’Avila Garcez, A.; Serafini, L.; Spranger, M., Logic tensor networks (2021), CoRR
[5] Bahri, Y.; Kadmon, J.; Pennington, J.; Schoenholz, S. S.; Sohl-Dickstein, J.; Ganguli, S., Statistical mechanics of deep learning, Annu. Rev. Condens. Matter Phys., 11, 1, 501-528 (2020)
[6] Besold, T. R.; d’Avila Garcez, A. S.; Bader, S.; Bowman, H.; Domingos, P. M.; Hitzler, P.; Kühnberger, K.-U.; Lamb, L. C.; Lowd, D.; Lima, P. M.V.; de Penning, L.; Pinkas, G.; Poon, H.; Zaverucha, G., Neural-symbolic learning and reasoning: a survey and interpretation (2017), CoRR
[7] Bishop, C. M., Pattern Recognition and Machine Learning (Information Science and Statistics) (2007), Springer
[8] Bouhoula, A.; Jouannaud, J.-P.; Meseguer, J., Specification and proof in membership equational logic, (Bidoit, M.; Dauchet, M., TAPSOFT’97: Theory and Practice of Software Development, 7th International Joint Conference CAAP/FASE, Lille, France, April 14-18, 1997, Proceedings, Lecture Notes in Computer Science, vol. 1214 (1997), Springer), 67-92 · Zbl 0938.68057
[9] Bundy, A., Incidence calculus: a mechanism for probabilistic reasoning, J. Autom. Reason., 1, 3, 263-283 (Jan. 1985) · Zbl 0615.68067
[10] Chollet, F., Keras (2015)
[11] Choromanska, A.; Henaff, M.; Mathieu, M.; Arous, G. B.; LeCun, Y., The loss surfaces of multilayer networks, (Lebanon, G.; Vishwanathan, S. V.N., Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, San Diego, California, USA, 09-12 May 2015, Proceedings of Machine Learning Research, vol. 38 (2015), PMLR), 192-204
[12] Clavel, M.; Durán, F.; Eker, S.; Lincoln, P.; Martí-Oliet, N.; Meseguer, J.; Talcott, C., All About Maude - a High-Performance Logical Framework: How to Specify, Program and Verify Systems in Rewriting Logic (2007), Springer-Verlag: Springer-Verlag Berlin, Heidelberg · Zbl 1115.68046
[13] Dauphin, Y. N.; Pascanu, R.; Gulcehre, C.; Cho, K.; Ganguli, S.; Bengio, Y., Identifying and attacking the saddle point problem in high-dimensional non-convex optimization, (Ghahramani, Z.; Welling, M.; Cortes, C.; Lawrence, N.; Weinberger, K. Q., Advances in Neural Information Processing Systems, vol. 27 (2014), Curran Associates, Inc.), 2933-2941
[14] De Raedt, L.; Kimmig, A.; Toivonen, H., ProbLog: a probabilistic Prolog and its application in link discovery, (Proceedings of the 20th International Joint Conference on Artificial Intelligence, IJCAI’07 (2007), Morgan Kaufmann Publishers Inc.: Morgan Kaufmann Publishers Inc. San Francisco, CA, USA), 2468-2473
[15] Erhan, D.; Bengio, Y.; Courville, A. C.; Manzagol, P.-A.; Vincent, P.; Bengio, S., Why does unsupervised pre-training help deep learning?, J. Mach. Learn. Res., 11, 625-660 (2010) · Zbl 1242.68219
[16] Esteva, F.; Godo, L., Putting together Lukasiewicz and Product logics, Mathw. Soft Comput., 6, 2-3, 219-234 (1999) · Zbl 0953.03030
[17] França, M. V.M.; Zaverucha, G.; d’Avila Garcez, A. S., Fast relational learning using bottom clause propositionalization with artificial neural networks, Mach. Learn., 94, 1, 81-104 (2014)
[18] Gaines, B., Fuzzy and probability uncertainty logics, Inf. Control, 38, 2, 154-169 (1978) · Zbl 0391.03015
[19] Gens, R.; Domingos, P. M., Deep symmetry networks, (Ghahramani, Z.; Welling, M.; Cortes, C.; Lawrence, N. D.; Weinberger, K. Q., Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, December 8-13 2014, Montreal, Quebec, Canada (2014)), 2537-2545
[20] Ghosh, S.; Steiner, W.; Denker, G.; Lincoln, P., Probabilistic modeling of failure dependencies using Markov logic networks, (2013 IEEE 19th Pacific Rim International Symposium on Dependable Computing (Dec 2013)), 162-171
[21] Goodfellow, I. J.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y., Generative adversarial nets, (Proceedings of the 27th International Conference on Neural Information Processing Systems - Volume 2, NIPS’14 (2014), MIT Press: MIT Press Cambridge, MA, USA), 2672-2680
[22] Graves, A.; Bellemare, M. G.; Menick, J.; Munos, R.; Kavukcuoglu, K., Automated curriculum learning for neural networks, (Proceedings of the 34th International Conference on Machine Learning, ICML 2017, Sydney, NSW, Australia, 6-11 August 2017 (2017)), 1311-1320
[23] Grefenstette, E., Towards a formal distributional semantics: simulating logical calculi with tensors, (Diab, M. T.; Baldwin, T.; Baroni, M., Proceedings of the Second Joint Conference on Lexical and Computational Semantics, *SEM 2013, June 13-14, 2013, Atlanta, Georgia, USA (2013), Association for Computational Linguistics), 1-10 · Zbl 1301.00062
[24] Guha, R. V., Towards a model theory for distributed representations (2014), CoRR
[25] Van Haaren, J.; Van den Broeck, G.; Meert, W.; Davis, J., Lifted generative learning of Markov logic networks, Mach. Learn., 103, 1, 27-55 (Apr. 2016) · Zbl 1357.68187
[26] Halpern, J. Y., An analysis of first-order logics of probability, (Proceedings of the 11th International Joint Conference on Artificial Intelligence - Volume 2, IJCAI’89 (1989), Morgan Kaufmann Publishers Inc.: Morgan Kaufmann Publishers Inc. San Francisco, CA, USA), 1375-1381 · Zbl 0719.68060
[27] Huisman, M.; van Rijn, J. N.; Plaat, A., A survey of deep meta-learning (2020)
[28] Hájek, P.; Godo, L.; Esteva, F., A complete many-valued logic with product conjunction, Arch. Math. Log., 35, 191-208 (1996) · Zbl 0848.03005
[29] Jin, C.; Liu, L. T.; Ge, R.; Jordan, M. I., On the local minima of the empirical risk, (Bengio, S.; Wallach, H.; Larochelle, H.; Grauman, K.; Cesa-Bianchi, N.; Garnett, R., Advances in Neural Information Processing Systems, vol. 31 (2018), Curran Associates, Inc.), 4896-4905
[30] Kawaguchi, K., Deep learning without poor local minima, (Lee, D.; Sugiyama, M.; Luxburg, U.; Guyon, I.; Garnett, R., Advances in Neural Information Processing Systems, vol. 29 (2016), Curran Associates, Inc.), 586-594
[31] Kingma, D. P.; Ba, J., Adam: a method for stochastic optimization (2014), CoRR
[32] Kingma, D. P.; Welling, M., Auto-encoding variational Bayes, (2nd International Conference on Learning Representations. 2nd International Conference on Learning Representations, ICLR 2014, Banff, AB, Canada, April 14-16, 2014, Conference Track Proceedings (2014))
[33] LeCun, Y.; Bengio, Y.; Hinton, G. E., Deep learning, Nature, 521, 7553, 436-444 (2015)
[34] LeCun, Y.; Cortes, C., The MNIST handwritten digit database (2010)
[35] Lee, J.; Wang, Y., On the semantic relationship between probabilistic soft logic and Markov logic (2016), CoRR
[36] Li, C.; Chen, C.; Carlson, D.; Carin, L., Preconditioned stochastic gradient Langevin dynamics for deep neural networks, (Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, AAAI’16 (2016), AAAI Press), 1788-1794
[37] Li, H.; Xu, Z.; Taylor, G.; Studer, C.; Goldstein, T., Visualizing the loss landscape of neural nets, (Bengio, S.; Wallach, H.; Larochelle, H.; Grauman, K.; Cesa-Bianchi, N.; Garnett, R., Advances in Neural Information Processing Systems, vol. 31 (2018), Curran Associates, Inc.), 6389-6399
[38] Ma, Y.-A.; Chen, T.; Fox, E., A complete recipe for stochastic gradient MCMC, (Cortes, C.; Lawrence, N.; Lee, D.; Sugiyama, M.; Garnett, R., Advances in Neural Information Processing Systems, vol. 28 (2015), Curran Associates, Inc.), 2917-2925
[39] Mandt, S.; Hoffman, M. D.; Blei, D. M., Stochastic gradient descent as approximate Bayesian inference, J. Mach. Learn. Res., 18, 1, 4873-4907 (Jan. 2017) · Zbl 1442.62055
[40] Manhaeve, R.; Dumancic, S.; Kimmig, A.; Demeester, T.; De Raedt, L., DeepProbLog: neural probabilistic logic programming, (Beuls, K.; Bogaerts, B.; Bontempi, G.; Geurts, P.; Harley, N.; Lebichot, B.; Lenaerts, T.; Louppe, G.; Eecke, P. V., Proceedings of the 31st Benelux Conference on Artificial Intelligence (BNAIC 2019) and the 28th Belgian Dutch Conference on Machine Learning (Benelearn 2019), Brussels, Belgium, November 6-8, 2019, CEUR Workshop Proceedings, vol. 2491 (2019))
[41] Marra, G.; Diligenti, M.; Giannini, F.; Gori, M.; Maggini, M., Relational neural machines, (Giacomo, G. D.; Catalá, A.; Dilkina, B.; Milano, M.; Barro, S.; Bugarín, A.; Lang, J., ECAI 2020 - 24th European Conference on Artificial Intelligence, 29 August-8 September 2020, Santiago de Compostela, Spain, Including 10th Conference on Prestigious Applications of Artificial Intelligence (PAIS 2020), Frontiers in Artificial Intelligence and Applications, vol. 325 (2020), IOS Press), 1340-1347 · Zbl 1456.68006
[42] Martí-Oliet, N.; Meseguer, J., Rewriting logic as a logical and semantic framework, Electron. Notes Theor. Comput. Sci., 4, 190-225 (1996) · Zbl 0912.68096
[43] Meseguer, J., Conditional rewriting logic as a unified model of concurrency, Theor. Comput. Sci., 96, 1, 73-155 (Apr. 1992) · Zbl 0758.68043
[44] Poon, H.; Domingos, P., Sum-product networks: a new deep architecture, (Proceedings of the Twenty-Seventh Conference on Uncertainty in Artificial Intelligence, UAI’11 (2011), AUAI Press: AUAI Press Arlington, Virginia, United States), 337-346
[45] Probabilistic Consistency Engine (PCE)
[46] Recursion Pharmaceuticals, Cellsignal: disentangling biological signal from experimental noise in cellular images (2019)
[47] Recursion Pharmaceuticals, Recursion cellular image classification (2019)
[48] Reichenbach, H., The Theory of Probability (1949), University of California Press · Zbl 0038.28604
[49] Rendle, S., Factorization machines, (Proceedings of the 2010 IEEE International Conference on Data Mining, ICDM ’10 (2010), IEEE Computer Society: IEEE Computer Society Washington, DC, USA), 995-1000
[50] Richardson, M.; Domingos, P., Markov logic networks, Mach. Learn., 62, 1-2, 107-136 (Feb. 2006) · Zbl 1470.68221
[51] Rocktäschel, T.; Riedel, S., End-to-end differentiable proving, (Guyon, I.; Luxburg, U. V.; Bengio, S.; Wallach, H.; Fergus, R.; Vishwanathan, S.; Garnett, R., Advances in Neural Information Processing Systems, vol. 30 (2017), Curran Associates, Inc.)
[52] Rocktäschel, T.; Singh, S.; Bosnjak, M.; Riedel, S., Low-dimensional embeddings of logic, (ACL 2014 Workshop on Semantic Parsing (SP14) (2014))
[53] Rocktäschel, T.; Singh, S.; Riedel, S., Injecting logical background knowledge into embeddings for relation extraction, (Mihalcea, R.; Chai, J. Y.; Sarkar, A., NAACL HLT 2015, The 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Denver, Colorado, USA, May 31 - June 5, 2015 (2015), The Association for Computational Linguistics), 1119-1129
[54] Ruder, S., An overview of gradient descent optimization algorithms (2016), CoRR
[55] Serafini, L.; d’Avila Garcez, A. S., Logic tensor networks: deep learning and logical reasoning from data and knowledge (2016), CoRR · Zbl 1430.68317
[56] Serafini, L.; Donadello, I.; d’Avila Garcez, A., Learning and reasoning in logic tensor networks: theory and application to semantic image interpretation, (Proceedings of the Symposium on Applied Computing, SAC ’17 (2017), ACM: ACM New York, NY, USA), 125-130
[57] Shorten, C.; Khoshgoftaar, T. M., A survey on image data augmentation for deep learning, Big Data, 6, 60 (2019)
[58] Sikka, K.; Silberfarb, A.; Byrnes, J.; Sur, I.; Chow, E.; Divakaran, A.; Rohwer, R., Deep adaptive semantic logic (DASL): compiling declarative knowledge into deep neural networks (2020), CoRR
[59] Socher, R.; Chen, D.; Manning, C. D.; Ng, A. Y., Reasoning with neural tensor networks for knowledge base completion, (Burges, C. J.C.; Bottou, L.; Ghahramani, Z.; Weinberger, K. Q., Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013, Proceedings of a Meeting Held December 5-8, 2013, Lake Tahoe, Nevada, United States (2013)), 926-934
[60] Stehr, M.-O., CINNI - a generic calculus of explicit substitutions and its application to λ-, ς- and π-calculi, Electron. Notes Theor. Comput. Sci., 36, 70-92 (2000)
[61] Stehr, M.-O.; Avar, P.; Korte, A. R.; Parvin, L.; Sahab, Z. J.; Bunin, D. I.; Knapp, M.; Nishita, D.; Poggio, A.; Talcott, C. L.; Davis, B. M.; Morton, C. A.; Sevinsky, C. J.; Zavodszky, M. I.; Vertes, A., Learning causality: synthesis of large-scale causal networks from high-dimensional time series data (2019), CoRR
[62] Stehr, M.-O.; Kim, M.; Talcott, C. L.; Knapp, M.; Vertes, A., Probabilistic approximate logic and its implementation in the logical imagination engine (2019), CoRR
[63] Stehr, M.-O.; Meseguer, J., Pure type systems in rewriting logic: specifying typed higher-order languages in a first-order logical framework, (Owe, O.; Krogdahl, S.; Lyche, T., From Object-Orientation to Formal Methods, Essays in Memory of Ole-Johan Dahl, Lecture Notes in Computer Science, vol. 2635 (2004), Springer), 334-375 · Zbl 1278.03066
[64] The Maude System · Zbl 1038.68559
[65] The Yices SMT Solver
[66] Vertes, A.; Arul, A.; Avar, P.; Korte, A. R.; Parvin, L.; Sahab, Z. J.; Bunin, D. I.; Knapp, M.; Nishita, D.; Poggio, A.; Stehr, M.-O.; Talcott, C. L.; Davis, B. M.; Morton, C. A.; Sevinsky, C. J.; Zavodszky, M. I., Transcriptional response of SK-N-AS cells to methamidophos (extended abstract), (Bortolussi, L.; Sanguinetti, G., Computational Methods in Systems Biology - 17th International Conference, CMSB 2019, Trieste, Italy, September 18-20, 2019, Proceedings, Lecture Notes in Computer Science, vol. 11773 (2019), Springer), 368-372
[67] Vertes, A.; Arul, A.-B.; Avar, P.; Korte, A. R.; Li, H.; Nemes, P.; Parvin, L.; Stopka, S.; Hwang, S.; Sahab, Z. J.; Zhang, L.; Bunin, D. I.; Knapp, M.; Poggio, A.; Stehr, M.-O.; Talcott, C. L.; Davis, B. M.; Dinn, S. R.; Morton, C. A.; Sevinsky, C. J.; Zavodszky, M. I., Inferring mechanism of action of an unknown compound from time series omics data, (Ceska, M.; Safránek, D., Computational Methods in Systems Biology - 16th International Conference, CMSB 2018, Brno, Czech Republic, September 12-14, 2018, Proceedings, Lecture Notes in Computer Science, vol. 11095 (2018), Springer), 238-255 · Zbl 1397.92236
[68] Vincent, P.; Larochelle, H.; Lajoie, I.; Bengio, Y.; Manzagol, P.-A., Stacked denoising autoencoders: learning useful representations in a deep network with a local denoising criterion, J. Mach. Learn. Res., 11, 3371-3408 (2010) · Zbl 1242.68256
[69] Wang, H.; Yeung, D.-Y., A survey on Bayesian deep learning, ACM Comput. Surv., 53, 5 (Sept. 2020)
[70] Welling, M.; Teh, Y. W., Bayesian learning via stochastic gradient Langevin dynamics, (Proceedings of the 28th International Conference on Machine Learning, ICML’11 (2011), Omnipress: Omnipress USA), 681-688
[71] Xie, J.; Girshick, R.; Farhadi, A., Unsupervised deep embedding for clustering analysis, (Balcan, M. F.; Weinberger, K. Q., Proceedings of the 33rd International Conference on Machine Learning, New York, New York, USA, 20-22 Jun 2016, Proceedings of Machine Learning Research, vol. 48 (2016), PMLR), 478-487
[72] Zhang, C.; Bengio, S.; Hardt, M.; Recht, B.; Vinyals, O., Understanding deep learning requires rethinking generalization, (ICLR (2017))