
A survey of model learning techniques for recurrent neural networks. (English) Zbl 1524.68309

Jansen, Nils (ed.) et al., A journey from process algebra via timed automata to model learning. Essays dedicated to Frits Vaandrager on the occasion of his 60th birthday. Cham: Springer. Lect. Notes Comput. Sci. 13560, 81-97 (2022).
Summary: Ensuring the correctness and reliability of deep neural networks is a challenge. Suitable formal analysis and verification techniques have yet to be developed. One promising approach towards this goal is model learning, which seeks to derive surrogate models of the underlying neural network in a model class that permits sophisticated analysis and verification techniques. This paper surveys several existing model learning approaches that infer finite-state automata and context-free grammars from Recurrent Neural Networks, an essential class of deep neural networks for sequential data. Most of these methods rely on Angluin’s approach for learning finite automata but implement different ways of checking the equivalence of a learned model with the neural network. Our paper presents these distinct techniques in a unified language and discusses their strengths and weaknesses. Furthermore, we survey model learning techniques that follow a novel trend in explainable artificial intelligence and learn models in the form of formal grammars.
For the entire collection see [Zbl 1515.68037].
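To make the setting concrete, the following is a minimal, self-contained Python sketch (not taken from the surveyed paper) of how a trained RNN acceptor can be wrapped as the two oracles an Angluin-style (L*) learner needs: a membership oracle that simply runs a word through the network, and an equivalence oracle that can only be approximated, here by random sampling in the spirit of the PAC-style checks used in some of the surveyed works. The toy stand-in network and the names rnn_accepts, membership_query and equivalence_query are illustrative assumptions, not part of any of the cited implementations.

import random


def rnn_accepts(word: str) -> bool:
    """Toy stand-in for a trained binary RNN acceptor over the alphabet {a, b}.

    Here it 'recognises' words with an even number of a's; in practice this
    would run the actual network on an encoding of the word and threshold
    its output.
    """
    return word.count("a") % 2 == 0


def membership_query(word: str) -> bool:
    """Membership oracle: ask the network whether it accepts the word."""
    return rnn_accepts(word)


def equivalence_query(hypothesis_accepts, alphabet=("a", "b"),
                      samples=2000, max_len=12, seed=0):
    """Sampling-based equivalence check between a hypothesis and the RNN.

    Returns None if no disagreement is found on `samples` random words,
    otherwise a counterexample word for the learner to refine on. This gives
    only a statistical (PAC-like) guarantee, not a proof of equivalence.
    """
    rng = random.Random(seed)
    for _ in range(samples):
        length = rng.randint(0, max_len)
        word = "".join(rng.choice(alphabet) for _ in range(length))
        if hypothesis_accepts(word) != rnn_accepts(word):
            return word  # counterexample handed back to the learner
    return None


if __name__ == "__main__":
    # A deliberately wrong hypothesis ("accept everything") shows the
    # equivalence oracle returning a counterexample.
    print("counterexample:", equivalence_query(lambda w: True))

The surveyed approaches differ mainly in how this equivalence step is realised (e.g. statistical sampling versus abstraction-based comparison of the hypothesis with the network's state space); the membership side is essentially always a forward pass through the RNN as above.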

MSC:

68T07 Artificial neural networks and deep learning
68Q45 Formal languages and automata
