
A domain-theoretic framework for robustness analysis of neural networks. (English) Zbl 07813326

Summary: A domain-theoretic framework is presented for validated robustness analysis of neural networks. First, the global robustness of a general class of networks is analyzed. Then, using the fact that Edalat’s domain-theoretic \(L\)-derivative coincides with Clarke’s generalized gradient, the framework is extended to attack-agnostic local robustness analysis. The proposed framework is ideal for designing algorithms which are correct by construction. This claim is exemplified by developing a validated algorithm for estimating the Lipschitz constant of feedforward regressors. The completeness of the algorithm is proved over differentiable networks, and also over general-position ReLU networks. Computability results are obtained within the framework of effectively given domains. Using the proposed domain model, differentiable and non-differentiable networks can be analyzed uniformly. The validated algorithm is implemented using arbitrary-precision interval arithmetic, and the results of some experiments are presented. The software implementation is truly validated, as it also accounts for floating-point errors.
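For a locally Lipschitz map \(f\), Clarke’s generalized gradient at \(x\) is the convex hull \(\partial f(x) = \operatorname{co}\{\lim_i \nabla f(x_i) : x_i \to x,\ f \text{ differentiable at } x_i\}\); for ReLU on an interval containing 0 this yields the full interval \([0,1]\). The following minimal Python sketch illustrates the flavor of validated Lipschitz estimation on a one-hidden-layer ReLU regressor: it encloses the Clarke Jacobian over an input box with naive interval arithmetic and returns a guaranteed upper bound on the sup-norm Lipschitz constant. This is an illustration only, not the paper’s algorithm: exact rationals stand in for the outward-rounded arbitrary-precision intervals (e.g. MPFI) that a truly validated implementation needs, and the network and helper names are hypothetical.

```python
# Illustrative sketch (not the paper's implementation): bound the Lipschitz
# constant of f(x) = w2 . relu(W1 x + b1) over a box by enclosing the Clarke
# Jacobian with naive interval arithmetic. Exact rationals avoid rounding;
# a validated implementation would use outward-rounded intervals instead.
from fractions import Fraction as F

def i_add(a, b):
    # Interval addition: [a0, a1] + [b0, b1].
    return (a[0] + b[0], a[1] + b[1])

def i_mul(a, b):
    # Interval multiplication: min/max over the four endpoint products.
    ps = [a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1]]
    return (min(ps), max(ps))

def relu_deriv(z):
    # Enclosure of the Clarke subdifferential of ReLU on the interval z.
    lo, hi = z
    if lo > 0:
        return (F(1), F(1))
    if hi < 0:
        return (F(0), F(0))
    return (F(0), F(1))  # 0 lies in z: full generalized gradient [0, 1]

def lipschitz_bound(W1, b1, w2, box):
    # Interval pre-activations z_j = sum_i W1[j][i] * x_i + b1[j] over the box.
    acts = []
    for j, row in enumerate(W1):
        z = (F(b1[j]), F(b1[j]))
        for i, x in enumerate(box):
            c = F(row[i])
            z = i_add(z, i_mul((c, c), x))
        acts.append(relu_deriv(z))
    # Enclose each partial derivative df/dx_i = sum_j w2[j] * s_j * W1[j][i];
    # the L1 norm of the gradient bounds the sup-norm Lipschitz constant.
    L = F(0)
    for i in range(len(box)):
        g = (F(0), F(0))
        for j in range(len(W1)):
            c = F(w2[j]) * F(W1[j][i])
            g = i_add(g, i_mul((c, c), acts[j]))
        L += max(abs(g[0]), abs(g[1]))
    return L

# f(x1, x2) = relu(x1 - x2) + relu(x1 + x2) on the box [-1, 1]^2.
W1 = [[1, -1], [1, 1]]
b1 = [0, 0]
w2 = [1, 1]
box = [(F(-1), F(1)), (F(-1), F(1))]
print(lipschitz_bound(W1, b1, w2, box))  # prints 3 (exact constant is 2)
```

On this example the sketch returns 3 while the exact Lipschitz constant is 2: interval enclosures are sound but conservative, and a practical implementation would tighten the bound, e.g. by subdividing the input box.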

MSC:

68-XX Computer science

Software:

SHRAD; AI2; MPFI; POPQORN; LipBaB

References:

[1] Abramsky, S. and Jung, A. (1994). Domain theory. In: Abramsky, S., Gabbay, D. M. and Maibaum, T. S. E. (eds.) Handbook of Logic in Computer Science, vol. 3, Oxford, Clarendon Press, 1-168.
[2] Abramsky, S. (1990). Abstract interpretation, logical relations, and Kan extensions. Journal of Logic and Computation 1 (1) 5-40. · Zbl 0727.03020
[3] Albiac, F. and Kalton, N. J. (2006). Topics in Banach Space Theory, New York, Springer. · Zbl 1094.46002
[4] Araujo, A., Négrevergne, B., Chevaleyre, Y. and Atif, J. (2021). On Lipschitz regularization of convolutional layers using Toeplitz matrix theory. In: AAAI.
[5] Bartlett, P. L., Long, P. M., Lugosi, G. and Tsigler, A. (2020). Benign overfitting in linear regression. Proceedings of the National Academy of Sciences 117 (48) 30063-30070. doi:10.1073/pnas.1907378117. · Zbl 1485.62085
[6] Belkin, M. (2021). Fit without fear: Remarkable mathematical phenomena of deep learning through the prism of interpolation. Acta Numerica 30, 203-248. doi:10.1017/S0962492921000039.
[7] Bhowmick, A., D’Souza, M. and Raghavan, G. S. (2021). LipBaB: Computing exact Lipschitz constant of ReLU networks. In: Farkaš, I., Masulli, P., Otte, S. and Wermter, S. (eds.) Artificial Neural Networks and Machine Learning - ICANN 2021, Cham, Springer International Publishing, 151-162. doi:10.1007/978-3-030-86380-7_13.
[8] Burn, G. L., Hankin, C. and Abramsky, S. (1986). Strictness analysis for higher-order functions. Science of Computer Programming 7, 249-278. · Zbl 0603.68013
[9] Carlini, N. and Wagner, D. A. (2017). Towards evaluating the robustness of neural networks. In: 2017 IEEE Symposium on Security and Privacy, SP 2017, San Jose, CA, USA, May 22-26, 2017, IEEE Computer Society, 39-57. doi:10.1109/SP.2017.49.
[10] Chaudhuri, S., Gulwani, S. and Lublinerman, R. (2012). Continuity and robustness of programs. Communications of the ACM 55 (8) 107-115.
[11] Chaudhuri, S., Gulwani, S., Lublinerman, R. and Navidpour, S. (2011). Proving programs robust. In: Proceedings of the 19th ACM SIGSOFT Symposium and the 13th European Conference on Foundations of Software Engineering. ESEC/FSE’11, Association for Computing Machinery, 102-112.
[12] Chen, T., Lasserre, J. B., Magron, V. and Pauwels, E. (2020). Semialgebraic optimization for Lipschitz constants of ReLU networks. In: Larochelle, H., Ranzato, M., Hadsell, R., Balcan, M. and Lin, H. (eds.) Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December 6-12, 2020, virtual.
[13] Clarke, F. H., Ledyaev, Y. S., Stern, R. J. and Wolenski, P. R. (1998). Nonsmooth Analysis and Control Theory, New York, Springer. · Zbl 1047.49500
[14] Clarke, F. H. (1990). Optimization and Nonsmooth Analysis, Philadelphia, Classics in Applied Mathematics, Society for Industrial and Applied Mathematics. · Zbl 0696.49002
[15] Cousot, P. and Cousot, R. (1977). Abstract interpretation: A unified lattice model for static analysis of programs by construction or approximation of fixpoints. In: Proceedings of the 4th ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages. POPL’77, 238-252.
[16] Di Gianantonio, P. (1996). Real number computability and domain theory. Information and Computation 127 (1) 11-25. · Zbl 0856.68080
[17] Di Gianantonio, P. and Edalat, A. (2013). A language for differentiable functions. In: Pfenning, F. (ed.) Foundations of Software Science and Computation Structures, Berlin, Heidelberg, Springer Berlin Heidelberg, 337-352. doi:10.1007/978-3-642-37075-5_22. · Zbl 1260.68073
[18] Dietterich, T. G. (2017). Steps toward robust artificial intelligence. AI Magazine 38 (3) 3-24. doi:10.1609/aimag.v38i3.2756.
[19] Dragičević, T., Wheeler, P. and Blaabjerg, F. (2019). Artificial intelligence aided automated design for reliability of power electronic systems. IEEE Transactions on Power Electronics 34 (8) 7161-7171. doi:10.1109/TPEL.2018.2883947.
[20] Edalat, A. (1995a). Domain theory in learning processes. Electronic Notes in Theoretical Computer Science 1. MFPS XI, Mathematical Foundations of Programming Semantics, Eleventh Annual Conference, 114-131.
[21] Edalat, A. (1995b). Dynamical systems, measures and fractals via domain theory. Information and Computation 120 (1) 32-48. · Zbl 0834.58029
[22] Edalat, A., Lieutier, A. and Pattinson, D. (2013). A computational model for multi-variable differential calculus. Information and Computation 224, 23-45.
[23] Edalat, A. and Pattinson, D. (2007). A domain-theoretic account of Picard’s theorem. LMS Journal of Computation and Mathematics 10, 83-118. · Zbl 1112.65065
[24] Edalat, A. and Pattinson, D. (2005). Inverse and implicit functions in domain theory. In: Proceedings of the 20th Annual IEEE Symposium on Logic in Computer Science, IEEE Computer Society, pp. 417-426.
[25] Edalat, A. and Sünderhauf, P. (1999). A domain theoretic approach to computability on the real line. Theoretical Computer Science 210, 73-98. · Zbl 0912.68031
[26] Edalat, A. (2008). A continuous derivative for real-valued functions. In: New Computational Paradigms: Changing Conceptions of What is Computable, Springer, 493-519. · Zbl 1149.46035
[27] Edalat, A. (1997). Domains for computation in mathematics, physics and exact real arithmetic. Bulletin of Symbolic Logic 3 (4) 401-452. · Zbl 0946.03055
[28] Edalat, A. and Escardó, M. H. (2000). Integration in real PCF. Information and Computation 160 (1-2) 128-166. · Zbl 1005.03035
[29] Edalat, A., Farjudian, A., Mohammadian, M. and Pattinson, D. (2020). Domain theoretic second-order Euler’s method for solving initial value problems. Electronic Notes in Theoretical Computer Science 352. The 36th Mathematical Foundations of Programming Semantics Conference, 2020 (MFPS 2020), Paris, France, 105-128. · Zbl 07516373
[30] Edalat, A. and Lieutier, A. (2004). Domain theory and differential calculus (functions of one variable). Mathematical Structures in Computer Science 14 (6) 771-802. · Zbl 1062.03037
[31] Edalat, A. and Pattinson, D. (2007). Denotational semantics of hybrid automata. The Journal of Logic and Algebraic Programming 73 (1) 3-21. · Zbl 1123.68056
[32] Erker, T., Escardó, M. H. and Keimel, K. (1998). The way-below relation of function spaces over semantic domains. Topology and its Applications 89 (1) 61-74. doi:10.1016/S0166-8641(97)00226-5. · Zbl 0922.06009
[33] Escardó, M. H. (1996). PCF extended with real numbers. Theoretical Computer Science 162 (1) 79-115.
[34] Escardó, M. H. (1998). Properly injective spaces and function spaces. Topology and its Applications 89 (1) 75-120. doi:10.1016/S0166-8641(97)00225-3. · Zbl 0922.06008
[35] Farjudian, A. (2007). Shrad: A language for sequential real number computation. Theory of Computing Systems 41 (1) 49-105. · Zbl 1127.68022
[36] Farjudian, A. and Moggi, E. (2022). Robustness, Scott Continuity, and Computability. arXiv:2208.12347 [cs.LO].
[37] Fazlyab, M., Robey, A., Hassani, H., Morari, M. and Pappas, G. J. (2019). Efficient and accurate estimation of Lipschitz constants for deep neural networks. In: Wallach, H. M., Larochelle, H., Beygelzimer, A., D’Alché-Buc, F., Fox, E.B. and Garnett, R. (eds.) Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, December 8-14, 2019, Vancouver, BC, Canada, 11423-11434.
[38] Gehr, T., Mirman, M., Drachsler-Cohen, D., Tsankov, P., Chaudhuri, S. and Vechev, M. (2018). AI2: Safety and robustness certification of neural networks with abstract interpretation. In: 2018 IEEE Symposium on Security and Privacy (SP), 3-18.
[39] Gierz, G., Hofmann, K. H., Keimel, K., Lawson, J. D., Mislove, M. W. and Scott, D. S. (2003). Continuous Lattices and Domains, Encyclopedia of Mathematics and its Applications, vol. 93, New York, Cambridge University Press. · Zbl 1088.06001
[40] Gierz, G., Hofmann, K. H., Keimel, K., Lawson, J. D., Mislove, M. W. and Scott, D. S. (1980). A Compendium of Continuous Lattices, Berlin Heidelberg, Springer. · Zbl 0452.06001
[41] Goodfellow, I. J., Shlens, J. and Szegedy, C. (2015). Explaining and harnessing adversarial examples. CoRR abs/1412.6572. http://arxiv.org/abs/1412.6572.
[42] Goubault-Larrecq, J. (2013). Non-Hausdorff Topology and Domain Theory, New York, Cambridge University Press. · Zbl 1280.54002
[43] de Haan, L. and Ferreira, A. (2006). Extreme Value Theory: An Introduction, New York, Springer. · Zbl 1101.62002
[44] Hashemi, N., Ruths, J. and Fazlyab, M. (2021). Certifying incremental quadratic constraints for neural networks via convex optimization. In: Jadbabaie, A., Lygeros, J., Pappas, G. J., Parrilo, P. A., Recht, B., Tomlin, C. J. and Zeilinger, M. N. (eds.) Proceedings of the 3rd Annual Conference on Learning for Dynamics and Control, L4DC 2021, 7-8 June 2021, Virtual Event, Switzerland, Proceedings of Machine Learning Research, vol. 144, 842-853.
[45] Heaven, D. (2019). Why deep-learning AIs are so easy to fool. Nature 574, 163-166.
[46] Hein, M. and Andriushchenko, M. (2017). Formal guarantees on the robustness of a classifier against adversarial manipulation. In: Proceedings of the 31st International Conference on Neural Information Processing Systems. NIPS’17, Long Beach, California, USA, 2263-2273.
[47] Hertling, P. (2017). Clarke’s generalized gradient and Edalat’s L-derivative. Journal of Logic & Analysis 9, 1-21. · Zbl 1364.26034
[48] Jech, T. (2002). Set Theory, Berlin Heidelberg, Springer.
[49] Jia, K. and Rinard, M. (2021). Exploiting verified neural networks via floating point numerical error. In: Dragoi, C., Mukherjee, S., and Namjoshi, K. S. (eds.) Static Analysis - 28th International Symposium, SAS 2021, Chicago, IL, USA, October 17-19, 2021, Proceedings, Lecture Notes in Computer Science, vol. 12913, Springer, 191-205. doi:10.1007/978-3-030-88806-0_9. · Zbl 1497.68309
[50] Jordan, M. and Dimakis, A. G. (2020). Exactly computing the local Lipschitz constant of ReLU networks. In: 34th Conference on Neural Information Processing Systems (NeurIPS 2020), Vancouver, Canada.
[51] Jordan, M. and Dimakis, A. G. (2021). Provable Lipschitz certification for generative models. In: Meila, M. and Zhang, T. (eds.) Proceedings of the 38th International Conference on Machine Learning, Proceedings of Machine Learning Research, vol. 139, 5118-5126.
[52] Keimel, K. (2017). Domain theory: its ramifications and interactions. Electronic Notes in Theoretical Computer Science 333 (Supplement C). The Seventh International Symposium on Domain Theory and Its Applications (ISDT), 3-16.
[53] Ko, C.-Y., Lyu, Z., Weng, L., Daniel, L., Wong, N. and Lin, D. (2019). POPQORN: Quantifying robustness of recurrent neural networks. In: Chaudhuri, K. and Salakhutdinov, R. (eds.) Proceedings of the 36th International Conference on Machine Learning, Proceedings of Machine Learning Research, vol. 97, 3468-3477.
[54] Ko, K.-I. (1991). Complexity Theory of Real Functions, Boston, Birkhäuser. · Zbl 0791.03019
[55] Konečný, M. and Farjudian, A. (2010a). Compositional semantics of dataflow networks with query-driven communication of exact values. Journal of Universal Computer Science 16 (18) 2629-2656. · Zbl 1216.68156
[56] Konečný, M. and Farjudian, A. (2010b). Semantics of query-driven communication of exact values. Journal of Universal Computer Science 16 (18) 2597-2628. · Zbl 1216.68155
[57] Latorre, F., Rolland, P. and Cevher, V. (2020). Lipschitz constant estimation of Neural Networks via sparse polynomial optimization. In: International Conference on Learning Representations.
[58] Laurel, J., Yang, R., Singh, G. and Misailovic, S. (2022). A dual number abstraction for static analysis of Clarke Jacobians. Proceedings of the ACM on Programming Languages 6. doi:10.1145/3498718.
[59] Lee, S., Lee, J. and Park, S. (2020). Lipschitz-certifiable training with a tight outer bound. In: Larochelle, H., Ranzato, M., Hadsell, R., Balcan, M. F., and Lin, H. (eds.) Advances in Neural Information Processing Systems, vol. 33, Curran Associates, Inc., 16891-16902.
[60] Mirman, M., Gehr, T. and Vechev, M. T. (2018). Differentiable abstract interpretation for provably robust neural networks. In: Dy, J. G. and Krause, A. (eds.) Proceedings of the 35th International Conference on Machine Learning, ICML 2018, Stockholmsmässan, Stockholm, Sweden, July 10-15, 2018, Proceedings of Machine Learning Research, vol. 80, PMLR, 3575-3583.
[61] Moggi, E., Farjudian, A., Duracz, A. and Taha, W. (2018). Safe & robust reachability analysis of hybrid systems. Theoretical Computer Science 747, 75-99. · Zbl 1400.68140
[62] Moggi, E., Farjudian, A. and Taha, W. (2019a). System analysis and robustness. In: Margaria, T., Graf, S. and Larsen, K. G. (eds.) Models, Mindsets, Meta: The What, the How, and the Why Not? Essays Dedicated to Bernhard Steffen on the Occasion of His 60th Birthday, Springer International Publishing, 36-44. · Zbl 1519.68165
[63] Moggi, E., Farjudian, A. and Taha, W. (2019b). System analysis and robustness. In: Proceedings of the 20th Italian Conference on Theoretical Computer Science, ICTCS 2019, Como, Italy, September 9-11, 2019, pp. 1-7. http://ceur-ws.org/Vol-2504/paper1.pdf.
[64] Moore, R. E., Kearfott, R. B. and Cloud, M. J. (2009). Introduction to Interval Analysis, Philadelphia, PA, USA, Society for Industrial and Applied Mathematics. · Zbl 1168.65002
[65] Muthukumar, V., Vodrahalli, K., Subramanian, V. and Sahai, A. (2020). Harmless interpolation of noisy data in regression. IEEE Journal on Selected Areas in Information Theory 1 (1) 67-83. doi:10.1109/JSAIT.2020.2984716.
[66] Neumaier, A. (1993). The wrapping effect, ellipsoid arithmetic, stability and confidence regions. In: Albrecht, R., Alefeld, G., and Stetter, H. J. (eds.) Validation Numerics: Theory and Applications, Springer, 175-190. doi:10.1007/978-3-7091-6918-6_14. · Zbl 0790.65035
[67] Pauli, P., Koch, A., Berberich, J., Kohler, P. and Allgöwer, F. (2022). Training robust neural networks using Lipschitz bounds. IEEE Control Systems Letters 6, 121-126. doi:10.1109/LCSYS.2021.3050444.
[68] Prabhakar, P. and Afzal, Z. R. (2019). Abstraction based output range analysis for neural networks. In: Proceedings of the 33rd International Conference on Neural Information Processing Systems (NeurIPS 2019), 15788-15798.
[69] Revol, N. and Rouillier, F. (2005). Motivations for an arbitrary precision interval arithmetic and the MPFI library. Reliable Computing 11 (4) 275-290. · Zbl 1078.65543
[70] Rudd, K. (2013). Solving Partial Differential Equations Using Artificial Neural Networks. PhD thesis. Department of Mechanical Engineering and Material Sciences, Duke University.
[71] Rudin, W. (1991). Functional Analysis, 2nd ed., Singapore, McGraw-Hill. · Zbl 0867.46001
[72] Scott, D. (1970). Outline of a mathematical theory of computation. In: Proceedings of the Fourth Annual Princeton Conference on Information Sciences and Systems, 169-176.
[73] Serban, A., Poll, E. and Visser, J. (2020). Adversarial examples on object recognition: A comprehensive survey. ACM Computing Surveys 53 (3) 1-38. doi:10.1145/3398394.
[74] Simpson, A. K. (1998). Lazy functional algorithms for exact real functionals. In: Brim, L., Gruska, J. and Zlatuška, J. (eds.) Mathematical Foundations of Computer Science 1998, Springer, 456-464. · Zbl 0913.65009
[75] Singh, G., Gehr, T., Püschel, M. and Vechev, M. (2019). An abstract domain for certifying neural networks. Proceedings of the ACM on Programming Languages 3 (POPL) 1-30.
[76] Smyth, M. B. (1977). Effectively given domains. Theoretical Computer Science 5, 257-274. · Zbl 0429.03028
[77] Sotoudeh, M. and Thakur, A. V. (2020). Abstract neural networks. In: Pichardie, D. and Sighireanu, M. (eds.) Static Analysis - 27th International Symposium, SAS 2020, Virtual Event, November 18-20, 2020, Proceedings, Lecture Notes in Computer Science, vol. 12389, 65-88. doi:10.1007/978-3-030-65474-0_4. · Zbl 1474.68193
[78] Stanley, R. P. (2006). An Introduction to Hyperplane Arrangements, IAS/Park City Mathematics Series.
[79] Stoltenberg-Hansen, V. and Tucker, J. V. (1999). Concrete models of computation for topological algebras. Theoretical Computer Science 219 (1) 347-378. · Zbl 0916.68047
[80] Stoltenberg-Hansen, V. and Tucker, J. V. (1995). Effective algebras. In: Abramsky, S., Gabbay, D. M., and Maibaum, T. S. E. (eds.) Handbook of Logic in Computer Science, Semantic Modelling, vol. IV, Oxford University Press, 357-526.
[81] Szegedy, C., Zaremba, W., Sutskever, I., Bruna, J., Erhan, D., Goodfellow, I. and Fergus, R. (2014). Intriguing properties of neural networks. arXiv:1312.6199 [cs.CV].
[82] Tankimanova, A. and James, A. P. (2018). Neural network-based analog-to-digital converters. In: James, A. P. (ed.) Memristor and Memristive Neural Networks, Rijeka, IntechOpen. Chap. 14. doi:10.5772/intechopen.73038.
[83] Virmaux, A. and Scaman, K. (2018). Lipschitz regularity of deep neural networks: Analysis and efficient estimation. In: Bengio, S., Wallach, H. M., Larochelle, H., Grauman, K., Cesa-Bianchi, N. and Garnett, R. (eds.) Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, NeurIPS 2018, December 3-8, 2018, Montréal, Canada, 3839-3848.
[84] Wang, S., Pei, K., Whitehouse, J., Yang, J. and Jana, S. (2018). Formal security analysis of neural networks using symbolic intervals. In: Proceedings of the 27th USENIX Conference on Security Symposium. SEC’18, Baltimore, MD, USA, USENIX Association, 1599-1614.
[85] Weihrauch, K. (2000). Computable Analysis, An Introduction, Berlin Heidelberg, Springer. · Zbl 0956.68056
[86] Weng, T.-W., Zhang, H., Chen, P.-Y., Yi, J., Su, D., Gao, Y., Hsieh, C.-J. and Daniel, L. (2018a). Evaluating the robustness of neural networks: An extreme value theory approach. In: International Conference on Learning Representations.
[87] Weng, T.-W., Zhang, H., Chen, H., Song, Z., Hsieh, C.-J., Boning, D., Dhillon, I.S. and Daniel, L. (2018b). Towards fast computation of certified robustness for ReLU networks. In: International Conference on Machine Learning (ICML).
[88] Wong, E. and Kolter, J. Z. (2018). Provable defenses against adversarial examples via the convex outer adversarial polytope. In: Dy, J. G. and Krause, A. (eds.) Proceedings of the 35th International Conference on Machine Learning, ICML 2018, Stockholmsmässan, Stockholm, Sweden, July 10-15, 2018, Proceedings of Machine Learning Research, vol. 80, PMLR, 5283-5292.
[89] Wong, E., Schmidt, F. R. and Kolter, J. Z. (2019). Wasserstein adversarial examples via projected Sinkhorn iterations. arXiv:1902.07906 [cs.LG].
[90] Zombori, D., Bánhelyi, B., Csendes, T., Megyeri, I. and Jelasity, M. (2021). Fooling a complete neural network verifier. In: International Conference on Learning Representations.