Abstract
Deep Learning success in a wide range of applications, such as image recognition and natural language processing, has led to the increasing usage of this technology in many domains, including safety-critical applications such as autonomous cars and medicine. The usage of the models, e.g., neural networks, in safety critical applications demands a thorough evaluation from a component and system level perspective. In these domains, formal methods have the ability to guarantee the correct operation of these components. Despite great efforts in the formal verification of neural networks in the past decade, several challenges remain. One of these challenges is the development of neural networks for easier verification. In this work, we present an empirical analysis, presented as a Latin Hypercube experiment design, in which we evaluate how regularization and initialization methods across different random seeds on two datasets affect the verification analysis of a reachability analysis technique for the verification of neural networks. We show that there are certain training routines that simplify the formal verification task. Lastly, a discussion on how these training approaches impact the robustness verification and reachability computation of the method utilized is included.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
Notes
- 1.
vnnlib: https://www.vnnlib.org.
- 2.
Available at https://github.com/Project-MONAI/MONAI/.
- 3.
Weights are independently sampled from a normal distribution with 0 mean and standard deviation of 0.01.
- 4.
The model architecture is depicted in Fig. 9 in the Appendix.
- 5.
Code is available at: https://github.com/verivital/nnv/tree/master/code/nnv/examples/Submission/AISOLA2023.
References
apolanco3225: Medical mnist classification. https://github.com/apolanco3225/Medical-MNIST-Classification (2017)
Bak, S.: nnenum: verification of ReLU neural networks with optimized abstraction refinement. In: Dutle, A., Moscato, M.M., Titolo, L., Muñoz, C.A., Perez, I. (eds.) NASA Formal Methods: 13th International Symposium, NFM 2021, Virtual Event, May 24–28, 2021, Proceedings, pp. 19–36. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-76384-8_2
Bak, S., Duggirala, P.S.: Simulation-equivalent reachability of large linear systems with inputs. In: Majumdar, R., Kunčak, V. (eds.) Computer Aided Verification, pp. 401–420. Springer International Publishing, Cham (2017)
Böing, B., Müller, E.: On training and verifying robust autoencoders. In: 2022 IEEE 9th International Conference on Data Science and Advanced Analytics (DSAA), pp. 1–10 (2022). https://doi.org/10.1109/DSAA54385.2022.10032334
Bunel, R., et al.: Branch and bound for piecewise linear neural network verification. J. Mach. Learn. Res. 21(1) (2020)
Cireşan, D., Meier, U., Schmidhuber, J.: Multi-column deep neural networks for image classification. arXiv preprint arXiv:1202.2745 (2012)
Collobert, R., Weston, J.: A unified architecture for natural language processing: Deep neural networks with multitask learning. In: Proceedings of the 25th international conference on Machine learning, pp. 160–167. ACM (2008)
Cruz, U.S., Ferlez, J., Shoukry, Y.: Safe-by-repair: A convex optimization approach for repairing unsafe two-level lattice neural network controllers. In: 2022 IEEE 61st Conference on Decision and Control (CDC), pp. 3383–3388 (2022). https://doi.org/10.1109/CDC51059.2022.9993239
Dong, G., Sun, J., Wang, J., Wang, X., Dai, T.: Towards repairing neural networks correctly (2021)
Dreossi, T., Ghosh, S., Yue, X., Keutzer, K., Sangiovanni-Vincentelli, A., Seshia, S.A.: Counterexample-guided data augmentation. In: Proceedings of the 27th International Joint Conference on Artificial Intelligence, pp. 2071–2078. IJCAI’18, AAAI Press (2018)
Elboher, Y.Y., Cohen, E., Katz, G.: Neural network verification using residual reasoning. In: Schlingloff, B.-H., Chai, M. (eds.) Software Engineering and Formal Methods: 20th International Conference, SEFM 2022, Berlin, Germany, September 26–30, 2022, Proceedings, pp. 173–189. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-17108-6_11
Fisher, R.A.: Statistical methods for research workers. In: Kotz, S., Johnson, N.L. (eds.) Breakthroughs in Statistics, pp. 66–70. Springer, New York, NY (1992). https://doi.org/10.1007/978-1-4612-4380-9_6
Fu, F., Li, W.: Sound and complete neural network repair with minimality and locality guarantees. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=xS8AMYiEav3
Gatys, L.A., Ecker, A.S., Bethge, M.: Image style transfer using convolutional neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2414–2423 (2016)
Glorot, X., Bengio, Y.: Understanding the difficulty of training deep feedforward neural networks. In: Teh, Y.W., Titterington, M. (eds.) Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics. Proceedings of Machine Learning Research, vol. 9, pp. 249–256. PMLR, Chia Laguna Resort, Sardinia, Italy (2010)
Goldberger, B., Katz, G., Adi, Y., Keshet, J.: Minimal modifications of deep neural networks using verification. In: Albert, E., Kovacs, L. (eds.) LPAR23. LPAR-23: 23rd International Conference on Logic for Programming, Artificial Intelligence and Reasoning. EPiC Series in Computing, vol. 73, pp. 260–278. EasyChair (2020). https://doi.org/10.29007/699q, https://easychair.org/publications/paper/CWhF
Goubault, E., Putot, S.: Rino: robust inner and outer approximated reachability of neural networks controlled systems. In: Shoham, S., Vizel, Y. (eds.) Computer Aided Verification, pp. 511–523. Springer, Cham (2022)
He, K., Zhang, X., Ren, S., Sun, J.: Delving deep into rectifiers: Surpassing human-level performance on imagenet classification. In: 2015 IEEE International Conference on Computer Vision (ICCV), pp. 1026–1034 (2015). https://doi.org/10.1109/ICCV.2015.123
Hoffman, J., Roberts, D.A., Yaida, S.: Robust learning with Jacobian regularization (2019)
Huang, Y., Zhang, H., Shi, Y., Kolter, J.Z., Anandkumar, A.: Training certifiably robust neural networks with efficient local lipschitz bounds. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems. vol. 34, pp. 22745–22757. Curran Associates, Inc. (2021)
Katz, G., Barrett, C., Dill, D.L., Julian, K., Kochenderfer, M.J.: Reluplex: an efficient SMT solver for verifying deep neural networks. In: Majumdar, R., Kunčak, V. (eds.) Computer Aided Verification: 29th International Conference, CAV 2017, Heidelberg, Germany, July 24-28, 2017, Proceedings, Part I, pp. 97–117. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-63387-9_5
Katz, G., et al.: The marabou framework for verification and analysis of deep neural networks. In: Dillig, I., Tasiran, S. (eds.) Computer Aided Verification: 31st International Conference, CAV 2019, New York City, NY, USA, July 15-18, 2019, Proceedings, Part I, pp. 443–452. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-25540-4_26
Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional neural networks. In: Advances in neural information processing systems, pp. 1097–1105 (2012)
Krogh, A., Hertz, J.A.: A simple weight decay can improve generalization. In: Proceedings of the 4th International Conference on Neural Information Processing Systems, pp. 950–957. NIPS’91, Morgan Kaufmann Publishers Inc., San Francisco, CA, USA (1991)
LeCun, Y., Cortes, C., Burges, C.: Mnist handwritten digit database. ATT Labs http://yann.lecun.com/exdb/mnist 2 (2010)
Leofante, F., Henriksen, P., Lomuscio, A.: Verification-friendly networks: the case for parametric relus. In: Workshop on Formal Verification of Machine Learning, Colocated with ICML 2022. IEEE (2022)
Liu, C., Arnon, T., Lazarus, C., Strong, C., Barrett, C., Kochenderfer, M.J.: Algorithms for verifying deep neural networks. Found. Trends Optimization 4(3–4), 244–404 (2021). https://doi.org/10.1561/2400000035
Lopez, D.M., et al.: Arch-comp22 category report: Artificial intelligence and neural network control systems (ainncs) for continuous and hybrid systems plants. In: Frehse, G., Althoff, M., Schoitsch, E., Guiochet, J. (eds.) Proceedings of 9th International Workshop on Applied Verification of Continuous and Hybrid Systems (ARCH22). EPiC Series in Computing, vol. 90, pp. 142–184. EasyChair (2022)
Luong, M.T., Pham, H., Manning, C.D.: Effective approaches to attention-based neural machine translation. arXiv preprint arXiv:1508.04025 (2015)
Lopez, D.M., Choi, S.W., Tran, H.-D., Johnson, T.T.: NNV 2.0: the neural network verification tool. In: Enea, C., Lal, A. (eds.) Computer Aided Verification: 35th International Conference, CAV 2023, Paris, France, July 17–22, 2023, Proceedings, Part II, pp. 397–412. Springer, Cham (2023). https://doi.org/10.1007/978-3-031-37703-7_19
Müller, M.N., Brix, C., Bak, S., Liu, C., Johnson, T.T.: The third international verification of neural networks competition (vnn-comp 2022): Summary and results (2022)
OpenAI: Gpt-4 technical report (2023)
Ren, X., et al.: Few-shot guided mix for dnn repairing. In: Proceedings - 2020 IEEE International Conference on Software Maintenance and Evolution, ICSME 2020, pp. 717–721. Proceedings - 2020 IEEE International Conference on Software Maintenance and Evolution, ICSME 2020, Institute of Electrical and Electronics Engineers Inc., United States (2020). https://doi.org/10.1109/ICSME46990.2020.00079
Shi, Z., Wang, Y., Zhang, H., Yi, J., Hsieh, C.J.: Fast certified robust training with short warmup. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems. vol. 34, pp. 18335–18349. Curran Associates, Inc. (2021). https://proceedings.neurips.cc/paper_files/paper/2021/file/988f9153ac4fd966ea302dd9ab9bae15-Paper.pdf
Singh, G., Gehr, T., Mirman, M., Püschel, M., Vechev, M.: Fast and effective robustness certification. In: Bengio, S., Wallach, H., Larochelle, H., Grauman, K., Cesa-Bianchi, N., Garnett, R. (eds.) In: Advances in Neural Information Processing Systems, vol. 31. Curran Associates, Inc. (2018)
Sinitsin, A., Plokhotnyuk, V., Pyrkin, D., Popov, S., Babenko, A.: Editable neural networks. In: International Conference on Learning Representations (2020)
Sotoudeh, M., Thakur, A.V.: Provable repair of deep neural networks. In: Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation, pp. 588–603. PLDI 2021, Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3453483.3454064
Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15(56), 1929–1958 (2014). http://jmlr.org/papers/v15/srivastava14a.html
Tao, Z., Nawas, S., Mitchell, J., Thakur, A.V.: Architecture-preserving provable repair of deep neural networks. Proc. ACM Program. Lang. 7(PLDI), 443–467 (2023). https://doi.org/10.1145/3591238
Tjeng, V., Xiao, K., Tedrake, R.: Evaluating robustness of neural networks with mixed integer programming. arXiv preprint arXiv:1711.07356 (2017)
Tran, H.-D., Bak, S., Xiang, W., Johnson, T.T.: Verification of deep convolutional neural networks using imagestars. In: Lahiri, S.K., Wang, C. (eds.) Computer Aided Verification: 32nd International Conference, CAV 2020, Los Angeles, CA, USA, July 21–24, 2020, Proceedings, Part I, pp. 18–42. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-53288-8_2
Tran, H.D., Choi, S., Okamoto, H., Hoxha, B., Fainekos, G., Prokhorov, D.: Quantitative verification for neural networks using probstars. In: Proceedings of the 26th ACM International Conference on Hybrid Systems: Computation and Control. HSCC ’23, Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3575870.3587112
Tran, H.-D., et al.: Star-based reachability analysis of deep neural networks. In: ter Beek, M.H., McIver, A., Oliveira, J.N. (eds.) Formal Methods – The Next 30 Years: Third World Congress, FM 2019, Porto, Portugal, October 7–11, 2019, Proceedings, pp. 670–686. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-30942-8_39
Tran, H.D., et al.: Verification of piecewise deep neural networks: A star set approach with zonotope pre-filter. Form. Asp. Comput. 33(4–5), 519–545 (2021)
Tran, H.-D., et al.: Robustness verification of semantic segmentation neural networks using relaxed reachability. In: Silva, A., Leino, K.R.M. (eds.) Computer Aided Verification: 33rd International Conference, CAV 2021, Virtual Event, July 20–23, 2021, Proceedings, Part I, pp. 263–286. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-81685-8_12
Tran, H.D., et al.: NNV: The neural network verification tool for deep neural networks and learning-enabled cyber-physical systems. In: 32nd International Conference on Computer-Aided Verification (CAV) (2020)
Usman, M., Gopinath, D., Sun, Y., Noller, Y., Păsăreanu, C.S.: Nnrepair: constraint-based repair of neural network classifiers. In: Silva, A., Leino, K.R.M. (eds.) Computer Aided Verification, pp. 3–25. Springer, Cham (2021)
Wang, S., Pei, K., Whitehouse, J., Yang, J., Jana, S.: Formal security analysis of neural networks using symbolic intervals. In: 27th \(\{\)USENIX\(\}\) Security Symposium (\(\{\)USENIX\(\}\) Security 18), pp. 1599–1614 (2018)
Wang, S., et al.: Beta-CROWN: Efficient bound propagation with per-neuron split constraints for complete and incomplete neural network verification. In: Advances in Neural Information Processing Systems 34 (2021)
Xiao, K.Y., Tjeng, V., Shafiullah, N.M.M., Madry, A.: Training for faster adversarial robustness verification via inducing reLU stability. In: International Conference on Learning Representations (2019). https://openreview.net/forum?id=BJfIVjAcKm
Yang, X., Yamaguchi, T., Tran, H.D., Hoxha, B., Johnson, T.T., Prokhorov, D.: Neural network repair with reachability analysis. In: Bogomolov, S., Parker, D. (eds.) Formal Modeling and Analysis of Timed Systems, pp. 221–236. Springer, Cham (2022)
Acknowledgements
The material presented in this paper is based upon work supported by the National Science Foundation (NSF) through grant numbers 2028001, 2220426 and 2220401, the Defense Advanced Research Projects Agency (DARPA) under contract number FA8750-23-C-0518, and the Air Force Office of Scientific Research (AFOSR) under contract number FA9550-22-1-0019 and FA9550-23-1-0135. Any opinions, findings, and conclusions or recommendations expressed in this paper are those of the authors and do not necessarily reflect the views of AFOSR, DARPA, or NSF.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
A Appendix
A Appendix
Rights and permissions
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Manzanas Lopez, D., Johnson, T.T. (2024). Empirical Analysis of Benchmark Generation for the Verification of Neural Network Image Classifiers. In: Steffen, B. (eds) Bridging the Gap Between AI and Reality. AISoLA 2023. Lecture Notes in Computer Science, vol 14380. Springer, Cham. https://doi.org/10.1007/978-3-031-46002-9_21
Download citation
DOI: https://doi.org/10.1007/978-3-031-46002-9_21
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-46001-2
Online ISBN: 978-3-031-46002-9
eBook Packages: Computer ScienceComputer Science (R0)