Equivariant neural networks for indirect measurements. (English) Zbl 07892806

Summary: In recent years, deep learning techniques have shown great success in various tasks related to inverse problems, where a target quantity of interest can only be observed through indirect measurements by a forward operator. Common approaches apply deep neural networks in a postprocessing step to the reconstructions obtained by classical reconstruction methods. However, the latter methods can be computationally expensive and introduce artifacts that are not present in the measured data and, in turn, can deteriorate the performance on the given task. To overcome these limitations, we propose a class of equivariant neural networks that can be directly applied to the measurements to solve the desired task. To this end, we build appropriate network structures by developing layers that are equivariant with respect to data transformations induced by well-known symmetries in the domain of the forward operator. We rigorously analyze the relation between the measurement operator and the resulting group representations and prove a representer theorem that characterizes the class of linear operators that translate between a given pair of group actions. Based on this theory, we extend the existing concepts of Lie group equivariant deep learning to inverse problems and introduce new representations that result from the involved measurement operations. This allows us to efficiently solve classification, regression, or even reconstruction tasks based on indirect measurements, including for very sparse data problems, where a classical reconstruction-based approach may be hard or even impossible. We illustrate the effectiveness of our approach in numerical experiments and compare it with existing methods.
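
The equivariance property mentioned in the summary can be illustrated in a toy setting. The following sketch is not the construction from the paper: it assumes, purely for illustration, the cyclic group of shifts acting on a finite sequence, and the helper names `cyclic_shift`, `circular_conv`, and `is_equivariant` are hypothetical. It checks numerically whether a linear layer L satisfies L(g·x) = g·L(x) for every shift g, which holds for a circular convolution but generally fails for a position-dependent weighting.

```python
def cyclic_shift(x, s):
    """Toy group action (assumption): rotate the sequence x by s positions."""
    n = len(x)
    return [x[(i - s) % n] for i in range(n)]


def circular_conv(x, kernel):
    """Linear layer: circular convolution of x with a short kernel."""
    n = len(x)
    return [
        sum(kernel[j] * x[(i - j) % n] for j in range(len(kernel)))
        for i in range(n)
    ]


def is_equivariant(layer, x, s, tol=1e-12):
    """Check layer(g.x) == g.layer(x) for the shift g = s."""
    lhs = layer(cyclic_shift(x, s))
    rhs = cyclic_shift(layer(x), s)
    return all(abs(a - b) <= tol for a, b in zip(lhs, rhs))


# Arbitrary illustrative data, not taken from the paper.
x = [0.0, 1.0, 3.0, -2.0, 0.5, 4.0]
kernel = [0.25, 0.5, 0.25]
weights = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]


def conv_layer(v):
    # Convolution shares its weights across positions.
    return circular_conv(v, kernel)


def pointwise_layer(v):
    # Position-dependent scaling: linear, but tied to absolute positions.
    return [w * a for w, a in zip(weights, v)]


conv_ok = all(is_equivariant(conv_layer, x, s) for s in range(len(x)))
pointwise_ok = all(is_equivariant(pointwise_layer, x, s) for s in range(len(x)))
print(conv_ok, pointwise_ok)  # prints: True False
```

In this toy case the same group action appears on the input and the output side. The summary concerns the more general situation in which a linear operator translates between two different group actions, which this sketch does not cover.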

MSC:

47B38 Linear operators on function spaces (general)
65J22 Numerical solution to inverse problems in abstract spaces
94A05 Communication theory

Software:

Adam; ASTRA; PyTorch; MNIST
