
Approximation of functions from Korobov spaces by shallow neural networks. (English) Zbl 07854211

Summary: In this paper, we consider the problem of approximating functions from a Korobov space on \([-1, 1]^d\) by ReLU shallow neural networks and present a rate \(\mathcal{O}\left(m^{-\frac{2}{5}(1 + \frac{2}{d})}\sqrt{\log m}\right)\) of uniform approximation by networks with \(m\) hidden neurons. This is achieved by combining a novel Fourier analysis approach with a probability argument. We apply our approximation theory to a learning algorithm for regression based on ReLU shallow neural networks and derive learning rates of order \(\mathcal{O}\left(N^{-\frac{4(d + 2)}{9d + 8}}\log N\right)\) for the excess generalization error with sample size \(N\) when the regression function lies in the Korobov space.
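The hypothesis class in the summary is the set of shallow ReLU networks \(f(x) = \sum_{k=1}^m a_k \,\mathrm{ReLU}(w_k \cdot x + b_k)\) on \([-1,1]^d\). The following minimal sketch evaluates such a network; the random parameter choices are purely illustrative and are not the paper's construction, which is obtained via Fourier analysis and a probabilistic argument.

```python
import numpy as np

def shallow_relu_net(x, W, b, a):
    """Evaluate f(x) = sum_k a_k * ReLU(w_k . x + b_k).

    x : (n, d) array of input points in [-1, 1]^d
    W : (m, d) inner weights; b : (m,) biases; a : (m,) outer coefficients.
    This is only the generic shallow ReLU network class; the parameters
    below are illustrative, not the approximant built in the paper.
    """
    hidden = np.maximum(x @ W.T + b, 0.0)  # ReLU activations, shape (n, m)
    return hidden @ a                      # linear combination, shape (n,)

# Tiny usage example with hypothetical random parameters.
rng = np.random.default_rng(0)
d, m, n = 3, 8, 5
x = rng.uniform(-1.0, 1.0, size=(n, d))   # sample points in [-1, 1]^d
W = rng.standard_normal((m, d))
b = rng.standard_normal(m)
a = rng.standard_normal(m)
y = shallow_relu_net(x, W, b, a)          # one function value per point
```

Here \(m\) is the width parameter appearing in the approximation rate: the theorem asserts that, for a target in the Korobov space, parameters can be chosen so the uniform error decays like \(m^{-\frac{2}{5}(1+\frac{2}{d})}\sqrt{\log m}\).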

MSC:

68-XX Computer science
41-XX Approximations and expansions

References:

[1] Bungartz, H.-J.; Griebel, M., Sparse grids, Acta Numer., 13, 147-269, 2004 · Zbl 1118.65388
[2] Carleson, L., On convergence and growth of partial sums of Fourier series, Acta Math., 116, 135-157, 1966 · Zbl 0144.06402
[3] Chen, T.; Chen, H.; Liu, R., A constructive proof of Cybenko’s approximation theorem and its extensions, (LePage, R.; Page, C., Computing Science and Statistics: Proc. of the 22nd Symposium on the Interface, East Lansing, May 1990, Springer-Verlag), 163-168 · Zbl 0746.41022
[4] Chen, T.; Chen, H., Approximation to continuous functionals by neural networks with application to dynamical systems, IEEE Trans. Neural Netw., 4, 910-918, 1993
[5] Chen, T.; Chen, H.; Liu, R., Approximation capability in \(C( \mathbb{R}^n)\) by multilayer feedforward networks and related problems, IEEE Trans. Neural Netw., 6, 25-30, 1995
[6] Chui, C. K.; Li, X.; Mhaskar, H. N., Neural networks for localized approximation, Math. Comput., 63, 607-623, 1994 · Zbl 0806.41020
[7] Chui, C. K.; Lin, S. B.; Zhang, B.; Zhou, D. X., Realization of spatial sparseness by deep ReLU nets with massive data, IEEE Trans. Neural Netw. Learn. Syst., 33, 229-243, 2022
[8] Chui, C. K.; Lin, S. B.; Zhou, D. X., Deep neural networks for rotation-invariance approximation and learning, Anal. Appl., 17, 737-772, 2019 · Zbl 1423.68378
[9] Chui, C. K.; Lin, S. B.; Zhou, D. X., Deep net tree structure for balance of capacity and approximation ability, Front. Appl. Math. Stat., 5, 46, 2019
[10] Cucker, F.; Zhou, D. X., Learning Theory: an Approximation Theory Viewpoint, 2007, Cambridge University Press: Cambridge University Press Cambridge · Zbl 1274.41001
[11] Cybenko, G., Approximation by superpositions of a sigmoidal function, Math. Control Signals Syst., 2, 303-314, 1989 · Zbl 0679.94019
[12] DeVore, R. A.; Lorentz, G. G., Constructive Approximation, 1993, Springer-Verlag: Springer-Verlag Berlin, Heidelberg · Zbl 0797.41016
[13] Fefferman, C., On the convergence of multiple Fourier series, Bull. Am. Math. Soc., 77, 744-745, 1971 · Zbl 0234.42008
[14] Guo, Z. C.; Xiang, D. H.; Guo, X.; Zhou, D. X., Thresholded spectral algorithms for sparse approximations, Anal. Appl., 15, 433-455, 2017 · Zbl 1409.68232
[15] Hornik, K.; Stinchcombe, M.; White, H., Multilayer feedforward networks are universal approximators, Neural Netw., 2, 359-366, 1989 · Zbl 1383.92015
[16] Klusowski, J. M.; Barron, A. R., Approximation by combinations of ReLU and squared ReLU ridge functions with \(\ell^1\) and \(\ell^0\) controls, IEEE Trans. Inf. Theory, 64, 7649-7656, 2018 · Zbl 1432.41003
[17] Kolmogorov, A.; Uspenskii, V., On the definition of an algorithm, Usp. Mat. Nauk, 13, 4, 3-28, 1958 · Zbl 0090.01101
[18] Mao, T.; Zhou, D. X., Rates of approximation by ReLU shallow neural networks, J. Complex., 79, Article 101784 pp., 2023 · Zbl 1524.68322
[19] Mao, T.; Zhou, D. X., Approximation of functions from Korobov spaces by deep convolutional neural networks, Adv. Comput. Math., 48, 6, 84, 2022 · Zbl 07634934
[20] Mhaskar, H. N., Approximation properties of a multilayered feedforward artificial neural network, Adv. Comput. Math., 1, 61-80, 1993 · Zbl 0824.41011
[21] Montanelli, H.; Du, Q., New error bounds for deep ReLU networks using sparse grids, SIAM J. Math. Data Sci., 1, 78-92, 2019 · Zbl 1513.68054
[22] Petersen, P.; Voigtlaender, F., Optimal approximation of piecewise smooth functions using deep ReLU neural networks, Neural Netw., 108, 296-330, 2018 · Zbl 1434.68516
[23] Leshno, M.; Lin, V. Ya.; Pinkus, A.; Schocken, S., Multilayer feedforward networks with a nonpolynomial activation function can approximate any function, Neural Netw., 6, 861-867, 1993
[24] Schmidt-Hieber, J., Nonparametric regression using deep neural networks with ReLU activation function, Ann. Stat., 48, 1875-1897, 2020 · Zbl 1459.62059
[25] Siegel, J.; Xu, J. C., Sharp bounds on the approximation rates, metric entropy, and n-widths of shallow neural networks, Found. Comput. Math., 1-57, 2022
[26] Shaham, U.; Cloninger, A.; Coifman, R., Provable approximation properties for deep neural networks, Appl. Comput. Harmon. Anal., 44, 537-557, 2018 · Zbl 1390.68553
[27] Stein, E. M., Singular Integrals and Differentiability Properties of Functions, Princeton Mathematical Series, vol. 30, 2016, Princeton University Press
[28] Suzuki, T., Adaptivity of deep ReLU network for learning in Besov and mixed smooth Besov spaces: optimal rate and curse of dimensionality, (Proceedings of the International Conference on Learning Representations (ICLR), 2019)
[29] Trigub, R. M., Absolute convergence of Fourier integrals, summability of Fourier series and polynomial approximation of functions on the torus, Math. USSR, Izv., 44, 1378-1409, 1980 (translation of the Russian original) · Zbl 0459.42019
[30] Yarotsky, D., Error bounds for approximations with deep ReLU networks, Neural Netw., 94, 103-114, 2017 · Zbl 1429.68260
[31] Zhou, D. X., Deep distributed convolutional neural networks: universality, Anal. Appl., 16, 895-919, 2018 · Zbl 1442.68214
[32] Zhou, D. X., The covering number in learning theory, J. Complex., 18, 739-767, 2002 · Zbl 1016.68044