×

Almost optimal estimates for approximation and learning by radial basis function networks. (English) Zbl 1320.68141

The paper is devoted to approximation of differentiable multivariate functions by the radial basis function networks (RBFN) defined by a formula of the following type: \[ R(x)=\sum_{k=0}^N c_k \sigma (w_k|x - \theta_k|). \] Here \(\sigma\) is the activation function, \(c_k,w_k \in\mathbb R\), \(\theta_k \in\mathbb R^d\). Let \(B^d\) be the unit cube in \(\mathbb R^d\). The authors prove that for any given polynomial \(P\) and sufficiently smooth function \(\sigma\) there exists an RBFN approximating \(P\) arbitrarily closely in \(C(B^d)\). The authors also study machine learning. They prove that using the standard empirical risk minimization, the RBFN can realize an almost optimal learning rate.

MSC:

68T05 Learning and adaptive systems in artificial intelligence
Full Text: DOI

References:

[1] Barron, A. (1993). Universal approximation bounds for superpositions of a sigmoidal function. IEEE Transactions on Information Theory, 39, 930-945. · Zbl 0818.68126 · doi:10.1109/18.256500
[2] Barron, A., Cohen, A., Dahmen, W., & Devore, R. (2008). Approximation and learning by greedy algorithms. The Annals of Statistics, 36, 64-94. · Zbl 1138.62019 · doi:10.1214/009053607000000631
[3] Bejancu, A. (1997). The uniform convergence of multivariate natural splines (DAMTP Technical Report). University of Cambridge.
[4] Bejancu, A. (2000). On the accuracy of surface spline approximation and interpolation to bump functions (DAMTP Technical Report). University of Cambridge.
[5] Buhman, M. (2000). Radial basis functions. Acta Numerica, 9, 1-38. · Zbl 1004.65015 · doi:10.1017/S0962492900000015
[6] Bumann, M., Dyn, N., & Levin, D. (1995). On quasi-interpolation by radial basis functions with scattered centres. Constructive Approximation, 11, 239-254. · Zbl 0824.41010 · doi:10.1007/BF01203417
[7] Burger, M., & Neubauer, A. (2001). Error bounds for approximation with neural networks. Journal of Approximation Theory, 112, 235-250. · Zbl 1004.41007 · doi:10.1006/jath.2001.3613
[8] Caponnetto, A., & DeVito, E. (2007). Optimal rates for the regularized least squares algorithm. Foundations of Computational Mathematics, 7, 331-368. · Zbl 1129.68058 · doi:10.1007/s10208-006-0196-8
[9] Chen, T., & Chen, H. (1995). Approximation capability to functions of several variables, nonlinear functionals and operators by radial basis function neural networks. IEEE Transactions on Neural Networks, 6, 904-910. · doi:10.1109/72.392252
[10] Cucker, F., & Smale, S. (2001). On the mathematical foundations of learning. Bulletin of the American Mathematical Society, 39, 1-49. · Zbl 0983.68162 · doi:10.1090/S0273-0979-01-00923-5
[11] Cucker, F., & Smale, S. (2002). Best choices for regularization parameters in learning theory: on the bias-variance problem. Foundations of Computational Mathematics, 2, 413-428. · Zbl 1057.68085 · doi:10.1007/s102080010030
[12] DeVore, R., & Lorentz, G. (1993). Constructive approximation. Berlin: Springer. · Zbl 0797.41016 · doi:10.1007/978-3-662-02888-9
[13] DeVore, R., Kerkyacharian, G., Picard, D., & Temlyakov, V. (2006). Approximation methods for supervised learning. Foundations of Computational Mathematics, 6, 3-58. · Zbl 1146.62322 · doi:10.1007/s10208-004-0158-6
[14] Fedoseyev, A., Friedman, M., & Kansa, E. (2002). Improved multiquadric method for elliptic partial differential equations via PDE collocation on the boundary. Computers and Mathematics, 43, 439-455. · Zbl 0999.65137
[15] Flyer, N., & Wright, G. (2009). A radial basis function method for the shallow water equations on a sphere. Proceedings of the Royal Society A, 465, 1949-1976. · Zbl 1186.76664 · doi:10.1098/rspa.2009.0033
[16] Györfy, L., Kohler, M., Krzyzak, A., & Walk, H. (2002). A distribution-free theory of nonparametric regression. Berlin: Springer. · Zbl 1021.62024 · doi:10.1007/b97848
[17] Johnson, M. (1998). A bound on the approximation order of surface splines. Constructive Approximation, 14, 429-438. · Zbl 0904.41005 · doi:10.1007/s003659900082
[18] Kainen, P., Kurková, V., & Sanguineti, M. (2012). Dependence of computational models on input dimension: tractability of approximation and optimization tasks. IEEE Transactions on Information Theory, 58, 1203-1214. · Zbl 1365.68373 · doi:10.1109/TIT.2011.2169531
[19] Lin, S., Cao, F., & Xu, Z. (2011a). The essential rate of approximation for radial function manifold. Science China Mathematics, 54, 1985-1994. · Zbl 1241.41010 · doi:10.1007/s11425-011-4262-1
[20] Lin, S., Cao, F., & Xu, Z. (2011b). Essential rate for approximation by spherical neural networks. Neural Networks, 24, 752-758. · Zbl 1267.41002 · doi:10.1016/j.neunet.2011.04.005
[21] Maiorov, V. (2003). On best approximation of classes by radial functions. Journal of Approximation Theory, 120, 36-70. · Zbl 1013.41013 · doi:10.1016/S0021-9045(02)00011-4
[22] Maiorov, V. (2005). On lower bounds in radial basis approximation. Advances in Computational Mathematics, 22, 103-113. · Zbl 1064.41008 · doi:10.1007/s10444-004-1090-7
[23] Maiorov, V. (2006a). Pseudo-dimension and entropy of manifolds formed by affine invariant dictionary. Advances in Computational Mathematics, 25, 435-450. · Zbl 1100.68094 · doi:10.1007/s10444-004-7645-9
[24] Maiorov, V. (2006b). Approximation by neural networks and learning theory. Journal of Complexity, 22, 102-117. · Zbl 1156.68541 · doi:10.1016/j.jco.2005.09.001
[25] Maiorov, V., & Meir, R. (2001). Lower bounds for multivariate approximation by affine-invariant dictionaries. IEEE Transactions on Information Theory, 47, 1569-1575. · Zbl 1029.41017 · doi:10.1109/18.923738
[26] Maiorov, V., & Pinkus, A. (1999). Lower bounds for approximation by MLP neural networks. Neurocomputing, 25, 81-91. · Zbl 0931.68093 · doi:10.1016/S0925-2312(98)00111-8
[27] Mendelson, S., & Vershinin, R. (2003). Entropy and the combinatorial dimension. Inventiones Mathematicae, 125, 37-55. · Zbl 1039.60016 · doi:10.1007/s00222-002-0266-3
[28] Mhaskar, H. (1996). Neural networks for optimal approximation of smooth and analytic functions. Neural Computation, 8, 164-177. · doi:10.1162/neco.1996.8.1.164
[29] Mhaskar, H. (2004). On the tractability of multivariate integration and approximation by neural networks. Journal of Complexity, 20, 561-590. · Zbl 1344.65036 · doi:10.1016/j.jco.2003.11.004
[30] Park, J., & Sandberg, I. (1991). Universal approximation using radial-basis-function networks. Neural Computation, 3, 246-257. · doi:10.1162/neco.1991.3.2.246
[31] Park, J., & Sandberg, I. (1993). Approximation and radial-basis-function networks. Neural Computation, 5, 305-316. · doi:10.1162/neco.1993.5.2.305
[32] Powell, M.; Light, W. A. (ed.), The theory of radial basis approximation, No. 2, 105-210 (1990), Oxford · Zbl 0787.65005
[33] Schaback, R. (1995). Error estimates and conditions numbers for radial basis functions interpolations. Advances in Computational Mathematics, 3, 251-264. · Zbl 0861.65007 · doi:10.1007/BF02432002
[34] Schaback, R. (1996). Approximation by radial basis functions with finitely many centers. Constructive Approximation, 12, 331-340. · Zbl 0855.41011 · doi:10.1007/BF02433047
[35] Shivaswamy, P., & Jebara, T. (2007). Ellipsoidal machines. Journal of Machine Learning Research—Proceedings Track, 2, 484-491.
[36] Temlyakov, V. (2008). Approximation in learning theory. Constructive Approximation, 27, 33-74. · Zbl 05264756 · doi:10.1007/s00365-006-0655-2
[37] Wendland, H. (2000). Optimal approximation orders on Lp for radial basis functions. East Journal on Approximations, 6, 87-102. · Zbl 1084.41516
[38] Wendland, H. (2005). Scattered data approximation. New York: Cambridge University Press. · Zbl 1075.65021
[39] Xie, T., & Cao, F. (2010). The errors in simultaneous approximation by feed-forward neural networks. Neurocomputing, 73, 903-907. · doi:10.1016/j.neucom.2009.09.014
[40] Xie, T., & Cao, F. (2013). The rate of approximation of Gaussian radial basis neural networks in continuous function space. Acta Mathematica Sinica. English Series, 29, 295-302. · Zbl 1263.41009 · doi:10.1007/s10114-012-1369-4
[41] Zhang, Y., Cao, F., & Xu, Z. (2011). Optimal rate of the regularized regression learning algorithm. International Journal of Computer Mathematics, 88, 1471-1483. · Zbl 1218.68141 · doi:10.1080/00207160.2010.516821
[42] Zhou, D. X., & Jetter, K. (2006). Approximation with polynomial kernels and SVM classifiers. Advances in Computational Mathematics, 25, 323-344. · Zbl 1095.68103 · doi:10.1007/s10444-004-7206-2
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.