
Distributed learning and distribution regression of coefficient regularization. (English) Zbl 1479.62023

Summary: In this paper, we study the distributed learning algorithm and the distribution regression problem for coefficient regularization with Mercer kernels. Using a divide-and-conquer approach, we partition a data set into disjoint subsets, assign each subset to a different learning machine, and obtain the global estimator from the local estimators. By applying a second-order decomposition to the difference of operator inverses, together with properties of the trace operator, we show that under certain a priori conditions on the regression function, the distributed learning algorithm performs as well as the algorithm that processes the whole data set in a single batch. We also derive a learning rate for the distribution regression problem under the coefficient regularization scheme by similar operator methods, and we find that the learning scheme performs well when the regression function has stronger regularity. The analysis also reveals a deep relation between these two different problems.
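
The divide-and-conquer scheme described in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation: the Gaussian kernel, the l2 penalty on the coefficients and its normalization, the size-weighted averaging of local estimators, the synthetic data, and all function names (gaussian_kernel, fit_local, fit_distributed, and so on) are assumptions chosen for illustration.

```python
import numpy as np


def gaussian_kernel(a, b, sigma=0.2):
    """Gaussian (Mercer) kernel matrix between 1-D sample arrays a and b."""
    diff = a[:, None] - b[None, :]
    return np.exp(-diff**2 / (2.0 * sigma**2))


def fit_local(x, y, lam):
    """Coefficient-regularized least squares on one data subset.

    Assumed objective (the normalization is our choice, not from the paper):
        (1/m) * ||K a - y||^2 + lam * m * ||a||^2,
    whose minimizer solves (K^T K + lam * m^2 * I) a = K^T y.
    """
    m = len(x)
    K = gaussian_kernel(x, x)
    a = np.linalg.solve(K.T @ K + lam * m**2 * np.eye(m), K.T @ y)
    return x, a


def predict_local(model, x_new):
    """Evaluate f(x) = sum_i a_i K(x, x_i) for one local estimator."""
    centers, a = model
    return gaussian_kernel(x_new, centers) @ a


def fit_distributed(x, y, n_machines, lam, rng):
    """Split the sample into disjoint subsets and fit one estimator per subset."""
    parts = np.array_split(rng.permutation(len(x)), n_machines)
    return [fit_local(x[idx], y[idx], lam) for idx in parts]


def predict_distributed(models, x_new):
    """Global estimator: average of local estimators weighted by subset size."""
    sizes = np.array([len(centers) for centers, _ in models], dtype=float)
    preds = np.stack([predict_local(mdl, x_new) for mdl in models])
    return (sizes / sizes.sum()) @ preds


# Synthetic regression problem (hypothetical data, for illustration only).
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, 400)
y = np.sin(2.0 * np.pi * x) + 0.1 * rng.standard_normal(400)
x_test = np.linspace(0.05, 0.95, 50)
target = np.sin(2.0 * np.pi * x_test)

# Compare the single-batch estimator with the distributed one on 4 machines.
single = predict_local(fit_local(x, y, 1e-4), x_test)
models = fit_distributed(x, y, 4, 1e-4, rng)
distributed = predict_distributed(models, x_test)
mse_single = float(np.mean((single - target) ** 2))
mse_distributed = float(np.mean((distributed - target) ** 2))
```

Comparing mse_single with mse_distributed on such a toy problem gives an informal feel for the claim that the averaged estimator can match the single-batch one. The paper's result is a theoretical learning rate under regularity conditions, which this sketch does not verify.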

MSC:

62G08 Nonparametric regression and quantile regression
62R07 Statistical aspects of big data and data science
68W15 Distributed algorithms
