Quantile regression with \(\ell_1\)-regularization and Gaussian kernels. (English) Zbl 1302.62151

Summary: Quantile regression is studied through learning schemes based on \(\ell_1\)-regularization and Gaussian kernels. The purpose of this paper is to present concentration estimates for these algorithms. Our analysis shows that the convergence behavior of \(\ell_1\)-quantile regression with Gaussian kernels is almost the same as that of RKHS-based learning schemes. Furthermore, previous analyses of kernel-based quantile regression usually require the output sample values to be uniformly bounded, which excludes the common case of Gaussian noise. The error analysis presented in this paper gives satisfactory convergence rates even for unbounded sampling processes. Numerical experiments supporting the theoretical results are also given.
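
As a rough illustration of the kind of scheme the summary describes, the sketch below evaluates one common form of an \(\ell_1\)-regularized kernel quantile regression objective: the empirical pinball loss of \(f(x) = \sum_j \alpha_j K_\sigma(x, x_j)\) plus \(\lambda \|\alpha\|_1\). This is an assumption made for illustration, not the paper's stated algorithm. The kernel normalization \(\exp(-|x-t|^2/\sigma^2)\), the scalar inputs, and all function and parameter names (`pinball_loss`, `gaussian_kernel`, `predict`, `objective`, `tau`, `lam`, `sigma`) are hypothetical choices, and no solver is included.

```python
import math


def pinball_loss(residual, tau):
    """Pinball (check) loss for a residual r = y - f(x) at quantile level tau."""
    # Positive residuals are weighted by tau, negative ones by (1 - tau).
    return tau * residual if residual >= 0 else (tau - 1.0) * residual


def gaussian_kernel(x, t, sigma):
    """Gaussian kernel for scalar inputs (normalization is an assumption)."""
    return math.exp(-((x - t) ** 2) / sigma ** 2)


def predict(alpha, xs, x, sigma):
    """Evaluate f(x) = sum_j alpha_j * K_sigma(x, x_j) over the sample points xs."""
    return sum(a * gaussian_kernel(x, xj, sigma) for a, xj in zip(alpha, xs))


def objective(alpha, xs, ys, tau, lam, sigma):
    """Empirical pinball risk plus an l1 penalty on the coefficients."""
    m = len(xs)
    risk = sum(
        pinball_loss(y - predict(alpha, xs, x, sigma), tau)
        for x, y in zip(xs, ys)
    ) / m
    return risk + lam * sum(abs(a) for a in alpha)
```

In this form the \(\ell_1\) penalty acts on the coefficient vector rather than on an RKHS norm, which is what tends to produce sparse expansions; minimizing the objective (for example by linear programming) is left out of the sketch.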

MSC:

62J02 General nonlinear regression
60E15 Inequalities; stochastic orderings
68T05 Learning and adaptive systems in artificial intelligence