
An asymptotic statistical analysis of support vector machines with soft margins. (English) Zbl 1071.68538

Summary: The generalization properties of Support Vector Machines (SVMs) are examined. From a geometrical point of view, the parameter estimated by an SVM is the point of the convex hull formed by the given examples that lies nearest the origin. Since introducing soft margins is equivalent to shrinking this convex hull, an SVM with soft margins has a learning curve different from that of the hard-margin machine. In this paper we derive the asymptotic average generalization error of SVMs with soft margins in a simple setting, namely when the input dimension is one, and show quantitatively that soft margins increase the generalization error.
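The following is a minimal numerical sketch (not taken from the paper) of the geometric picture in the summary: the hard-margin solution corresponds to the point of the convex hull of the examples nearest the origin, while a soft margin caps each convex coefficient at some mu < 1, restricting the search to the reduced convex hull. The sample points, the cap mu, and the Frank-Wolfe solver below are illustrative assumptions, not the paper's construction.

```python
# Sketch: nearest point to the origin in the (reduced) convex hull of examples.
# Assumptions: random 2-D points standing in for the mapped examples, a cap mu
# on the convex coefficients (mu = 1 recovers the full hull), and a plain
# Frank-Wolfe iteration as the solver.
import numpy as np

rng = np.random.default_rng(0)
Z = rng.normal(loc=2.0, scale=1.0, size=(20, 2))   # examples; hull lies away from the origin


def min_norm_in_reduced_hull(Z, mu, iters=2000):
    """Frank-Wolfe for min ||sum_i a_i z_i||^2  s.t.  sum(a) = 1, 0 <= a_i <= mu."""
    n = Z.shape[0]
    a = np.full(n, 1.0 / n)                        # feasible start (requires mu >= 1/n)
    for t in range(iters):
        grad = 2.0 * Z @ (Z.T @ a)                 # gradient of ||Z^T a||^2 w.r.t. a
        # Linear minimization over the capped simplex: greedily place mass mu
        # on the coordinates with the smallest gradient entries.
        s = np.zeros(n)
        remaining = 1.0
        for i in np.argsort(grad):
            s[i] = min(mu, remaining)
            remaining -= s[i]
            if remaining <= 1e-12:
                break
        a += (2.0 / (t + 2.0)) * (s - a)           # standard Frank-Wolfe step size
    return Z.T @ a                                 # nearest point w = sum_i a_i z_i


for mu in (1.0, 0.5, 0.1):                         # mu = 1: full hull; mu < 1: reduced hull
    w = min_norm_in_reduced_hull(Z, mu)
    print(f"mu = {mu:4.2f}  nearest point {np.round(w, 3)}  ||w|| = {np.linalg.norm(w):.3f}")
```

As mu decreases, the reduced hull contracts toward the sample mean and the nearest-point solution changes accordingly, which is the mechanism behind the different learning curve discussed in the summary.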

MSC:

68T05 Learning and adaptive systems in artificial intelligence
