On the Schoenberg transformations in data analysis: theory and illustrations. (English) Zbl 1360.62313

Summary: The class of Schoenberg transformations, embedding Euclidean distances into higher dimensional Euclidean spaces, is presented, and derived from theorems on positive definite and conditionally negative definite matrices. Original results on the arc lengths, angles and curvature of the transformations are proposed, and visualized on artificial data sets by classical multidimensional scaling. A distance-based discriminant algorithm and a robust multidimensional centroid estimate illustrate the theory, closely connected to the Gaussian kernels of Machine Learning.
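The pipeline named in the summary (transform squared Euclidean distances, then visualize by classical multidimensional scaling) can be sketched as follows. This is an illustrative sketch, not the paper's code. It assumes uniform object weights and picks one particular transformation, φ(D) = 1 − exp(−λD), which is tied to the Gaussian kernel. The function names `squared_distances`, `schoenberg_exp` and `classical_mds`, the parameter `lam`, and the random test data are all hypothetical.

```python
import numpy as np

def squared_distances(X):
    """Pairwise squared Euclidean distances between the rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sum(diff ** 2, axis=-1)

def schoenberg_exp(D, lam=1.0):
    """One example transformation (assumed here): phi(D) = 1 - exp(-lam * D)."""
    return 1.0 - np.exp(-lam * D)

def classical_mds(D, k=2):
    """Classical MDS on squared distances D, with uniform weights (assumption)."""
    n = D.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    B = -0.5 * H @ D @ H                       # matrix of centered scalar products
    evals, evecs = np.linalg.eigh(B)           # eigenvalues in ascending order
    order = np.argsort(evals)[::-1]            # reorder to descending
    evals, evecs = evals[order], evecs[:, order]
    coords = evecs[:, :k] * np.sqrt(np.clip(evals[:k], 0.0, None))
    return coords, evals

# Artificial data: 30 points in the plane (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))

D = squared_distances(X)
D_tilde = schoenberg_exp(D, lam=0.5)
coords, evals = classical_mds(D_tilde, k=2)

# Non-negative eigenvalues (up to rounding) indicate that D_tilde is again a
# squared Euclidean distance matrix; more than two clearly positive eigenvalues
# indicate an embedding of higher dimension than the original plane.
print("smallest eigenvalue:", evals.min())
print("number of eigenvalues above 1e-8:", int(np.sum(evals > 1e-8)))
```

With this choice of φ, the centered scalar-product matrix equals half the doubly centered Gaussian kernel matrix, which is why its eigenvalues are non-negative.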

MSC:

62H30 Classification and discrimination; cluster analysis (statistical aspects)
62H25 Factor analysis and principal components; correspondence analysis

Software:

LIBRA

References:

[1] ALPAY, D., ATTIA, H., and LEVANONY, D. (2010), ”On the Characteristics of a Class of Gaussian Processes Within the White Noise Space Setting”, Stochastic Processes and their Applications, 120, 1074–1104. · Zbl 1197.60037 · doi:10.1016/j.spa.2010.03.004
[2] BAVAUD, F. (2006), ”Spectral Clustering and Multidimensional Scaling: A Unified View”, in Data Science and Classification, eds. V. Batagelj, H.-H. Bock, A. Ferligoj, and A. Ziberna, Heidelberg: Springer, pp. 131–139.
[3] BAVAUD, F. (2009), ”Aggregation Invariance in General Clustering Approaches”, Advances in Data Analysis and Classification, 3, 205–225. · Zbl 1306.62137 · doi:10.1007/s11634-009-0052-9
[4] BAVAUD, F. (2011), ”Robust Estimation of Location through Schoenberg Transformations”, submitted.
[5] BENZÉCRI, J.-P., and collaborators. (1973), ”L’analyse des données. 1 : La taxinomie. 2 : L’analyse des correspondances”, Paris: Dunod. · Zbl 0297.62038
[6] BERNSTEIN, S. (1929), ”Sur les fonctions absolument monotones”, Acta Mathematica, 52, 1–66. · JFM 55.0142.07 · doi:10.1007/BF02592679
[7] BERG, C., MATEU, J., and PORCU, E. (2008), ”The Dagum Family of Isotropic Correlation Functions”, Bernoulli, 14, 1134–1149. · Zbl 1158.60350 · doi:10.3150/08-BEJ139
[8] BHATIA, R. (2006), ”Infinitely Divisible Matrices”, The American Mathematical Monthly, 113, 221–235. · Zbl 1132.15019 · doi:10.2307/27641890
[9] BLUMENTHAL, L.M. (1953), ”Theory and Applications of Distance Geometry”, Oxford: Oxford University Press. · Zbl 0050.38502
[10] BORG, I., and GROENEN, P.J.F. (1997), Modern Multidimensional Scaling: Theory and Applications, Heidelberg: Springer. · Zbl 0862.62052
[11] CAMPBELL, N.A. (1980), ”Robust Procedures in Multivariate Analysis I : Robust Covariance Estimation”, Applied Statistics, 29, 231–237. · Zbl 0471.62047 · doi:10.2307/2346896
[12] CHEN, D., HE, Q., and WANG, X. (2007), ”On Linear Separability of Data Sets in Feature Space”, Neurocomputing, 70, 2441–2448. · doi:10.1016/j.neucom.2006.12.002
[13] CHRISTAKOS, G. (1984), ”On the Problem of Permissible Covariance and Variogram Models”, Water Resources Research, 20, 251–265. · doi:10.1029/WR020i002p00251
[14] CRITCHLEY, F., and FICHET, B. (1994), ”The Partial Order by Inclusion of the Principal Classes of Dissimilarity on a Finite Set, and Some of Their Basic Properties”, in: Lecture Notes in Statistics. Classification and Dissimilarity Analysis, ed. B. van Cutsem, Heidelberg: Springer, pp. 5–65. · Zbl 0847.62048
[15] CRISTIANINI, N., and SHAWE-TAYLOR, J. (2003), ”An Introduction to Support Vector Machines and Other Kernel-based Learning Methods”, Cambridge: Cambridge University Press. · Zbl 0994.68074
[16] CUADRAS, C.M., and FORTINA, J. (1996), ”Weighted Continuous Metric Scaling”, in Multidimensional Statistical Analysis and Theory of Random Matrices, eds. A.K. Gupta and V.L. Girko, The Netherlands: VSP, pp. 27–40.
[17] CUADRAS, C.M., FORTINA, J., and OLIVA, F. (1997), ”The Proximity of an Individual to a Population with Applications in Discriminant Analysis”, Journal of Classification, 14, 117–136. · Zbl 0891.62043 · doi:10.1007/s003579900006
[18] FISHER, R.A. (1936), ”The Use of Multiple Measurements in Taxonomic Problems”, Annals of Eugenics, 7, 179–188. · doi:10.1111/j.1469-1809.1936.tb02137.x
[19] FITZGERALD, C.H., and HORN, R.A. (1977), ”On Fractional Hadamard Powers of Positive Definite Matrices”, Journal of Mathematical Analysis and Applications, 61, 633–642. · Zbl 0406.15006 · doi:10.1016/0022-247X(77)90167-6
[20] GOWER, J.C. (1966), ”Some Distance Properties of Latent Root and Vector Methods Used in Multivariate Analysis”, Biometrika, 53, 325–338. · Zbl 0192.26003 · doi:10.1093/biomet/53.3-4.325
[21] GOWER, J.C. (1982), ”Euclidean Distance Geometry”, Mathematical Scientist, 7, 1–14. · Zbl 0492.51017
[22] GREENACRE, M., and BLASIUS, J. (2006), ”Multiple Correspondence Analysis and Related Methods”, London: Chapman & Hall. · Zbl 1277.62156
[23] HAMPEL, F.R., RONCHETTI, E.M., ROUSSEEUW, P.J., and STAHEL, W.A. (1986), ”Robust Statistics – The Approach Based on Influence Functions”, New York: Wiley. · Zbl 0593.62027
[24] HÄRDLE, W. (1991), ”Smoothing Techniques with Implementation in S”, Heidelberg: Springer. · Zbl 0716.62040
[25] HAUSSLER, D. (1999), ”Convolution Kernels on Discrete Structures”, Technical Report, UCSC-CRL-99-10, University of California at Santa Cruz.
[26] HOFMANN, T., SCHÖLKOPF, B., and SMOLA, A.J. (2008), ”Kernel Methods in Machine Learning”, Annals of Statistics, 36, 1171–1220. · Zbl 1151.30007 · doi:10.1214/009053607000000677
[27] HORN, R.A., and JOHNSON, C.R. (1991), ”Topics in Matrix Analysis”, Cambridge: Cambridge University Press. · Zbl 0729.15001
[28] JOLY, S., and LE CALVE, G. (1986), ”Etude des puissances d’une distance”, Statistique et Analyse des Données, 11, 29–50. · Zbl 0636.62060
[29] KOLMOGOROV, A.N. (1940), ”The Wiener Helix and Other Interesting Curves in the Hilbert Space”, Doklady Akademii Nauk SSSR, 26, 115–118.
[30] LEBART, L., MORINEAU, A., and PIRON, M. (1998), Statistique exploratoire multidimensionnelle, Paris: Dunod. · Zbl 0920.62077
[31] MARDIA, K.V., KENT, J.T., and BIBBY, J.M. (1979), Multivariate Analysis, New York: Academic Press. · Zbl 0432.62029
[32] MEULMAN, J.J., VAN DER KOOIJ, A., and HEISER, W.J. (2004), ”Principal Components Analysis with Nonlinear Optimal Scaling Transformations for Ordinal and Nominal Data”, in The Sage Handbook of Quantitative Methodology for the Social Sciences, ed. D. Kaplan, pp. 49–70.
[33] MICCHELLI, C. (1986), ”Interpolation of Scattered Data: Distance Matrices and Conditionally Positive Definite Functions”, Constructive Approximation, 2, 11–22. · Zbl 0625.41005 · doi:10.1007/BF01893414
[34] VON NEUMANN, J., and SCHOENBERG, I.J. (1941), ”Fourier Integrals and Metric Geometry”, Transactions of the American Mathematical Society, 50, 226–251. · Zbl 0028.41002 · doi:10.1090/S0002-9947-1941-0004644-8
[35] RAO, C.R. (1964), ”The Use and Interpretation of Principal Component Analysis in Applied Research”, Sankhya A, 26, 329–358. · Zbl 0137.37207
[36] SCHILLING, R., SONG, R., and VONDRAČEK, Z. (2010), Bernstein Functions: Theory and Applications, Studies in Mathematics, 37, Berlin: de Gruyter.
[37] SCHOENBERG, I.J. (1937), ”On Certain Metric Spaces Arising From Euclidean Spaces by a Change of Metric and Their Imbedding in Hilbert Space”, The Annals of Mathematics, 38, 787–793. · Zbl 0017.36101 · doi:10.2307/1968835
[38] SCHOENBERG, I.J. (1938a), ”Metric Spaces and Completely Monotone Functions”, The Annals of Mathematics, 39, 811–841. · JFM 64.0617.03 · doi:10.2307/1968466
[39] SCHOENBERG, I.J. (1938b), ”Metric Spaces and Positive Definite Functions”, Transactions of the American Mathematical Society, 44, 522–536. · Zbl 0019.41502 · doi:10.1090/S0002-9947-1938-1501980-0
[40] SCHÖLKOPF, B. (2000), ”The Kernel Trick for Distances”, Advances in Neural Information Processing Systems, 13, 301–307.
[41] STEIN, M.L. (1999), ”Interpolation of Spatial Data: Some Theory for Kriging”, Heidelberg: Springer. · Zbl 0924.62100
[42] TORGERSON, W.S. (1958), Theory and Methods of Scaling, New York: Wiley.
[43] VAPNIK, V.N. (1995), The Nature of Statistical Learning Theory, Heidelberg: Springer.
[44] VERBOVEN, S., and HUBERT, M. (2005), ”LIBRA: a MATLAB library for Robust Analysis”, Chemometrics and Intelligent Laboratory Systems, 75, 127–136. · doi:10.1016/j.chemolab.2004.06.003
[45] YOUNG, G., and HOUSEHOLDER, A.S. (1938), ”Discussion of a Set of Points in Terms of Their Mutual Distances”, Psychometrika, 3, 19–22. · JFM 64.1302.04 · doi:10.1007/BF02287916
[46] WILLIAMS, C.K.I. (2002), ”On a Connection Between Kernel PCA and Metric Multidimensional Scaling”, Machine Learning, 46, 11–19. · Zbl 1052.68118 · doi:10.1023/A:1012485807823
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.