×

Clustering in Hilbert’s projective geometry: the case studies of the probability simplex and the elliptope of correlation matrices. (English) Zbl 1418.53022

Nielsen, Frank (ed.), Geometric structures of information. Proceedings of the conference on geometric science of information, GSI 2017. Cham: Springer. Signals Commun. Technol., 297-331 (2019).
Summary: Clustering categorical distributions in the probability simplex is a fundamental task met in many applications dealing with normalized histograms. Traditionally, differential-geometric structures of the probability simplex have been used either by (i) setting the Riemannian metric tensor to the Fisher information matrix of the categorical distributions, or (ii) defining the dualistic information-geometric structure induced by a smooth dissimilarity measure, the Kullback-Leibler divergence. In this work, we introduce for this clustering task a novel computationally-friendly framework for modeling the probability simplex termed Hilbert simplex geometry. In the Hilbert simplex geometry, the distance function is described by a polytope. We discuss the pros and cons of those different statistical modelings, and benchmark experimentally these geometries for center-based \(k\)-means and \(k\)-center clusterings. Furthermore, since a canonical Hilbert metric distance can be defined on any bounded convex subset of the Euclidean space, we also consider Hilbert’s projective geometry of the elliptope of correlation matrices and study its clustering performances.
For the entire collection see [Zbl 1407.62027].

MSC:

53B20 Local Riemannian geometry
60E05 Probability distributions: general theory
51A05 General theory of linear incidence geometry and projective geometries
51M35 Synthetic treatment of fundamental manifolds in projective geometries (Grassmannians, Veronesians and their generalizations)

Software:

PMTK

References:

[1] Agresti, A.: Categorical Data Analysis, vol. 482. Wiley, New Jercy (2003) · Zbl 0716.62001
[2] Aggarwal, C.C., Zhai, C.X.: Mining Text Data. Springer Publishing Company, Berlin (2012) · doi:10.1007/978-1-4614-3223-4
[3] Messing, R., Pal, C., Kautz, H.: Activity recognition using the velocity histories of tracked keypoints. In: International Conference on Computer Vision, pp. 104-111. IEEE (2009)
[4] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. The MIT Press, Cambridge (2012) · Zbl 1295.68003
[5] Chaudhuri, K., McGregor, A.: Finding metric structure in information theoretic clustering. In: Conference on Learning Theory (COLT), pp. 391-402 (2008)
[6] Lebanon, G.: Learning Riemannian metrics. In: Conference on Uncertainty in Artificial Intelligence (UAI), pp. 362-369 (2002)
[7] Rigouste, L., Cappé, O., Yvon, F.: Inference and evaluation of the multinomial mixture model for text clustering. Inf. Process. Manag. 43(5), 1260-1280 (2007) · doi:10.1016/j.ipm.2006.11.001
[8] Huang, Z.: Extensions to the \(k\)-means algorithm for clustering large data sets with categorical values. Data Min. Knowl. Discov. 2(3), 283-304 (1998) · doi:10.1023/A:1009769707641
[9] Arthur, D., Vassilvitskii, \(S.: k\)-means++: the advantages of careful seeding. In: ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 1027-1035 (2007) · Zbl 1302.68273
[10] Gonzalez, T.F.: Clustering to minimize the maximum intercluster distance. Theor. Comput. Sci. 38, 293-306 (1985) · Zbl 0567.62048 · doi:10.1016/0304-3975(85)90224-5
[11] Tropp, J.A.: Simplicial faces of the set of correlation matrices. Discret. Comput. Geom. 60(2), 512-529 (2018) · Zbl 1400.15035 · doi:10.1007/s00454-017-9961-0
[12] Kass, R.E., Vos, P.W.: Geometrical Foundations of Asymptotic Inference. Wiley Series in Probability and Statistics. Wiley-Interscience, New Jercy (1997) · Zbl 0880.62005 · doi:10.1002/9781118165980
[13] Hotelling, H.: Spaces of statistical parameters. Bull. Amer. Math. Soc. 36, 191 (1930) · JFM 56.0463.24
[14] Rao, C.R.: Information and accuracy attainable in the estimation of statistical parameters. Bull. Calcutta Math. Soc. 37(3), 81-91 (1945) · Zbl 0063.06420
[15] Rao, C.R.: Information and the accuracy attainable in the estimation of statistical parameters. Breakthroughs in Statistics, pp. 235-247. Springer, New York (1992) · doi:10.1007/978-1-4612-0919-5_16
[16] Stigler, S.M.: The epic story of maximum likelihood. Stat. Sci. 22(4), 598-620 (2007) · Zbl 1246.01016 · doi:10.1214/07-STS249
[17] Amari, Si: Information Geometry and Its Applications. Applied Mathematical Sciences, vol. 194. Springer, Japan (2016) · Zbl 1350.94001
[18] Calin, O., Udriste, C.: Geometric Modeling in Probability and Statistics. Mathematics and Statistics. Springer International Publishing, New York (2014) · Zbl 1325.60001 · doi:10.1007/978-3-319-07779-6
[19] Amari, Si, Cichocki, A.: Information geometry of divergence functions. Bull. Pol. Acad. Sci.: Tech. Sci. 58(1), 183-195 (2010)
[20] Shima, H.: The Geometry of Hessian Structures. World Scientific, Singapore (2007) · Zbl 1244.53004 · doi:10.1142/6241
[21] Liang, X.: A note on divergences. Neural Comput. 28(10), 2045-2062 (2016) · Zbl 1474.68257 · doi:10.1162/NECO_a_00878
[22] Jenssen, R., Principe, J.C., Erdogmus, D., Eltoft, T.: The Cauchy-Schwarz divergence and Parzen windowing: connections to graph theory and mercer kernels. J. Frankl. Inst. 343(6), 614-629 (2006) · Zbl 1105.93001 · doi:10.1016/j.jfranklin.2006.03.018
[23] Hilbert, D.: Über die gerade linie als kürzeste verbindung zweier punkte. Mathematische Annalen 46(1), 91-96 (1895) · JFM 26.0540.02 · doi:10.1007/BF02096204
[24] Busemann, H.: The Geometry of Geodesics. Pure and Applied Mathematics, vol. 6. Elsevier Science, Amsterdam (1955) · Zbl 0112.37002
[25] de la Harpe, P.: On Hilbert’s metric for simplices. Geometric Group Theory, vol. 1, pp. 97-118. Cambridge University Press, Cambridge (1991) · Zbl 0832.52002
[26] Lemmens, B., Nussbaum, R.: Birkhoff’s version of Hilbert’s metric and its applications in analysis. Handbook of Hilbert Geometry, pp. 275-303 (2014)
[27] Richter-Gebert, J.: Perspectives on Projective Geometry: A Guided Tour Through Real and Complex Geometry. Springer, Berlin (2011) · Zbl 1214.51001 · doi:10.1007/978-3-642-17286-1
[28] Bi, Y., Fan, B., Wu, F.: Beyond Mahalanobis metric: Cayley-Klein metric learning. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2339-2347 (2015)
[29] Nielsen, F., Muzellec, B., Nock, R.: Classification with mixtures of curved Mahalanobis metrics. In: IEEE International Conference on Image Processing (ICIP), pp. 241-245 (2016)
[30] Nielsen, F., Muzellec, B., Nock, R.: Large margin nearest neighbor classification using curved Mahalanobis distances (2016). arXiv:1609.07082 [cs.LG]
[31] Stillwell, J.: Ideal elements in Hilbert’s geometry. Perspect. Sci. 22(1), 35-55 (2014) · Zbl 1295.51002 · doi:10.1162/POSC_a_00117
[32] Bernig, A.: Hilbert geometry of polytopes. Archiv der Mathematik 92(4), 314-324 (2009) · Zbl 1171.53046 · doi:10.1007/s00013-009-3142-1
[33] Nielsen, F., Sun, K.: Clustering in Hilbert simplex geometry. CoRR arXiv: abs/1704.00454 (2017)
[34] Nielsen, F., Shao, L.: On balls in a polygonal Hilbert geometry. In: 33st International Symposium on Computational Geometry (SoCG 2017). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik (2017) · Zbl 1436.51004
[35] Laidlaw, D.H., Weickert, J.: Visualization and Processing of Tensor Fields: Advances and Perspectives. Mathematics and Visualization. Springer, Berlin (2009) · Zbl 1159.68003 · doi:10.1007/978-3-540-88378-4
[36] Lemmens, B., Walsh, C.: Isometries of polyhedral Hilbert geometries. J. Topol. Anal. 3(02), 213-241 (2011) · Zbl 1220.53090 · doi:10.1142/S1793525311000520
[37] Condat, L.: Fast projection onto the simplex and the \(\ell_1\) ball. Math. Program. 158(1-2), 575-585 (2016) · Zbl 1347.49050 · doi:10.1007/s10107-015-0946-6
[38] Park, P.S.: Regular polytopic distances. Forum Geom. 16, 227-232 (2016) · Zbl 1341.51041
[39] Boissonnat, J.D., Sharir, M., Tagansky, B., Yvinec, M.: Voronoi diagrams in higher dimensions under certain polyhedral distance functions. Discret. Comput. Geom. 19(4), 485-519 (1998) · Zbl 0897.68113 · doi:10.1007/PL00009366
[40] Bengtsson, I., Zyczkowski, K.: Geometry of Quantum States: An Introduction to Quantum Entanglement. Cambridge University Press, Cambridge (2017) · Zbl 1392.81005 · doi:10.1017/9781139207010
[41] Nielsen, F.: Cramér-Rao lower bound and information geometry. Connected at Infinity II, pp. 18-37. Springer, Berlin (2013) · Zbl 1296.62015 · doi:10.1007/978-93-86279-56-9_2
[42] Chapman, D.G.: Minimum variance estimation without regularity assumptions. Ann. Math. Stat. 22(4), 581-586 (1951) · Zbl 0044.34302 · doi:10.1214/aoms/1177729548
[43] Hammersley, H.: On estimating restricted parameters. J. R. Stat. Society. Ser. B (Methodol.) 12(2), 192-240 (1950) · Zbl 0040.22202
[44] Nielsen, F., Sun, K.: On Hölder projective divergences. Entropy 19(3), 122 (2017) · doi:10.3390/e19030122
[45] Nielsen, F., Nock, R.: Further heuristics for \(k\)-means: the merge-and-split heuristic and the \((k,l)\)-means. arXiv:1406.6314 (2014)
[46] Bachem, O., Lucic, M., Hassani, S.H., Krause, A.: Approximate \(k\)-means++ in sublinear time. In: Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, pp. 1459-1467 (2016)
[47] Nielsen, F., Nock, R.: Total Jensen divergences: definition, properties and \(k\)-means++ clustering (2013). arXiv:1309.7109 [cs.IT]
[48] Ackermann, M.R., Blömer, J.: Bregman clustering for separable instances. Scandinavian Workshop on Algorithm Theory, pp. 212-223. Springer, Berlin (2010) · Zbl 1285.68212
[49] Manthey, B., Röglin, H.: Worst-case and smoothed analysis of \(k\)-means clustering with Bregman divergences. J. Comput. Geom. 4(1), 94-132 (2013) · Zbl 1404.68197
[50] Endo, Y., Miyamoto, S.: Spherical \(k\)-means++ clustering. Modeling Decisions for Artificial Intelligence, pp. 103-114. Springer, Berlin (2015) · Zbl 1365.62236 · doi:10.1007/978-3-319-23240-9_9
[51] Nielsen, F., Nock, R., Amari, Si: On clustering histograms with \(k\)-means by using mixed \(\alpha \)-divergences. Entropy 16(6), 3273-3301 (2014) · Zbl 1341.62186 · doi:10.3390/e16063273
[52] Brandenberg, R., König, S.: No dimension-independent core-sets for containment under homothetics. Discret. Comput. Geom. 49(1), 3-21 (2013) · Zbl 1258.68168 · doi:10.1007/s00454-012-9462-0
[53] Panigrahy, R.: Minimum enclosing polytope in high dimensions (2004). arXiv:cs/0407020 [cs.CG]
[54] Saha, A., Vishwanathan, S., Zhang, X.: New approximation algorithms for minimum enclosing convex shapes. In: ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 1146-1160 (2011) · Zbl 1377.68325 · doi:10.1137/1.9781611973082.86
[55] Nielsen, F., Nock, R.: On the smallest enclosing information disk. Inf. Process. Lett. 105(3), 93-97 (2008) · Zbl 1184.68568 · doi:10.1016/j.ipl.2007.08.007
[56] Sharir, M., Welzl, E.: A combinatorial bound for linear programming and related problems. STACS 92, 567-579 (1992) · Zbl 1494.68276
[57] Welzl, E.: Smallest enclosing disks (balls and ellipsoids). New Results and New trends in Computer Science, pp. 359-370. Springer, Berlin (1991) · doi:10.1007/BFb0038202
[58] Nielsen, F., Nock, R.: Approximating smallest enclosing balls with applications to machine learning. Int. J. Comput. Geom. Appl. 19(05), 389-414 (2009) · Zbl 1192.65026 · doi:10.1142/S0218195909003039
[59] Arnaudon, M., Nielsen, F.: On approximating the Riemannian \(1\)-center. Comput. Geom. 46(1), 93-104 (2013) · Zbl 1259.65036 · doi:10.1016/j.comgeo.2012.04.007
[60] Bâdoiu, M., Clarkson, K.L.: Smaller core-sets for balls. In: ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 801-802 (2003) · Zbl 1092.68660 · doi:https://scholar.google.com/scholar?q=B%C3%A2doiu%2C M.%2C Clarkson%2C K.L.: Smaller core-sets for balls. In: ACM-SIAM Symposium on Discrete Algorithms (SODA)%2C pp. 801%E2%80%93802 (2003)
[61] Nielsen, F., Hadjeres, G.: Approximating covering and minimum enclosing balls in hyperbolic geometry. International Conference on Networked Geometric Science of Information, pp. 586-594. Springer, Cham (2015) · Zbl 1407.68552 · doi:10.1007/978-3-319-25040-3_63
[62] Bădoiu, M., Clarkson, K.L.: Optimal core-sets for balls. Comput. Geom. 40(1), 14-22 (2008) · Zbl 1138.68056 · doi:http://scholar.google.com/scholar_lookup?title=Optimal core-sets for balls&author=M. B%C4%83doiu&author=KL. Clarkson&journal=Comput. Geom.&volume=40&issue=1&pages=14-22&publication_year=2008
[63] Bachem, O., Lucic, M., Krause, A.: Scalable and distributed clustering via lightweight coresets (2017). arXiv:1702.08248 [stat.ML]
[64] Nielsen, F., Nock, R.: On approximating the smallest enclosing Bregman balls. In: Proceedings of the Twenty-Second Annual Symposium on Computational Geometry, pp. 485-486. ACM (2006) · Zbl 1153.68541
[65] Nock, R., Nielsen, F.: Fitting the smallest enclosing Bregman ball. ECML, pp. 649-656. Springer, Berlin (2005)
[66] Deza, M., Sikirić, M.D.: Voronoi polytopes for polyhedral norms on lattices. Discret. Appl. Math. 197, 42-52 (2015) · Zbl 1342.52015 · doi:10.1016/j.dam.2014.09.007
[67] Körner, M.C.: Minisum hyperspheres, Springer Optimization and Its Applications, vol. 51. Springer, New York (2011) · Zbl 1248.49001 · doi:10.1007/978-1-4419-9807-1
[68] Reem, D.: The geometric stability of Voronoi diagrams in normed spaces which are not uniformly convex (2012). arXiv:1212.1094 [cs.CG]
[69] Foertsch, T., Karlsson, A.: Hilbert metrics and Minkowski norms. J. Geom. 83(1-2), 22-31 (2005) · Zbl 1084.52008 · doi:10.1007/s00022-005-0005-1
[70] Cencov, N.N.: Statistical Decision Rules and Optimal Inference. Translations of Mathematical Monographs, vol. 53. American Mathematical Society, Providence (2000)
[71] Cena, A.: Geometric structures on the non-parametric statistical manifold. Ph.D. thesis, University of Milano (2002)
[72] Shen, Z.: Riemann-Finsler geometry with applications to information geometry. Chin. Ann. Math. Ser. B 27(1), 73-94 (2006) · Zbl 1107.53013 · doi:10.1007/s11401-005-0333-3
[73] Khosravifard, M., Fooladivanda, D., Gulliver, T.A.: Confliction of the convexity and metric properties in \(f\)-divergences. IEICE Trans. Fundam. Electron. Commun. Comput. Sci. 90(9), 1848-1853 (2007) · doi:10.1093/ietfec/e90-a.9.1848
[74] Dowty, J.G.: Chentsov’s theorem for exponential families (2017). arXiv:1701.08895 [math.ST] · Zbl 1409.62021
[75] Doup, T.M.: Simplicial Algorithms on the Simplotope, vol. 318. Springer Science & Business Media, Berlin (2012) · Zbl 0697.90046
[76] Vernicos, C.: Introduction aux géométries de Hilbert. Séminaire de théorie spectrale et géométrie 23, 145-168 (2004) · Zbl 1100.53031 · doi:10.5802/tsg.236
[77] Arnaudon, M., Nielsen, F.: Medians and means in Finsler geometry. LMS J. Comput. Math. 15, 23-37 (2012) · Zbl 1293.53025 · doi:10.1112/S1461157010000513
[78] Papadopoulos, A., Troyanov, M.: From Funk to Hilbert geometry (2014)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.