Matrix versions of the Hellinger distance. (English) Zbl 1420.15016

Lett. Math. Phys. 109, No. 8, 1777-1804 (2019); correction ibid. 109, No. 12, 2779-2781 (2019).
Let \((p_{1},p_{2},\dots,p_{n})\) and \((q_{1},q_{2},\dots,q_{n})\) be two probability distributions. The Hellinger distance between them is defined to be \(\left\{ \sum_{i}\left(\frac{1}{2}(p_{i}+q_{i})-\sqrt{p_{i}q_{i}}\right)\right\} ^{1/2}\). In terms of the diagonal matrices \(P:=\mathrm{diag}(p_{1},p_{2},\dots,p_{n})\) and \(Q:=\mathrm{diag}(q_{1},q_{2},\dots,q_{n})\), this can be written \[ d_{H}(P,Q):=\sqrt{\operatorname{tr}\mathcal{A}(P,Q)-\operatorname{tr}\mathcal{G}(P,Q)}, \] where \(\mathcal{A}\) and \(\mathcal{G}\) denote the arithmetic and geometric means of \(P\) and \(Q\), respectively. The goal of the present paper is to examine some extensions of this definition to general \(n\times n\) complex positive semidefinite matrices.
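As a quick check of the diagonal case (a sketch, assuming \(\mathcal{A}(P,Q)=\frac{1}{2}(P+Q)\) and, since diagonal matrices commute, \(\mathcal{G}(P,Q)=P^{1/2}Q^{1/2}\)): \[ \operatorname{tr}\mathcal{A}(P,Q)-\operatorname{tr}\mathcal{G}(P,Q)=\sum_{i}\left(\tfrac{1}{2}(p_{i}+q_{i})-\sqrt{p_{i}q_{i}}\right)=\tfrac{1}{2}\sum_{i}\left(\sqrt{p_{i}}-\sqrt{q_{i}}\right)^{2}. \] Under these assumptions \(d_{H}(P,Q)\) is \(1/\sqrt{2}\) times the Frobenius norm of \(P^{1/2}-Q^{1/2}\), so it vanishes exactly when the two distributions coincide.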
Although there is a unique natural way to define \(\mathcal{A}\), there is more than one way to define the square root of a product of positive semidefinite matrices, and hence more than one way to define \(\mathcal{G}\). Let \(A\) and \(B\) be arbitrary positive semidefinite matrices and let \(A^{1/2}\) and \(B^{1/2}\) be their (unique) positive semidefinite square roots. Write \(\left\Vert \ \right\Vert _{2}\) for the Frobenius norm and \(\mathbb{P}\) for the set of \(n\times n\) positive definite matrices. Then the following functions are considered: \(d_{1}(A,B):=\left\Vert A^{1/2}-B^{1/2}\right\Vert _{2}=\left\{ \operatorname{tr}(A+B)-2\operatorname{tr}A^{1/2}B^{1/2}\right\} ^{1/2}\); \(d_{2}(A,B):=\left\{ \operatorname{tr}(A+B)-2\operatorname{tr}(A^{1/2}BA^{1/2})^{1/2}\right\} ^{1/2}\); \(d_{3}(A,B):=\left\{ \operatorname{tr}(A+B)-2\operatorname{tr}A\#B\right\} ^{1/2}\), where \(A\#B:=A^{1/2}(A^{-1/2}BA^{-1/2})^{1/2}A^{1/2}\) (for positive definite \(A\)); and \(d_{4}(A,B):=\left\{ \operatorname{tr}(A+B)-2\operatorname{tr}\mathcal{L}(A,B)\right\} ^{1/2}\), where \(\mathcal{L}(A,B):=\exp\left( \frac{1}{2}(\log A+\log B)\right) \) (defined only for strictly positive definite matrices). The functions \(d_{1}\) and \(d_{2}\) define metrics (\(d_{2}\) is sometimes called the Bures distance or Wasserstein metric), but \(d_{3}\) and \(d_{4}\) fail to satisfy the triangle inequality and so do not define metrics. The main results of this paper concern the functions \(\Phi_{k}(A,B):=d_{k}(A,B)^{2}\) for \(k=3\) and \(4\). In particular, it is shown that \(\Phi_{3}\) and \(\Phi_{4}\) are divergence functions (see [S.-i. Amari, Information geometry and its applications. Tokyo: Springer (2016; Zbl 1350.94001)]) and have useful convexity properties such as (Theorem 8): for each \(A\in\mathbb{P}\) the function \(X\longmapsto\Phi_{4}(A,X)\) is strictly convex on \(\mathbb{P}\).
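The four candidate definitions can be compared numerically. The following is only an illustrative sketch, not code from the paper: it is restricted to real symmetric, strictly positive definite \(2\times 2\) matrices so that the standard library suffices, the helper names `apply_fn`, `psd_sqrt`, `mul`, `tr` and `squared_distances` are hypothetical, and each squared distance is taken as \(\operatorname{tr}(A+B)\) minus twice the trace of the corresponding geometric mean (the normalisation under which \(d_{1}\) is the Frobenius norm of \(A^{1/2}-B^{1/2}\) and \(d_{2}\) is the Bures-Wasserstein distance).

```python
import math

def apply_fn(M, f):
    """Return f(M) for a real symmetric 2x2 matrix M via its eigen-decomposition."""
    a, b, c = M[0][0], 0.5 * (M[0][1] + M[1][0]), M[1][1]
    mid, rad = 0.5 * (a + c), math.hypot(0.5 * (a - c), b)
    theta = 0.5 * math.atan2(2.0 * b, a - c)   # angle of the top eigenvector
    cs, sn = math.cos(theta), math.sin(theta)
    hi, lo = f(mid + rad), f(mid - rad)        # f applied to the two eigenvalues
    off = (hi - lo) * cs * sn
    return [[hi * cs * cs + lo * sn * sn, off],
            [off, hi * sn * sn + lo * cs * cs]]

def psd_sqrt(x):
    """Scalar square root, clamping tiny negative rounding errors to zero."""
    return math.sqrt(max(x, 0.0))

def mul(X, Y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def tr(X):
    """Trace of a 2x2 matrix."""
    return X[0][0] + X[1][1]

def squared_distances(A, B):
    """Return (d_1^2, d_2^2, d_3^2, d_4^2) for positive definite 2x2 A and B."""
    rA, rB = apply_fn(A, psd_sqrt), apply_fn(B, psd_sqrt)
    irA = apply_fn(A, lambda x: x ** -0.5)             # A^{-1/2}
    g1 = tr(mul(rA, rB))                               # tr A^{1/2} B^{1/2}
    g2 = tr(apply_fn(mul(mul(rA, B), rA), psd_sqrt))   # tr (A^{1/2} B A^{1/2})^{1/2}
    inner = apply_fn(mul(mul(irA, B), irA), psd_sqrt)
    g3 = tr(mul(mul(rA, inner), rA))                   # tr A # B
    lA, lB = apply_fn(A, math.log), apply_fn(B, math.log)
    half = [[0.5 * (lA[i][j] + lB[i][j]) for j in range(2)] for i in range(2)]
    g4 = tr(apply_fn(half, math.exp))                  # tr exp((log A + log B) / 2)
    base = tr(A) + tr(B)
    return tuple(base - 2.0 * g for g in (g1, g2, g3, g4))

# A non-commuting pair: the four values are in general different.
A = [[2.0, 1.0], [1.0, 2.0]]
B = [[1.0, 0.0], [0.0, 3.0]]
print(squared_distances(A, B))
```

For commuting inputs (for example two diagonal matrices) the four geometric means coincide, so the four returned values agree up to rounding; for a non-commuting pair such as the one printed above they generally differ.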

MSC:

15A60 Norms of matrices, numerical range, applications of functional analysis to matrix theory
51K05 General theory of distance geometry
15B48 Positive matrices and their generalizations; cones of matrices
49K35 Optimality conditions for minimax problems
94A17 Measures of information, entropy
81P45 Quantum information, communication, networks (quantum-theoretic aspects)

Citations:

Zbl 1350.94001

References:

[1] Abatzoglou, T.J.: Norm derivatives on spaces of operators. Math. Ann. 239, 129-135 (1979) · Zbl 0398.47013 · doi:10.1007/BF01420370
[2] Agueh, M., Carlier, G.: Barycenters in the Wasserstein space. SIAM J. Math. Anal. 43, 904-924 (2011) · Zbl 1223.49045 · doi:10.1137/100805741
[3] Aiken, J.G., Erdos, J.A., Goldstein, J.A.: Unitary approximation of positive operators. Illinois J. Math. 24, 61-72 (1980) · Zbl 0404.47014 · doi:10.1215/ijm/1256047797
[4] Amari, S.: Information Geometry and its Applications. Springer, Tokyo (2016) · Zbl 1350.94001 · doi:10.1007/978-4-431-55978-8
[5] Ando, T.: Concavity of certain maps on positive definite matrices and applications to Hadamard products. Linear Algebra Appl. 26, 203-241 (1979) · Zbl 0495.15018 · doi:10.1016/0024-3795(79)90179-4
[6] Ando, T., Li, C.-K., Mathias, R.: Geometric means. Linear Algebra Appl. 385, 305-334 (2004) · Zbl 1063.47013 · doi:10.1016/j.laa.2003.11.019
[7] Arsigny, V., Fillard, P., Pennec, X., Ayache, N.: Geometric means in a novel vector space structure on symmetric positive-definite matrices. SIAM J. Matrix Anal. Appl. 29, 328-347 (2007) · Zbl 1144.47015 · doi:10.1137/050637996
[8] Banerjee, A., Merugu, S., Dhillon, I.S., Ghosh, J.: Clustering with Bregman divergences. J. Mach. Learn. Res. 6, 1705-1749 (2005) · Zbl 1190.62117
[9] Barbaresco, F.: Innovative tools for radar signal processing based on Cartan’s geometry of SPD matrices and information geometry. In: IEEE Radar Conference, Rome (2008)
[10] Bauschke, H.H., Borwein, J.M.: Legendre functions and the method of random Bregman projections. J. Convex Anal. 4, 27-67 (1997) · Zbl 0894.49019
[11] Bauschke, H.H., Borwein, J.M.: Joint and separate convexity of the Bregman distance. Stud. Comput. Math. 8, 23-36 (2001) · Zbl 1160.65319 · doi:10.1016/S1570-579X(01)80004-5
[12] Bengtsson, I., Zyczkowski, K.: Geometry of Quantum States: An Introduction to Quantum Entanglement. Cambridge University Press, Cambridge (2006) · Zbl 1146.81004 · doi:10.1017/CBO9780511535048
[13] Bhagwat, K.V., Subramanian, R.: Inequalities between means of positive operators. Math. Proc. Camb. Philos. Soc. 83, 393-401 (1978) · Zbl 0375.47017 · doi:10.1017/S0305004100054670
[14] Bhatia, R.: Matrix Analysis. Springer, New York (1997) · Zbl 0863.15001 · doi:10.1007/978-1-4612-0653-8
[15] Bhatia, R.: Positive Definite Matrices. Princeton University Press, Princeton (2007) · Zbl 1133.15017
[16] Bhatia, R.: The Riemannian mean of positive matrices. In: Nielsen, F., Bhatia, R. (eds.) Matrix Information Geometry, pp. 35-51. Springer, Heidelberg (2013) · Zbl 1271.15019 · doi:10.1007/978-3-642-30232-9_2
[17] Bhatia, R., Grover, P.: Norm inequalities related to the matrix geometric mean. Linear Algebra Appl. 437, 726-733 (2012) · Zbl 1252.15023 · doi:10.1016/j.laa.2012.03.001
[18] Bhatia, R., Jain, T., Lim, Y.: On the Bures-Wasserstein distance between positive definite matrices. Expos. Math. (2018). https://doi.org/10.1016/j.exmath.2018.01.002 · Zbl 1437.15044
[19] Bhatia, R., Jain, T., Lim, Y.: Strong convexity of sandwiched entropies and related optimization problems. Rev. Math. Phys. 30, 1850014 (2018) · Zbl 1402.49036 · doi:10.1142/S0129055X18500149
[20] Carlen, E.A., Lieb, E.H.: A Minkowski type trace inequality and strong subadditivity of quantum entropy. Adv. Math. Sci. AMS Transl. 180, 59-68 (1999) · Zbl 0933.47014
[21] Carlen, E.A., Lieb, E.H.: A Minkowski type trace inequality and strong subadditivity of quantum entropy. II. Convexity and concavity. Lett. Math. Phys. 83, 107-126 (2008) · Zbl 1171.47015 · doi:10.1007/s11005-008-0223-1
[22] Chebbi, Z., Moakher, M.: Means of Hermitian positive-definite matrices based on the log-determinant \(\alpha\)-divergence function. Linear Algebra Appl. 436, 1872-1889 (2012) · Zbl 1236.15060 · doi:10.1016/j.laa.2011.12.003
[23] Dhillon, I.S., Tropp, J.A.: Matrix nearness problems with Bregman divergences. SIAM J. Matrix Anal. Appl. 29, 1120-1146 (2007) · Zbl 1153.65044 · doi:10.1137/060649021
[24] Fletcher, P., Joshi, S.: Riemannian geometry for the statistical analysis of diffusion tensor data. Signal Process. 87, 250-262 (2007) · Zbl 1186.94126 · doi:10.1016/j.sigpro.2005.12.018
[25] Hiai, F., Mosonyi, M., Petz, D., Beny, C.: Quantum f-divergences and error correction. Rev. Math. Phys. 23, 691-747 (2011) · Zbl 1230.81007 · doi:10.1142/S0129055X11004412
[26] Jenčová, A.: Geodesic distances on density matrices. J. Math. Phys. 45, 1787-1794 (2004) · Zbl 1071.81012 · doi:10.1063/1.1689000
[27] Jenčová, A., Ruskai, M.B.: A unified treatment of convexity of relative entropy and related trace functions with conditions for equality. Rev. Math. Phys. 22, 1099-1121 (2010) · Zbl 1218.81025 · doi:10.1142/S0129055X10004144
[28] Lim, Y., Palfia, M.: Matrix power means and the Karcher mean. J. Funct. Anal. 262, 1498-1514 (2012) · Zbl 1244.15014 · doi:10.1016/j.jfa.2011.11.012
[29] Modin, K.: Geometry of matrix decompositions seen through optimal transport and information geometry. J. Geom. Mech. 9, 335-390 (2017) · Zbl 1368.15010 · doi:10.3934/jgm.2017014
[30] Nielsen, F., Bhatia, R. (eds.): Matrix Information Geometry. Springer, Heidelberg (2013)
[31] Nielsen, F., Boltz, S.: The Burbea-Rao and Bhattacharyya centroids. IEEE Trans. Inf. Theory 57, 5455-5466 (2011) · Zbl 1365.94159 · doi:10.1109/TIT.2011.2159046
[32] Pitrik, J., Virosztek, D.: On the joint convexity of the Bregman divergence of matrices. Lett. Math. Phys. 105, 675-692 (2015) · Zbl 1330.47026 · doi:10.1007/s11005-015-0757-y
[33] Pusz, W., Woronowicz, S.L.: Functional calculus for sesquilinear forms and the purification map. Rep. Math. Phys. 8, 159-170 (1975) · Zbl 0327.46032 · doi:10.1016/0034-4877(75)90061-0
[34] Rockafellar, R.T.: Convex Analysis. Princeton University Press, Princeton (1970) · Zbl 0193.18401 · doi:10.1515/9781400873173
[35] Rockafellar, R.T., Wets, R.J.-B.: Variational Analysis. Springer, Berlin (1998) · Zbl 0888.49001 · doi:10.1007/978-3-642-02431-3
[36] Sra, S.: Positive definite matrices and the \(S\)-divergence. Proc. Am. Math. Soc. 144, 2787-2797 (2016) · Zbl 1338.15046 · doi:10.1090/proc/12953
[37] Takatsu, A.: Wasserstein geometry of Gaussian measures. Osaka J. Math. 48, 1005-1026 (2011) · Zbl 1245.60013
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.