
On Wasserstein distances for affine transformations of random vectors

Corresponding author: Keaton Hamm
Abstract
    We expound on some known lower bounds of the quadratic Wasserstein distance between random vectors in $ \mathbb{R}^n $ with an emphasis on affine transformations that have been used in manifold learning of data in Wasserstein space. In particular, we give concrete lower bounds for rotated copies of random vectors in $ \mathbb{R}^2 $ by computing the Bures metric between the covariance matrices. We also derive upper bounds for compositions of affine maps, which yield a fruitful variety of diffeomorphisms applied to an initial data measure. We apply these bounds to various distributions, including those lying on a 1-dimensional manifold in $ \mathbb{R}^2 $, and illustrate the quality of the bounds. Finally, we give a framework for mimicking handwritten digit or alphabet datasets that can be applied in a manifold learning framework.

    Mathematics Subject Classification: Primary: 60D05, 49Q22; Secondary: 41A29.
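    The lower bound referenced in the abstract is, concretely, the Gelbrich-type bound $ W_2^2(X,Y)\ge \|m_X-m_Y\|^2 + B^2(\Sigma_X,\Sigma_Y) $, where $ B $ is the Bures metric between the covariance matrices. The following is a minimal numerical sketch of this bound, not the authors' code; the function names are ours. For the uniform law on $ [0,1]^2 $ it reproduces the $ \sqrt{1-\cos(\theta)} $ curve of Figure 1 below, and for Gaussian laws the bound is attained (the equality case of Theorem 2.10 noted in the Figure 4 caption).

        # Bures metric between covariance matrices and the induced W_2 lower bound.
        import numpy as np
        from scipy.linalg import sqrtm

        def bures(S1, S2):
            """Bures metric between two PSD covariance matrices."""
            root = sqrtm(S1)
            cross = sqrtm(root @ S2 @ root)
            val = np.trace(S1) + np.trace(S2) - 2.0 * np.trace(cross)
            return np.sqrt(max(val.real, 0.0))   # clip tiny negative round-off

        def w2_lower_bound(m1, S1, m2, S2):
            """sqrt(||m1-m2||^2 + Bures^2); equals W_2 when both laws are Gaussian."""
            return np.sqrt(np.linalg.norm(m1 - m2) ** 2 + bures(S1, S2) ** 2)

        # Check against Figure 1: X ~ Unif([0,1]^2) rotated by theta.
        theta = np.pi / 3
        R = np.array([[np.cos(theta), -np.sin(theta)],
                      [np.sin(theta),  np.cos(theta)]])
        m, S = np.full(2, 0.5), np.eye(2) / 12.0   # mean, covariance of Unif([0,1]^2)
        print(w2_lower_bound(m, S, R @ m, R @ S @ R.T))  # = sqrt(1 - cos(theta))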

    Figure 1.  Rotation of $ [0,1]^2 $ from Example 2.13. (Left) $ W_2(X,R_\theta X) $ and the lower bound $ \sqrt{1-\cos(\theta)} $ for $ \theta\in[0,\frac{\pi}{2}] $. (Right) Relative error $ (W_2(X,R_\theta X)-\sqrt{1-\cos(\theta)})/W_2(X,R_\theta X) $ for $ \theta\in[0,\frac{\pi}{2}] $
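    The left panel of Figure 1 can be reproduced, up to sampling error, with the POT library: estimate $ W_2(X,R_\theta X) $ by exact discrete optimal transport on finite samples and compare it to $ \sqrt{1-\cos(\theta)} $. A hedged sketch follows; the sample size and angle grid are illustrative only.

        # Empirical W_2(X, R_theta X) for X ~ Unif([0,1]^2) via POT.
        import numpy as np
        import ot  # pip install pot

        rng = np.random.default_rng(0)
        n = 500
        X = rng.random((n, 2))          # samples from Unif([0,1]^2)
        a = np.full(n, 1.0 / n)         # uniform sample weights

        for theta in np.linspace(0.0, np.pi / 2, 5):
            R = np.array([[np.cos(theta), -np.sin(theta)],
                          [np.sin(theta),  np.cos(theta)]])
            M = ot.dist(X, X @ R.T)     # squared-Euclidean cost matrix
            w2 = np.sqrt(ot.emd2(a, a, M))   # exact discrete OT on the samples
            print(f"theta={theta:.2f}  W2~{w2:.4f}  bound={np.sqrt(1 - np.cos(theta)):.4f}")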

    Figure 2.  Rotation of $ [-\frac12,\frac12]^2 $ from Example 2.14. Plotted is $ W_2(X,R_\theta X) $ for $ \theta\in[0,\frac{\pi}{2}] $

    Figure 3.  Rotation of $ [0,2]\times[0,1] $ from Example 2.15. (Left) $ W_2(X,R_\theta X) $ and the lower bound of (8) for $ \theta\in[0,\frac{\pi}{2}] $. (Right) Relative error $ (W_2(X,R_\theta X)-\text{lower bound})/W_2(X,R_\theta X) $ for $ \theta\in[0,\frac{\pi}{2}] $

    Figure 4.  Figure illustrating Example 2.20. (Top) Sample of $ X\sim\mathcal{N}(0,\Sigma_1) $ under the map $ T_\alpha\circ R_\theta\circ S_\lambda $ for $ \theta\in[0,\frac{\pi}{2}] $ with $ \alpha = 0 $, $ \lambda = \begin{bmatrix} \frac12 \\ 2 \end{bmatrix} $. (Bottom) Relative error $ (\text{upper bound}-W_2(T_\alpha\circ R_\theta\circ S_\lambda X,X))/W_2(T_\alpha\circ R_\theta\circ S_\lambda X,X) $ for $ X\sim\mathcal{N}(0,\Sigma_i) $, $ i = 1,2,3 $, using the upper bound of Theorem 2.17 (which is valid because Gaussians satisfy equality in Theorem 2.10). Error bars show one standard deviation of the error over 20 repeated trials
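    For the Figure 4 setup, the pushforward of a Gaussian under an affine map is again Gaussian, with mean $ Am+\alpha $ and covariance $ A\Sigma A^\top $, so the Gelbrich/Bures expression gives $ W_2 $ exactly rather than merely a bound. Below is a self-contained sketch with $ \alpha = 0 $ and $ \lambda = (\frac12, 2) $ as in the caption; the matrix $ \Sigma_1 $ is a placeholder of ours, since the paper's $ \Sigma_i $ are not reproduced on this page.

        # Exact W_2 between X ~ N(0, Sigma_1) and (T_alpha ∘ R_theta ∘ S_lambda) X.
        import numpy as np
        from scipy.linalg import sqrtm

        theta, alpha = np.pi / 4, np.zeros(2)
        lam = np.array([0.5, 2.0])
        R = np.array([[np.cos(theta), -np.sin(theta)],
                      [np.sin(theta),  np.cos(theta)]])
        A = R @ np.diag(lam)                         # linear part of the composition

        Sigma1 = np.array([[1.0, 0.3], [0.3, 0.5]])  # placeholder covariance
        m1 = np.zeros(2)
        m2, Sigma2 = A @ m1 + alpha, A @ Sigma1 @ A.T  # Gaussian pushforward

        root = sqrtm(Sigma1)
        bures2 = np.trace(Sigma1 + Sigma2) - 2.0 * np.trace(sqrtm(root @ Sigma2 @ root)).real
        print(np.sqrt(np.linalg.norm(m1 - m2) ** 2 + max(bures2, 0.0)))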

    Figure 5.  (Left) $ X $ drawn from the uniform distribution on the unit circle in $ \mathbb{R}^2 $ as in Example 3.1. Shown is the relative error between the Wasserstein distance and the upper bound. (Right) Relative error between Wasserstein distance and upper bound for $ X $ drawn from the uniform distribution on $ [-\frac12,\frac12] $ in $ \mathbb{R}^2 $ as in Example 3.2

    Figure 6.  (Left) $ X_1 $ and $ X_2 $ vs. $ t $. (Right) $ X = \begin{bmatrix}X_1\\X_2\end{bmatrix} $ forming the letter C

    Figure 7.  (Left) $ X_1 $ and $ X_2 $ vs. $ t $. (Right) $ X = \begin{bmatrix}X_1\\X_2\end{bmatrix} $ forming the letter A

    Figure 8.  (Left) Letter A undergoing rotation. (Right) Relative error $ (W_2(R_\theta X,X)-\text{lower bound})/W_2(R_\theta X,X) $ for the lower bound of Theorem 2.10

    Figure 9.  (Left) $ X_1 $ and $ X_2 $ vs. $ t $ for the parametrization (11) (top) and (12) (bottom). (Right) $ X = \begin{bmatrix}X_1\\X_2\end{bmatrix} $ forming the letter T

    Figure 10.  (Left) Letter T undergoing rotation. (Center) Relative error $ (W_2(R_\theta X,X)-\text{lower bound})/W_2(R_\theta X,X) $ for the lower bound of Theorem 2.10 for the T parametrized by (11). (Right) Relative error for the T with nonzero mean parametrized by (12)

    Figure 11.  (Left) Wassmap embedding of rotated copies of letter T parametrized by (11). (Center) Same for T parametrized by (12). (Right) MDS embedding using the lower bound of Theorem 2.10 for T parametrized by (11)
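    Both embeddings in Figure 11 are produced by classical multidimensional scaling applied to a matrix of pairwise distances: computed $ W_2 $ distances for the Wassmap panels, and the Theorem 2.10 lower bounds for the right panel. A minimal sketch of that MDS step, assuming a precomputed symmetric distance matrix D (the helper name is ours):

        # Classical MDS: embed n points in R^d from an n x n distance matrix.
        import numpy as np

        def classical_mds(D, d=2):
            n = D.shape[0]
            J = np.eye(n) - np.full((n, n), 1.0 / n)   # centering matrix
            B = -0.5 * J @ (D ** 2) @ J                # double-centered Gram matrix
            evals, evecs = np.linalg.eigh(B)
            idx = np.argsort(evals)[::-1][:d]          # top-d eigenpairs
            lam = np.clip(evals[idx], 0.0, None)       # guard small negatives
            return evecs[:, idx] * np.sqrt(lam)        # n x d embedding coordinates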

