Two-sample statistics based on anisotropic kernels. (English) Zbl 1471.62346

Summary: The paper introduces a new kernel-based Maximum Mean Discrepancy (MMD) statistic for measuring the distance between two distributions given finitely many multivariate samples. When the distributions are locally low-dimensional, the proposed test can be made more powerful at distinguishing certain alternatives by incorporating local covariance matrices and constructing an anisotropic kernel. The kernel matrix is asymmetric; it computes the affinity between \(n\) data points and a set of \(n_R\) reference points, where \(n_R\) can be drastically smaller than \(n\). While the proposed statistic can be viewed as a special class of Reproducing Kernel Hilbert Space MMD, the consistency of the test is proved, under mild assumptions on the kernel, as long as \(\|p-q\|\sqrt{n}\to\infty\), and a finite-sample lower bound on the testing power is obtained. Applications to flow cytometry and diffusion MRI datasets are demonstrated, which motivate the proposed approach to comparing distributions.
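
The construction described in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian form of the anisotropic kernel, the k-nearest-neighbor estimate of the local covariance matrices, the ridge term `reg`, and the function names `local_covariances`, `anisotropic_affinity` and `reference_mmd` are all assumptions made for illustration. The sketch builds the asymmetric \(n_R \times n\) affinity matrix between reference points and data points, then averages the squared difference of the mean affinities of the two samples over the reference points.

```python
import numpy as np


def local_covariances(refs, pooled, k=60, reg=1e-3):
    """Estimate one covariance matrix per reference point (assumed scheme).

    Uses the k nearest neighbors of each reference point in the pooled
    sample; `reg` adds a small ridge so every matrix is invertible.
    Assumes dimension d >= 2.
    """
    sq_dists = ((pooled[None, :, :] - refs[:, None, :]) ** 2).sum(axis=-1)
    nearest = np.argsort(sq_dists, axis=1)[:, :k]
    d = pooled.shape[1]
    covs = np.empty((len(refs), d, d))
    for r, nbrs in enumerate(nearest):
        covs[r] = np.cov(pooled[nbrs], rowvar=False) + reg * np.eye(d)
    return covs


def anisotropic_affinity(refs, covs, X):
    """Asymmetric affinity matrix A of shape (n_R, n).

    A[r, i] = exp(-0.5 * (x_i - r)^T Sigma_r^{-1} (x_i - r)),
    an assumed Gaussian form of the anisotropic kernel.
    """
    diffs = X[None, :, :] - refs[:, None, :]          # (n_R, n, d)
    precisions = np.linalg.inv(covs)                  # (n_R, d, d)
    maha = np.einsum("rid,rde,rie->ri", diffs, precisions, diffs)
    return np.exp(-0.5 * maha)


def reference_mmd(X, Y, refs, covs):
    """Mean over reference points of the squared difference between the
    average affinity of sample X and the average affinity of sample Y."""
    mean_x = anisotropic_affinity(refs, covs, X).mean(axis=1)
    mean_y = anisotropic_affinity(refs, covs, Y).mean(axis=1)
    return float(np.mean((mean_x - mean_y) ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(300, 2))
    Y = rng.normal(size=(300, 2)) + np.array([2.0, 0.0])  # shifted alternative
    pooled = np.vstack([X, Y])
    # n_R = 15 reference points, far fewer than the n = 300 points per sample
    refs = pooled[rng.choice(len(pooled), size=15, replace=False)]
    covs = local_covariances(refs, pooled)
    print("statistic:", reference_mmd(X, Y, refs, covs))
```

In practice a rejection threshold for such a statistic would typically be calibrated, for example by permuting the sample labels; that step is omitted here, and the choice of reference points and covariance estimator is left open by the summary.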

MSC:

62G20 Asymptotic properties of nonparametric inference
62H15 Hypothesis testing in multivariate analysis
62H35 Image analysis in multivariate analysis
46E22 Hilbert spaces with reproducing kernels (= (proper) functional Hilbert spaces, including de Branges-Rovnyak and other structured spaces)