Document Zbl 1489.62115

Near-optimal coresets of kernel density estimates. (English) Zbl 1489.62115

Speckmann, Bettina (ed.) et al., 34th international symposium on computational geometry, SoCG 2018, June 11–14, 2018, Budapest, Hungary. Wadern: Schloss Dagstuhl – Leibniz Zentrum für Informatik. LIPIcs – Leibniz Int. Proc. Inform. 99, Article 66, 13 p. (2018).

Summary: We construct near-optimal coresets for kernel density estimate for points in \(\mathbb{R}^d\) when the kernel is positive definite. Specifically we show a polynomial time construction for a coreset of size \(O(\sqrt{d\log (1/\varepsilon)}/\varepsilon)\), and we show a near-matching lower bound of size \(\Omega(\sqrt{d}/\varepsilon)\). The upper bound is a polynomial in \(1/\varepsilon\) improvement when \(d\in[3,1/\varepsilon^2)\) (for all kernels except the Gaussian kernel which had a previous upper bound of \(O((1/\varepsilon)\log^d(1/\varepsilon)))\) and the lower bound is the first known lower bound to depend on \(d\) for this problem. Moreover, the upper bound restriction that the kernel is positive definite is significant in that it applies to a wide-variety of kernels, specifically those most important for machine learning. This includes kernels for information distances and the sinc kernel which can be negative.
For the entire collection see [Zbl 1390.68027].

MSC:

62G07	Density estimation
62G05	Nonparametric estimation
62G20	Asymptotic properties of nonparametric inference
62H30	Classification and discrimination; cluster analysis (statistical aspects)

Keywords:

coresets; kernel density estimate; discrepancy

Cite Review PDF

Full Text: DOI

References:

[1]	Ery Arias-Castro, David Mason, and Bruno Pelletier. On the estimation of the gradient lines of a density and the consistency of the mean-shift algorithm. Journal of Machine Learning Research, 17(43):1-28, 2016. ##img## · Zbl 1360.62150
[2]	Francis Bach, Simon Lacoste-Julien, and Guillaume Obozinski. On the equivalence between herding and conditional gradient algorithms. In ICML 2012 International Conference on Machine Learning, 2012. ##img##
[3]	Wojciech Banaszczyk. Balancing vectors and gaussian measures of n-dimensional convex bodies. Random Structures &Algorithms, 12(4):351-360, 1998. ##img## · Zbl 0958.52004
[4]	Nikhil Bansal, Daniel Dadush, Shashwat Garg, and Shachar Lovett. The Gram-Schmidt walk: A cure for the Banaszczyk blues (to appear). Proceedings of the fiftieth annual ACM symposium on Theory of computing, 2018. ##img## · Zbl 1427.68328
[5]	Jon Louis Bentley and James B. Saxe. Decomposable searching problems I: Static-to-dynamic transformations. Journal of Algorithms, 1(4), 1980. ##img## · Zbl 0461.68065
[6]	Omer Bobrowski, Sayan Mukherjee, and Jonathan E. Taylor. Topological consistency via kernel estimation. Bernoulli, 23:288-328, 2017. ##img## · Zbl 1395.62073
[7]	Bernard Chazelle. The Discrepancy Method. Cambridge, 2000. ##img## · Zbl 0960.65149
[8]	Bernard Chazelle and Jiri Matousek. On linear-time deterministic algorithms for optimization problems in fixed dimensions. J. Algorithms, 21:579-597, 1996. ##img## · Zbl 0864.68040
[9]	Luc Devroye and László Györfi. Nonparametric Density Estimation: The L₁ View. Wiley, 1984. ##img## · Zbl 0546.62015
[10]	Petros Drineas and Michael W. Mahoney. On the Nyström method for approximating a Gram matrix for improved kernel-based learning. JLMR, 6:2153-2175, 2005. ##img## · Zbl 1222.68186
[11]	Jianqing Fan and Irene Gijbels. Local polynomial modelling and its applications: monographs on statistics and applied probability 66, volume 66. CRC Press, 1996. ##img##
[12]	Brittany Terese Fasy, Fabrizio Lecci, Alessandro Rinaldo, Larry Wasserman, Sivaraman Balakrishnan, and Aarti Singh. Confidence sets for persistence diagrams. The Annals of Statistics, 42:2301-2339, 2014. ##img## · Zbl 1310.62059
[13]	Arthur Gretton, Karsten M. Borgwardt, Malte J. Rasch, Bernhard Scholkopf, and Alexander Smola. A kernel two-sample test. Journal of Machine Learning Research, 13:723-773, 2012. ##img## · Zbl 1283.62095
[14]	Matthias Hein and Olivier Bousquet. Hilbertian metrics and positive definite kernels on probability measures. In AISTATS, pages 136-143, 2005. ##img##
[15]	Thomas Hofmann, Bernhard Schölkopf, and Alexander J. Smola. A review of kernel methods in machine learning. Technical Report 156, Max Planck Institute for Biological Cybernetics, 2006. ##img## · Zbl 1151.30007
[16]	Sarang Joshi, Raj Varma Kommaraji, Jeff M Phillips, and Suresh Venkatasubramanian. Comparing distributions and shapes using the kernel distance. In Proceedings of the twenty-seventh annual symposium on Computational geometry, pages 47-56. ACM, 2011. ##img## · Zbl 1283.68368
[17]	Jiri Matousek. Geometric Discrepancy; An Illustrated Guide, 2nd printing. Springer-Verlag, 2010. ##img## · Zbl 1197.11092
[18]	Jiri Matousek, Aleksandar Nikolov, and Kunal Talwar. Factorization norms and hereditary discrepancy. arXiv preprint arXiv:1408.1376, 2014. ##img## · Zbl 1459.11161
[19]	Jeff M. Phillips. Algorithms for ε-approximations of terrains. In ICALP, 2008. ##img## · Zbl 1153.68568
[20]	Jeff M Phillips. ε-samples for kernels. In Proceedings of the twenty-fourth annual ACM-SIAM symposium on Discrete algorithms, pages 1622-1632. SIAM, 2013. ##img## · Zbl 1422.68256
[21]	Jeff M Phillips and Wai Ming Tai. Improved coresets for kernel density estimates. In Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 2718-2727. SIAM, 2018. ##img## · Zbl 1403.68207
[22]	Jeff M. Phillips, Bei Wang, and Yan Zheng. Geometric inference on kernel density estimates. In SOCG, 2015. ##img## · Zbl 1422.68257
[23]	Alessandro Rinaldo and Larry Wasserman. Generalized density clustering. The Annals of Statistics, pages 2678-2722, 2010. ##img## · Zbl 1200.62066
[24]	Isaac J Schoenberg. Metric spaces and completely monotone functions. Annals of Mathematics, pages 811-841, 1938. ##img## · JFM 64.0617.03
[25]	Bernhard Scholkopf and Alexander J. Smola. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press, 2002. ##img##
[26]	Erich Schubert, Arthur Zimek, and Hans-Peter Kriegel. Generalized outlier detection with flexible kernel density estimates. In Proceedings of the 2014 SIAM International Conference on Data Mining, pages 542-550. SIAM, 2014. ##img## · Zbl 1281.68192
[27]	David W. Scott. Multivariate Density Estimation: Theory, Practice, and Visualization. Wiley, 1992. ##img## · Zbl 0850.62006
[28]	Bernard W. Silverman. Density Estimation for Statistics and Data Analysis. Chapman &Hall/CRC, 1986. ##img## · Zbl 0617.62042
[29]	Bharath K. Sriperumbudur, Arthur Gretton, Kenji Fukumizu, Bernhard Schölkopf, and Gert R. G. Lanckriet. Hilbert space embeddings and metrics on probability measures. JMLR, 11:1517-1561, 2010. ##img## · Zbl 1242.60005
[30]	Yan Zheng and Jeff M. Phillips. l_∞ error and bandwidth selection for kernel density estimates of large data. In KDD, 2015. ##img##

This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.