
Analysis and algorithms for \(\ell_p\)-based semi-supervised learning on graphs. (English) Zbl 07557810

Summary: This paper addresses theory and applications of \(\ell_p\)-based Laplacian regularization in semi-supervised learning. The graph \(p\)-Laplacian for \(p > 2\) has been proposed recently as a replacement for the standard \((p = 2)\) graph Laplacian in semi-supervised learning problems with very few labels, where Laplacian learning is degenerate. In the first part of the paper we prove new discrete-to-continuum convergence results for \(p\)-Laplace problems on \(k\)-nearest neighbor (\(k\)-NN) graphs, which are more commonly used in practice than random geometric graphs. Our analysis shows that, on \(k\)-NN graphs, the \(p\)-Laplacian retains information about the data distribution as \(p \to \infty\) and Lipschitz learning \((p = \infty)\) is sensitive to the data distribution. This situation can be contrasted with random geometric graphs, where the \(p\)-Laplacian forgets the data distribution as \(p \to \infty\). We also present a general framework for proving discrete-to-continuum convergence results in graph-based learning that only requires pointwise consistency and monotonicity. In the second part of the paper, we develop fast algorithms for solving the variational and game-theoretic \(p\)-Laplace equations on weighted graphs for \(p > 2\). We present several efficient and scalable algorithms for both formulations, and present numerical results on synthetic data indicating their convergence properties. Finally, we conduct extensive numerical experiments on the MNIST, FashionMNIST and EMNIST datasets that illustrate the effectiveness of the \(p\)-Laplacian formulation for semi-supervised learning with few labels. In particular, we find that Lipschitz learning \((p = \infty)\) performs well with very few labels on \(k\)-NN graphs, which experimentally validates our theoretical findings that Lipschitz learning retains information about the data distribution (the unlabeled data) on \(k\)-NN graphs.
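The interpolation the summary describes, from Laplacian learning \((p = 2)\) through the game-theoretic \(p\)-Laplacian to Lipschitz learning \((p = \infty)\), can be illustrated on a toy graph. The sketch below is not the paper's code (the authors' implementation is the GraphLearning Python package [12]); the toy graph, the fixed-point schemes, and the mixing weight \(\alpha = 1/(p-1)\) in the game-theoretic update are illustrative assumptions, not the paper's scalings.

```python
# Illustrative sketch: label propagation on a tiny unweighted graph for
# p = 2 (harmonic), p = infinity (Lipschitz), and an intermediate p.
import numpy as np

# Adjacency matrix of a small 6-node graph; nodes 0 and 5 carry labels.
A = np.array([
    [0, 1, 0, 0, 0, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 1, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 0, 1, 0],
], dtype=float)
labeled = {0: 0.0, 5: 1.0}            # node index -> label value
unlabeled = [i for i in range(len(A)) if i not in labeled]

# --- p = 2: harmonic extension.  Solve L u = 0 on unlabeled nodes,
# L = D - A the graph Laplacian, with the labels as boundary data.
L = np.diag(A.sum(axis=1)) - A
Luu = L[np.ix_(unlabeled, unlabeled)]
b = -L[np.ix_(unlabeled, list(labeled))] @ np.array(list(labeled.values()))
u2 = np.zeros(len(A))
for i, v in labeled.items():
    u2[i] = v
u2[unlabeled] = np.linalg.solve(Luu, b)

# --- p = infinity: Lipschitz learning via the midpoint fixed-point
# iteration u_i <- (max_j u_j + min_j u_j)/2 over graph neighbors,
# converging to the infinity-harmonic extension on unweighted graphs.
uinf = u2.copy()                      # warm start from the harmonic solution
for _ in range(10_000):
    unew = uinf.copy()
    for i in unlabeled:
        nbrs = uinf[A[i] > 0]
        unew[i] = 0.5 * (nbrs.max() + nbrs.min())
    done = np.abs(unew - uinf).max() < 1e-12
    uinf = unew
    if done:
        break

# --- 2 < p < infinity: game-theoretic p-Laplace value iteration, blending
# the averaging (p = 2) and midpoint (p = infinity) operators.  The mixing
# weight alpha = 1/(p-1) is an illustrative choice: it recovers pure
# averaging at p = 2 and pure midpoint as p -> infinity.
p, alpha = 4.0, 1.0 / (4.0 - 1.0)
up = u2.copy()
for _ in range(10_000):
    unew = up.copy()
    for i in unlabeled:
        nbrs = up[A[i] > 0]
        unew[i] = alpha * nbrs.mean() + (1 - alpha) * 0.5 * (nbrs.max() + nbrs.min())
    done = np.abs(unew - up).max() < 1e-12
    up = unew
    if done:
        break

print("p = 2       :", np.round(u2, 3))
print("p = 4       :", np.round(up, 3))
print("p = infinity:", np.round(uinf, 3))
```

All three solutions agree on the labeled nodes and obey the maximum principle; what changes with \(p\) is how the interpolated values respond to the graph's geometry, which is the distinction the paper analyzes on \(k\)-NN versus random geometric graphs.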

MSC:

68T05 Learning and adaptive systems in artificial intelligence

References:

[1] Adil, D.; Peng, R.; Sachdeva, S., Fast, provably convergent IRLS algorithm for p-norm linear regression, (Advances in Neural Information Processing Systems (2019)), 14166-14177
[2] Alamgir, M.; von Luxburg, U., Phase transition in the family of p-resistances, (Advances in Neural Information Processing Systems (2011)), 379-387
[3] Ando, R. K.; Zhang, T., Learning on graph with Laplacian regularization, (Advances in Neural Information Processing Systems (2007)), 25-32
[4] Aronsson, G.; Crandall, M.; Juutinen, P., A tour of the theory of absolutely minimizing functions, Bull. Am. Math. Soc., 41, 4, 439-505 (2004) · Zbl 1150.35047
[5] Barles, G.; Souganidis, P. E., Convergence of approximation schemes for fully nonlinear second order equations, Asymptot. Anal., 4, 3, 271-283 (1991) · Zbl 0729.65077
[6] Bridle, N.; Zhu, X., p-voltages: Laplacian regularization for semi-supervised learning on high-dimensional data, (Eleventh Workshop on Mining and Learning with Graphs (MLG2013) (2013))
[7] Bruna, J.; Mallat, S., Invariant scattering convolution networks, IEEE Trans. Pattern Anal. Mach. Intell., 35, 8, 1872-1886 (2013)
[8] Calder, J., The game theoretic p-Laplacian and semi-supervised learning with few labels, Nonlinearity, 32, 1 (2018)
[9] Calder, J., Lecture notes on viscosity solutions (2018), Online Lecture Notes
[10] Calder, J., Consistency of Lipschitz learning with infinite unlabeled data and finite labeled data, SIAM J. Math. Data Sci., 1, 4, 780-812 (2019) · Zbl 1499.35598
[11] Calder, J., Calculus of variations, lecture notes (2020)
[12] Calder, J., GraphLearning Python package (2022)
[13] Calder, J.; Esedoglu, S.; Hero, A. O., A Hamilton-Jacobi equation for the continuum limit of nondominated sorting, SIAM J. Math. Anal., 46, 1, 603-638 (2014) · Zbl 1288.35190
[14] Calder, J.; García Trillos, N., Improved spectral convergence rates for graph Laplacians on ε-graphs and k-NN graphs (2019)
[15] Calder, J.; Slepčev, D., Properly-weighted graph Laplacian for semi-supervised learning, Appl. Math. Optim., 1-49 (2019)
[16] Calder, J.; Slepčev, D.; Thorpe, M., Rates of convergence for Laplacian semi-supervised learning with low labeling rates (2020)
[17] Calder, J.; Smart, C. K., The limit shape of convex hull peeling, Duke Math. J., 169, 11, 2079-2124 (2020) · Zbl 1455.35047
[18] Chapelle, O.; Scholkopf, B.; Zien, A., Semi-Supervised Learning (2006), MIT
[19] Cohen, G.; Afshar, S.; Tapson, J.; Van Schaik, A., EMNIST: extending MNIST to handwritten letters, (2017 International Joint Conference on Neural Networks (IJCNN) (2017), IEEE), 2921-2926
[20] Crandall, M. G.; Ishii, H.; Lions, P.-L., User’s guide to viscosity solutions of second order partial differential equations, Bull. Am. Math. Soc., 27, 1, 1-67 (1992) · Zbl 0755.35015
[21] El Alaoui, A.; Cheng, X.; Ramdas, A.; Wainwright, M. J.; Jordan, M. I., Asymptotic behavior of \(\ell_p\)-based Laplacian regularization in semi-supervised learning, (Conference on Learning Theory (2016)), 879-906
[22] Elmoataz, A.; Desquesnes, X.; Toutain, M., On the game p-Laplacian on weighted graphs with applications in image processing and data clustering, Eur. J. Appl. Math., 28, 6, 922-948 (2017) · Zbl 1390.91039
[23] Elmoataz, A.; Lozes, F.; Toutain, M., Nonlocal PDEs on graphs: from tug-of-war games to unified interpolation on images and point clouds, J. Math. Imaging Vis., 57, 3, 381-401 (2017) · Zbl 1402.94017
[24] Elmoataz, A.; Toutain, M.; Tenbrinck, D., On the p-Laplacian and ∞-Laplacian on graphs with applications in image and data processing, SIAM J. Imaging Sci., 8, 4, 2412-2451 (2015) · Zbl 1330.35488
[25] Evans, L., Partial Differential Equations, Graduate Studies in Mathematics, vol. 19 (June 1998), American Mathematical Society · Zbl 0902.35002
[26] Fletcher, R.; Grant, J.; Hebden, M., The calculation of linear best \(L_p\) approximations, Comput. J., 14, 3, 276-279 (1971) · Zbl 0225.65017
[27] Flores, M., Algorithms for semisupervised learning on graphs (2018)
[28] García Trillos, N., Variational limits of k-NN graph-based functionals on data clouds, SIAM J. Math. Data Sci., 1, 1, 93-120 (2019) · Zbl 1499.49058
[29] García Trillos, N.; Slepčev, D., Continuum limit of total variation on point clouds, Arch. Ration. Mech. Anal., 220, 1, 193-241 (2016) · Zbl 1336.68215
[30] Hafiene, Y.; Fadili, J.; Elmoataz, A., Nonlocal p-Laplacian variational problems on graphs (2018) · Zbl 1448.65200
[31] He, J.; Li, M.; Zhang, H.-J.; Tong, H.; Zhang, C., Manifold-ranking based image retrieval, (Proceedings of the 12th Annual ACM International Conference on Multimedia (2004), ACM), 9-16
[32] He, J.; Li, M.; Zhang, H.-J.; Tong, H.; Zhang, C., Generalized manifold-ranking-based image retrieval, IEEE Trans. Image Process., 15, 10, 3170-3177 (2006)
[33] Hein, M.; Audibert, J.-Y.; von Luxburg, U., Graph Laplacians and their convergence on random neighborhood graphs, J. Mach. Learn. Res., 8, 1325-1368 (Jun 2007) · Zbl 1222.68215
[34] Hein, M.; Audibert, J.-Y.; von Luxburg, U., From graphs to manifolds - weak and strong pointwise consistency of graph Laplacians, (International Conference on Computational Learning Theory (2005), Springer), 470-485 · Zbl 1095.68097
[35] Jung, A.; Hero, A. O.; Mara, A.; Jahromi, S., Semi-supervised learning via sparse label propagation (2016), arXiv preprint
[36] Kyng, R.; Rao, A.; Sachdeva, S.; Spielman, D. A., Algorithms for Lipschitz learning on graphs, (Conference on Learning Theory (2015)), 1190-1223
[37] LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P., Gradient-based learning applied to document recognition, Proc. IEEE, 86, 11, 2278-2324 (1998)
[38] Leoni, G., A First Course in Sobolev Spaces (2017), American Mathematical Soc. · Zbl 1382.46001
[39] Lewicka, M.; Manfredi, J. J., Game theoretical methods in PDEs, Boll. UMI, 7, 3, 211-216 (2014) · Zbl 1322.35139
[40] Lindqvist, P., Notes on the p-Laplace Equation (2017)
[41] von Luxburg, U.; Bousquet, O., Distance-based classification with Lipschitz functions, J. Mach. Learn. Res., 5, 669-695 (Jun 2004) · Zbl 1222.68326
[42] Manfredi, J. J.; Oberman, A. M.; Sviridov, A. P., Nonlinear elliptic partial differential equations and p-harmonic functions on graphs, Differ. Integral Equ., 28, 1-2, 79-102 (2015) · Zbl 1349.35382
[43] Nadler, B.; Srebro, N.; Zhou, X., Semi-supervised learning with the graph Laplacian: the limit of infinite unlabelled data, Adv. Neural Inf. Process. Syst., 22, 1330-1338 (2009)
[44] Oberman, A. M., Convergent difference schemes for degenerate elliptic and parabolic equations: Hamilton-Jacobi equations and free boundary problems, SIAM J. Numer. Anal., 44, 2, 879-895 (2006) · Zbl 1124.65103
[45] Oberman, A. M., Finite difference methods for the infinity Laplace and p-Laplace equations, J. Comput. Appl. Math., 254, 65-80 (2013) · Zbl 1290.65098
[46] Ortega, J. M., The Newton-Kantorovich theorem, Am. Math. Mon., 75, 6, 658-660 (1968) · Zbl 0183.43004
[47] Peres, Y.; Schramm, O.; Sheffield, S.; Wilson, D., Tug-of-war and the infinity Laplacian, J. Am. Math. Soc., 22, 1, 167-210 (2009) · Zbl 1206.91002
[48] Saad, Y.; Schultz, M. H., GMRES: a generalized minimal residual algorithm for solving nonsymmetric linear systems, SIAM J. Sci. Stat. Comput., 7, 3, 856-869 (1986) · Zbl 0599.65018
[49] Shi, Z.; Osher, S.; Zhu, W., Weighted nonlocal Laplacian on interpolation from sparse data, J. Sci. Comput., 73, 2-3, 1164-1177 (2017) · Zbl 1381.65007
[50] Slepcev, D.; Thorpe, M., Analysis of p-Laplacian regularization in semisupervised learning, SIAM J. Math. Anal., 51, 3, 2085-2120 (2019) · Zbl 1422.49020
[51] Ting, D.; Huang, L.; Jordan, M., An analysis of the convergence of graph Laplacians (2011), arXiv preprint
[52] García Trillos, N.; Murray, R., A maximum principle argument for the uniform convergence of graph Laplacian regressors (2019), arXiv preprint
[53] Vargas, R. A.; Burrus, C. S., Iterative design of \(l_p\) FIR and IIR digital filters, (2009 IEEE 13th Digital Signal Processing Workshop and 5th IEEE Signal Processing Education Workshop (Jan 2009)), 468-473
[54] Wang, Y.; Cheema, M. A.; Lin, X.; Zhang, Q., Multi-manifold ranking: using multiple features for better image retrieval, (Pacific-Asia Conference on Knowledge Discovery and Data Mining (2013), Springer), 449-460
[55] Xiao, H.; Rasul, K.; Vollgraf, R., Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms (2017)
[56] Xu, B.; Bu, J.; Chen, C.; Cai, D.; He, X.; Liu, W.; Luo, J., Efficient manifold ranking for image retrieval, (Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval (2011), ACM), 525-534
[57] Yang, C.; Zhang, L.; Lu, H.; Ruan, X.; Yang, M.-H., Saliency detection via graph-based manifold ranking, (Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2013)), 3166-3173
[58] Zhou, D.; Bousquet, O.; Lal, T. N.; Weston, J.; Schölkopf, B., Learning with local and global consistency, (Advances in Neural Information Processing Systems (2004)), 321-328
[59] Zhou, D.; Huang, J.; Schölkopf, B., Learning from labeled and unlabeled data on a directed graph, (Proceedings of the 22nd International Conference on Machine Learning (2005), ACM), 1036-1043
[60] Zhou, D.; Schölkopf, B., Regularization on discrete spaces, (Joint Pattern Recognition Symposium (2005), Springer), 361-368
[61] Zhou, D.; Weston, J.; Gretton, A.; Bousquet, O.; Schölkopf, B., Ranking on data manifolds, (Advances in Neural Information Processing Systems (2004)), 169-176
[62] Zhu, X.; Ghahramani, Z.; Lafferty, J. D., Semi-supervised learning using Gaussian fields and harmonic functions, (Proceedings of the 20th International Conference on Machine Learning (ICML-03) (2003)), 912-919
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.