
Poisson reweighted Laplacian uncertainty sampling for graph-based active learning. (English) Zbl 1533.68289

Summary: We show that uncertainty sampling suffices to balance exploration and exploitation in graph-based active learning, provided the measure of uncertainty properly aligns with the underlying model and the model properly reflects uncertainty in unexplored regions. In particular, we use a recently developed algorithm, Poisson ReWeighted Laplace Learning (PWLL), as the classifier, and we introduce an acquisition function designed to measure uncertainty in this graph-based classifier that identifies unexplored regions of the data. We introduce a diagonal perturbation in PWLL that produces exponential localization of solutions and controls the exploration-versus-exploitation tradeoff in active learning. We use the well-posed continuum limit of PWLL to rigorously analyze our method, and we present experimental results on a number of graph-based image classification problems.
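The summary describes the method only at a high level. As a concrete illustration, the following minimal Python sketch (not the authors' implementation) shows the loop the summary outlines: plain Laplace learning with a diagonal perturbation tau stands in for the Poisson reweighted classifier, and the decay of the solution away from labeled points supplies the uncertainty score that flags unexplored regions. The weight matrix W, the one-hot label matrix Y, the value of tau, and all function names are assumptions made for this sketch.

import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def perturbed_laplace_learning(W, labeled_idx, Y_labeled, tau=0.01):
    # Graph Laplacian L = D - W, with labels imposed on labeled nodes and a
    # diagonal perturbation tau*I on the unlabeled block; the perturbation
    # localizes solutions exponentially away from labels (cf. the summary).
    n = W.shape[0]
    deg = np.asarray(W.sum(axis=1)).ravel()
    L = (sp.diags(deg) - W).tocsr()
    unlabeled = np.setdiff1d(np.arange(n), labeled_idx)
    # Solve (L_uu + tau*I) U_u = -L_ul Y_l for the unlabeled block.
    A = L[unlabeled][:, unlabeled] + tau * sp.identity(len(unlabeled))
    B = -(L[unlabeled][:, labeled_idx] @ Y_labeled)
    U = np.zeros((n, Y_labeled.shape[1]))
    U[labeled_idx] = Y_labeled
    U[unlabeled] = spla.splu(sp.csc_matrix(A)).solve(B)
    return U

def uncertainty_query(U, labeled_idx):
    # One plausible acquisition function consistent with the summary: the
    # perturbed solution is nearly zero in unexplored regions, so query the
    # unlabeled node whose output row has the smallest norm.
    scores = np.linalg.norm(U, axis=1)
    scores[labeled_idx] = np.inf  # never re-query a labeled point
    return int(np.argmin(scores))

def active_learning(W, Y, initial_idx, budget, tau=0.01):
    # Retrain the classifier, query the most uncertain point, label it, repeat.
    labeled_idx = np.array(initial_idx)
    for _ in range(budget):
        U = perturbed_laplace_learning(W, labeled_idx, Y[labeled_idx], tau)
        labeled_idx = np.append(labeled_idx, uncertainty_query(U, labeled_idx))
    return labeled_idx

In this sketch, a larger tau shrinks the solution faster away from labeled points, so the norm-based score favors exploration of distant regions; as tau tends to 0 the scheme reverts to standard Laplace learning and the score exploits the current decision boundary instead.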

MSC:

68T05 Learning and adaptive systems in artificial intelligence
35Q62 PDEs in connection with statistics
35Q68 PDEs in connection with computer science
62H30 Classification and discrimination; cluster analysis (statistical aspects)
68U10 Computing methodologies for image processing

References:

[1] Balcan, M.-F., Beygelzimer, A., and Langford, J., Agnostic active learning, J. Comput. System Sci., 75 (2009), pp. 78-89, doi:10.1016/j.jcss.2008.07.003. · Zbl 1162.68516
[2] Balcan, M.-F., Broder, A., and Zhang, T., Margin based active learning, in International Conference on Computational Learning Theory, Springer, Berlin, Heidelberg, 2007, pp. 35-50, doi:10.1007/978-3-540-72927-3_5. · Zbl 1203.68136
[3] Belkin, M., Matveeva, I., and Niyogi, P., Regularization and semi-supervised learning on large graphs, in Learning Theory: Proceedings of the 17th Annual Conference on Learning Theory, COLT 2004, Banff, Canada, Springer, Berlin, Heidelberg, 2004, pp. 624-638. · Zbl 1078.68685
[4] Belkin, M. and Niyogi, P., Laplacian eigenmaps for dimensionality reduction and data representation, Neural Comput., 15 (2003), pp. 1373-1396. · Zbl 1085.68119
[5] Belkin, M. and Niyogi, P., Semi-supervised learning on Riemannian manifolds, Mach. Learn., 56 (2004), pp. 209-239. · Zbl 1089.68086
[6] Bengio, Y., Delalleau, O., and Le Roux, N., Label propagation and quadratic criterion, in Semi-supervised Learning, MIT Press, Cambridge, MA, 2006, pp. 193-216, https://www.microsoft.com/en-us/research/publication/label-propagation-and-quadratic-criterion/.
[7] Bertozzi, A. L. and Flenner, A., Diffuse interface models on graphs for classification of high dimensional data, SIAM Rev., 58 (2016), pp. 293-328, doi:10.1137/16M1070426. · Zbl 1339.68287
[8] Bertozzi, A. L. and Merkurjev, E., Graph-based optimization approaches for machine learning, uncertainty quantification and networks, in Processing, Analyzing and Learning of Images, Shapes, and Forms, Elsevier/North-Holland, Amsterdam, 2019, pp. 503-531. · Zbl 1446.65032
[9] Cai, H., Zheng, V. W., and Chang, K. C.-C., Active Learning for Graph Embedding, preprint, https://arxiv.org/abs/1705.05085, 2017.
[10] Calder, J., The game theoretic p-Laplacian and semi-supervised learning with few labels, Nonlinearity, 32 (2019), pp. 301-330. · Zbl 1408.35048
[11] Calder, J., Consistency of Lipschitz learning with infinite unlabeled data and finite labeled data, SIAM J. Math. Data Sci., 1 (2019), pp. 780-812, doi:10.1137/18M1199241. · Zbl 1499.35598
[12] Calder, J., GraphLearning Python Package, doi:10.5281/zenodo.5850940, 2022.
[13] Calder, J., Cook, B., Thorpe, M., and Slepčev, D., Poisson learning: Graph-based semi-supervised learning at very low label rates, in Proceedings of the 37th International Conference on Machine Learning, 2020, pp. 1306-1316.
[14] Calder, J., Cook, B., Thorpe, M., Slepčev, D., Zhang, Y., and Ke, S., Graph-Based Semi-supervised Learning with Poisson Equations, manuscript, 2022.
[15] Calder, J. and García Trillos, N., Improved spectral convergence rates for graph Laplacians on \(\varepsilon \)-graphs and k-NN graphs, Appl. Comput. Harmon. Anal., 60 (2022), pp. 123-175, doi:10.1016/j.acha.2022.02.004. · Zbl 1536.05286
[16] Calder, J. and Slepčev, D., Properly-weighted graph Laplacian for semi-supervised learning, Appl. Math. Optim., 82 (2020), pp. 1111-1159, doi:10.1007/s00245-019-09637-3. · Zbl 1465.35152
[17] Calder, J., Slepčev, D., and Thorpe, M., Rates of convergence for Laplacian semi-supervised learning with low labeling rates, Res. Math. Sci., 10 (2023), article 10. · Zbl 1538.68019
[18] Cloninger, A. and Mhaskar, H. N., Cautious active clustering, Appl. Comput. Harmon. Anal., 54 (2021), pp. 44-74, doi:10.1016/j.acha.2021.02.002. · Zbl 1467.62114
[19] Cohen, G., Afshar, S., Tapson, J., and van Schaik, A., EMNIST: An Extension of MNIST to Handwritten Letters, preprint, http://arxiv.org/abs/1702.05373, 2017.
[20] Coifman, R. R. and Lafon, S., Diffusion maps, Appl. Comput. Harmon. Anal., 21 (2006), pp. 5-30. · Zbl 1095.68094
[21] Dasarathy, G., Nowak, R., and Zhu, X., S2: An efficient graph based active learning algorithm with application to nonparametric classification, in Proceedings of the 28th Conference on Learning Theory, Grünwald, P., Hazan, E., and Kale, S., eds., Paris, France, 2015, pp. 503-522.
[22] Dasgupta, S., Coarse sample complexity bounds for active learning, in Advances in Neural Information Processing Systems, Vol. 18, MIT Press, Cambridge, MA, 2006, pp. 235-242.
[23] Dasgupta, S., Two faces of active learning, Theor. Comput. Sci., 412 (2011), pp. 1767-1781, doi:10.1016/j.tcs.2010.12.054. · Zbl 1209.68408
[24] Dasgupta, S. and Hsu, D., Hierarchical sampling for active learning, in Proceedings of the 25th International ACM Conference on Machine Learning, Helsinki, Finland, 2008, pp. 208-215, doi:10.1145/1390156.1390183.
[25] Donoho, D. L. and Grimes, C., Hessian eigenmaps: Locally linear embedding techniques for high-dimensional data, Proc. Natl. Acad. Sci. USA, 100 (2003), pp. 5591-5596. · Zbl 1130.62337
[26] Dua, D. and Graff, C., UCI Machine Learning Repository, http://archive.ics.uci.edu/ml, 2017.
[27] Dunlop, M. M., Slepčev, D., Stuart, A. M., and Thorpe, M., Large data and zero noise limits of graph-based semi-supervised learning algorithms, Appl. Comput. Harmon. Anal., 49 (2020), pp. 655-697. · Zbl 1442.62768
[28] El Alaoui, A., Cheng, X., Ramdas, A., Wainwright, M. J., and Jordan, M. I., Asymptotic behavior of \(\ell_p\)-based Laplacian regularization in semi-supervised learning, in Proceedings of the 29th Annual Conference on Learning Theory, 2016, pp. 879-906.
[29] Flores, M., Calder, J., and Lerman, G., Analysis and algorithms for \(\ell_p\)-based semi-supervised learning on graphs, Appl. Comput. Harmon. Anal., 60 (2022), pp. 77-122. · Zbl 07557810
[30] Gal, Y., Islam, R., and Ghahramani, Z., Deep Bayesian active learning with image data, in Proceedings of the 34th International Conference on Machine Learning, Sydney, NSW, Australia, 2017, pp. 1183-1192.
[31] Gao, L., Yang, H., Zhou, C., Wu, J., Pan, S., and Hu, Y., Active discriminative network representation learning, in Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, IJCAI-18, International Joint Conferences on Artificial Intelligence Organization, 2018, pp. 2142-2148, doi:10.24963/ijcai.2018/296.
[32] García Trillos, N., Gerlach, M., Hein, M., and Slepčev, D., Error estimates for spectral convergence of the graph Laplacian on random geometric graphs toward the Laplace-Beltrami operator, Found. Comput. Math., 20 (2020), pp. 827-887. · Zbl 1447.62141
[33] Ham, J., Lee, D., and Saul, L., Semisupervised alignment of manifolds, in Proceedings of the International Workshop on Artificial Intelligence and Statistics, 2005, pp. 120-127.
[34] Hanneke, S., A bound on the label complexity of agnostic active learning, in Proceedings of the 24th International ACM Conference on Machine Learning, New York, 2007, pp. 353-360, doi:10.1145/1273496.1273541.
[35] Hanneke, S., Theory of disagreement-based active learning, Found. Trends Mach. Learn., 7 (2014), pp. 131-309, doi:10.1561/2200000037. · Zbl 1327.68193
[36] Hanneke, S. and Yang, L., Minimax analysis of active learning, J. Mach. Learn. Res., 16 (2015), pp. 3487-3602, http://jmlr.org/papers/v16/hanneke15a.html. · Zbl 1351.68210
[37] He, Y., Liang, W., Zhao, D., Zhou, H.-Y., Ge, W., Yu, Y., and Zhang, W., Attribute surrogates learning and spectral tokens pooling in transformers for few-shot learning, in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022, pp. 9109-9119, doi:10.1109/CVPR52688.2022.00891.
[38] Hein, M., Audibert, J.-Y., and von Luxburg, U., Graph Laplacians and their convergence on random neighborhood graphs, J. Mach. Learn. Res., 8 (2007), pp. 1325-1368. · Zbl 1222.68215
[39] Hein, M., Audibert, J.-Y., and von Luxburg, U., From graphs to manifolds – weak and strong pointwise consistency of graph Laplacians, in Learning Theory, Auer, P. and Meir, R., eds., Springer, Berlin, Heidelberg, 2005, pp. 470-485. · Zbl 1095.68097
[40] Hu, S., Xiong, Z., Qu, M., Yuan, X., Côté, M.-A., Liu, Z., and Tang, J., Graph policy network for transferable active learning on graphs, in Advances in Neural Information Processing Systems, Larochelle, H., Ranzato, M., Hadsell, R., Balcan, M., and Lin, H., eds., Vol. 33, Curran Associates, Red Hook, NY, 2020, pp. 10174-10185.
[41] Ji, M. and Han, J., A variance minimization criterion to active learning on graphs, in Proceedings of the 15th International Conference on Artificial Intelligence and Statistics, 2012, pp. 556-564.
[42] Jiang, H. and Gupta, M., Minimum-Margin Active Learning, preprint, https://arxiv.org/abs/1906.00025, 2019.
[43] Jun, K.-S. and Nowak, R., Graph-based active learning: A new look at expected error minimization, in Proceedings of the IEEE Global Conference on Signal and Information Processing (GlobalSIP), 2016, pp. 1325-1329, doi:10.1109/GlobalSIP.2016.7906056.
[44] Karzand, M. and Nowak, R. D., Maximin active learning in overparameterized model classes, IEEE J. Sel. Areas Inf. Theory, 1 (2020), pp. 167-177, doi:10.1109/JSAIT.2020.2991518.
[45] Kingma, D. P. and Welling, M., Auto-Encoding Variational Bayes, preprint, https://arxiv.org/abs/1312.6114, 2013.
[46] Kingma, D. P. and Welling, M., An introduction to variational autoencoders, Found. Trends Mach. Learn., 12 (2019), pp. 307-392, doi:10.1561/2200000056. · Zbl 1431.68002
[47] Kushnir, D. and Venturi, L., Diffusion-Based Deep Active Learning, preprint, https://arxiv.org/abs/2003.10339, 2020.
[48] LeCun, Y. and Cortes, C., MNIST Handwritten Digit Database, http://yann.lecun.com/exdb/mnist/, 2010.
[49] Lee, W.-Y., Hsieh, L.-C., Wu, G.-L., and Hsu, W., Graph-based semi-supervised learning with multi-modality propagation for large-scale image datasets, J. Vis. Commun. Image Represent., 24 (2013), pp. 295-302.
[50] Ma, Y., Garnett, R., and Schneider, J., \( \Sigma \)-optimality for active learning on Gaussian random fields, in Advances in Neural Information Processing Systems, Vol. 26, Burges, C. J. C., Bottou, L., Welling, M., Ghahramani, Z., and Weinberger, K. Q., eds., Curran Associates, Red Hook, NY, 2013, pp. 2751-2759.
[51] Miller, K. and Bertozzi, A. L., Model-change active learning in graph-based semi-supervised learning, to appear in Communications on Applied Mathematics and Computation (CAMC), Springer Nature, 2023.
[52] Miller, K., Li, H., and Bertozzi, A. L., Efficient graph-based active learning with probit likelihood via Gaussian approximations, in ICML Workshop on Experimental Design and Active Learning, 2020, http://arxiv.org/abs/2007.11126.
[53] Miller, K., Mauro, J., Setiadi, J., Baca, X., Shi, Z., Calder, J., and Bertozzi, A., Graph-based active learning for semi-supervised classification of SAR data, in Proceedings of the Society of Photo-Optical Instrumentation Engineers (SPIE) Conference on Defense + Commercial Sensing, 2022.
[54] Mirzasoleiman, B., Big Data Summarization Using Submodular Functions, Ph.D. thesis, ETH Zurich, Zurich, Switzerland, 2017.
[55] Murphy, J. M. and Maggioni, M., Unsupervised clustering and active learning of hyperspectral images with nonlinear diffusion, IEEE Trans. Geosci. Remote Sens., 57 (2019), pp. 1829-1845, doi:10.1109/TGRS.2018.2869723.
[56] Nadler, B., Srebro, N., and Zhou, X., Semi-supervised learning with the graph Laplacian: The limit of infinite unlabelled data, in Proceedings of the 22nd International Conference on Neural Information Processing Systems, NIPS’09, Curran Associates, Red Hook, NY, 2009, pp. 1330-1338.
[57] Qiao, Y.-L., Shi, C. X., Wang, C., Li, H., Haberland, M., Luo, X., Stuart, A. M., and Bertozzi, A. L., Uncertainty quantification for semi-supervised multi-class classification in image processing and ego-motion analysis of body-worn videos, Image Process. Algorithms Syst., (2019), doi:10.2352/issn.2470-1173.2019.11.ipas-264.
[58] Sellars, P., Aviles-Rivero, A. I., and Schönlieb, C.-B., LaplaceNet: A hybrid graph-energy neural network for deep semisupervised classification, IEEE Trans. Neural Netw. Learn. Syst., (2022), pp. 1-13, doi:10.1109/TNNLS.2022.3203315.
[59] Sener, O. and Savarese, S., Active Learning for Convolutional Neural Networks: A Core-Set Approach, preprint, http://arxiv.org/abs/1708.00489, 2018.
[60] Settles, B., Active Learning, Morgan & Claypool Publishers LLC, San Rafael, CA, 2012, doi:10.2200/s00429ed1v01y201207aim018. · Zbl 1270.68006
[61] Shi, Z., Osher, S., and Zhu, W., Weighted nonlocal Laplacian on interpolation from sparse data, J. Sci. Comput., 73 (2017), pp. 1164-1177. · Zbl 1381.65007
[62] Shui, C., Zhou, F., Gagné, C., and Wang, B., Deep active learning: Unified and principled method for query and training, in Proceedings of the International Conference on Artificial Intelligence and Statistics, https://api.semanticscholar.org/CorpusID:208202390, 2020.
[63] Siméoni, O., Budnik, M., Avrithis, Y., and Gravier, G., Rethinking deep active learning: Using unlabeled data at model training, in Proceedings of the 25th International Conference on Pattern Recognition (ICPR), 2021, doi:10.1109/ICPR48806.2021.9412716.
[64] Slepčev, D. and Thorpe, M., Analysis of p-Laplacian regularization in semisupervised learning, SIAM J. Math. Anal., 51 (2019), pp. 2085-2120, doi:10.1137/17M115222X. · Zbl 1422.49020
[65] Sohn, K., Berthelot, D., Carlini, N., Zhang, Z., Zhang, H., Raffel, C. A., Cubuk, E. D., Kurakin, A., and Li, C.-L., FixMatch: Simplifying semi-supervised learning with consistency and confidence, in Advances in Neural Information Processing Systems, Vol. 33, Larochelle, H., Ranzato, M., Hadsell, R., Balcan, M., and Lin, H., eds., Curran Associates, Red Hook, NY, 2020, pp. 596-608.
[66] Tong, S. and Koller, D., Support vector machine active learning with applications to text classification, J. Mach. Learn. Res., 2 (2001), pp. 45-66. · Zbl 1009.68131
[67] Vahidian, S., Mirzasoleiman, B., and Cloninger, A., Coresets for estimating means and mean square error with limited greedy samples, in Proceedings of the 36th Conference on Uncertainty in Artificial Intelligence (UAI), 2020, pp. 350-359.
[68] von Luxburg, U., A tutorial on spectral clustering, Stat. Comput., 17 (2007), pp. 395-416, doi:10.1007/s11222-007-9033-z.
[69] Wang, B., Tu, Z., and Tsotsos, J. K., Dynamic label propagation for semi-supervised multi-class multi-label classification, in Proceedings of the IEEE International Conference on Computer Vision, 2013, pp. 425-432.
[70] Kipf, T. N. and Welling, M., Semi-supervised classification with graph convolutional networks, in Proceedings of the International Conference on Learning Representations, 2017.
[71] Xiao, H., Rasul, K., and Vollgraf, R., Fashion-MNIST: A Novel Image Dataset for Benchmarking Machine Learning Algorithms, preprint, http://arxiv.org/abs/1708.07747, 2017.
[72] Zhang, N., Li, L., Chen, X., Deng, S., Bi, Z., Tan, C., Huang, F., and Chen, H., Differentiable prompt makes pre-trained language models better few-shot learners, in Proceedings of the International Conference on Learning Representations, 2022, https://openreview.net/forum?id=ek9a0qIafW.
[73] Zhang, Y., Tong, H., Xia, Y., Zhu, Y., Chi, Y., and Ying, L., Batch active learning with graph neural networks via multi-agent deep reinforcement learning, in Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 36, 2022, pp. 9118-9126, doi:10.1609/aaai.v36i8.20897.
[74] Zheng, M., You, S., Huang, L., Wang, F., Qian, C., and Xu, C., SimMatch: Semi-supervised learning with similarity matching, in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022, pp. 14451-14461, doi:10.1109/CVPR52688.2022.01407.
[75] Zhou, J., Cui, G., Hu, S., Zhang, Z., Yang, C., Liu, Z., Wang, L., Li, C., and Sun, M., Graph neural networks: A review of methods and applications, AI Open, 1 (2020), pp. 57-81, doi:10.1016/j.aiopen.2021.01.001.
[76] Zhou, X. and Belkin, M., Semi-supervised learning by higher order regularization, in Proceedings of the 14th International Conference on Artificial Intelligence and Statistics, JMLR Workshop and Conference Proceedings, 2011, pp. 892-900.
[77] Zhu, D., Li, Z., Wang, X., Gong, B., and Yang, T., A robust zero-sum game framework for pool-based active learning, in Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019, pp. 517-526.
[78] Zhu, X., Ghahramani, Z., and Lafferty, J., Semi-supervised learning using Gaussian fields and harmonic functions, in Proceedings of the 20th International Conference on Machine Learning, AAAI Press, Washington, D.C., 2003, pp. 912-919.
[79] Zhu, X., Lafferty, J., and Ghahramani, Z., Combining active learning and semi-supervised learning using Gaussian fields and harmonic functions, in Proceedings of the International Conference on Machine Learning (ICML) Workshop on the Continuum from Labeled to Unlabeled Data in Machine Learning and Data Mining, 2003, pp. 58-65.