
Label propagation via local geometry preserving for deep semi-supervised image recognition. (English) Zbl 1521.68152

Summary: In this paper, we propose a novel transductive pseudo-labeling method for deep semi-supervised image recognition. Motivated by the observation that pseudo labels inferred by label propagation are superior to those inferred directly from the network, we argue that the information flow from labeled to unlabeled data should be kept noiseless and with minimal loss. Previous works use the scarce labeled data alone for feature learning and consider only the relationship between pairs of feature vectors when constructing the similarity graph in feature space. This causes two problems that ultimately make the information flow from labeled to unlabeled data noisy and incomplete: first, the learned feature mapping is highly likely to be biased and can easily overfit noise; second, local geometry information in feature space is lost during label propagation. Accordingly, we first propose to incorporate self-supervised learning into feature learning, which yields a cleaner information flow in feature space during the subsequent label propagation. Second, we propose to measure pairwise similarity in feature space via reconstruction, so that local geometry information is preserved. An ablation study confirms the synergy between features learned with self-supervision and the similarity graph with local geometry preserving. Extensive experiments on benchmark datasets verify the effectiveness of the proposed method.
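The pipeline the summary describes — build a similarity graph whose edge weights reconstruct each sample from its neighbors (preserving local geometry), then propagate labels from labeled to unlabeled samples — can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the function names and the nonnegative-weight clipping heuristic are our own, and the self-supervised feature-learning stage is omitted (`X` stands for features already extracted by the network).

```python
import numpy as np

def reconstruction_weights(X, k=5):
    """Weight each sample's edges by reconstructing it from its k nearest
    neighbours (locally-linear-embedding-style), preserving local geometry."""
    n = X.shape[0]
    W = np.zeros((n, n))
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # pairwise sq. distances
    for i in range(n):
        nbrs = np.argsort(d2[i])[1:k + 1]       # skip the point itself
        Z = X[nbrs] - X[i]                      # centre neighbours on x_i
        G = Z @ Z.T                             # local Gram matrix
        G += 1e-3 * np.trace(G) * np.eye(k)     # regularise for stability
        w = np.linalg.solve(G, np.ones(k))      # minimise ||x_i - sum_j w_j x_j||^2
        w = np.maximum(w, 0.0)                  # heuristic: keep weights nonnegative
        W[i, nbrs] = w / w.sum()                # weights sum to 1
    return W

def propagate_labels(W, y, n_classes, alpha=0.99, n_iter=50):
    """Iterate F <- alpha * S F + (1 - alpha) * Y on the normalised graph;
    entries of y below 0 mark unlabelled samples."""
    n = W.shape[0]
    Y = np.zeros((n, n_classes))
    labeled = y >= 0
    Y[labeled, y[labeled]] = 1.0
    A = np.maximum(W, W.T)                      # symmetrise the graph
    d = A.sum(1) + 1e-12
    S = A / np.sqrt(np.outer(d, d))             # symmetric normalisation
    F = Y.copy()
    for _ in range(n_iter):
        F = alpha * S @ F + (1 - alpha) * Y
    return F.argmax(1)                          # pseudo labels for all samples
```

Because the graph weights come from solving a local reconstruction problem rather than from a pointwise kernel, each sample's edges jointly encode how its neighborhood is arranged, which is the "local geometry preserving" idea the summary contrasts with purely pairwise similarities.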

MSC:

68T05 Learning and adaptive systems in artificial intelligence
68T07 Artificial neural networks and deep learning
68T10 Pattern recognition, speech recognition
68U10 Computing methodologies for image processing
