Adaptive semi-supervised learning from stronger augmentation transformations of discrete text information

X Zhang, Z Tan, F Lu, R Yan, J Liu - Knowledge and Information Systems, 2024 - Springer
Abstract
Semi-supervised learning is a promising approach to the problem of insufficient labeled data. Recent methods, grouped into the paradigms of consistency regularization and pseudo-labeling, perform strongly on image data but achieve only limited improvements on textual data, because they neglect the discrete nature of text and lack high-quality text augmentation transformations. In this paper, we propose the novel SeqMatch method. It automatically detects abnormal model states caused by anomalous examples produced by text augmentation, reduces their interference, and instead leverages normal examples to improve the effectiveness of consistency regularization. It also generates hard artificial pseudo-labels so that the model can be efficiently updated and optimized toward low entropy. We further design several much stronger, well-organized text augmentation pipelines that increase the divergence between the two views of an unlabeled discrete textual sequence, enabling the model to learn more from aligning them. Extensive comparative experiments show that SeqMatch significantly outperforms previous methods on three widely used benchmarks. In particular, SeqMatch achieves a maximum performance improvement of 16.4% over purely supervised training when provided with a minimal number of labeled examples.
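To make the abstract's ingredients concrete, the sketch below shows a generic consistency-regularization loss with hard pseudo-labels on weakly and strongly augmented views of unlabeled text. It is a minimal illustration only: the `model` and `encode` interfaces, the token-level augmentation functions, and the confidence threshold `tau` are assumptions, and the confidence mask merely stands in for SeqMatch's handling of anomalous augmented data, which the abstract describes only at a high level.

```python
# Minimal sketch: FixMatch-style consistency regularization with hard
# pseudo-labels for text. `model` (token-id batch -> logits) and `encode`
# (list of token lists -> padded tensor batch) are assumed to be provided.
import random
import torch
import torch.nn.functional as F

def weak_aug(tokens, drop_prob=0.05):
    """Light augmentation: randomly drop a small fraction of tokens."""
    kept = [t for t in tokens if random.random() > drop_prob]
    return kept or list(tokens)

def strong_aug(tokens, drop_prob=0.2, swap_prob=0.2):
    """Stronger pipeline: heavier token dropout followed by local swaps."""
    out = [t for t in tokens if random.random() > drop_prob] or list(tokens)
    for i in range(len(out) - 1):
        if random.random() < swap_prob:
            out[i], out[i + 1] = out[i + 1], out[i]
    return out

def unsupervised_loss(model, encode, unlabeled_texts, tau=0.95):
    """Hard pseudo-label consistency loss on unlabeled sequences."""
    weak_batch = encode([weak_aug(t.split()) for t in unlabeled_texts])
    strong_batch = encode([strong_aug(t.split()) for t in unlabeled_texts])

    with torch.no_grad():
        probs = F.softmax(model(weak_batch), dim=-1)
        conf, pseudo = probs.max(dim=-1)      # hard pseudo-labels from weak view
        mask = (conf >= tau).float()          # keep only confident predictions

    logits_strong = model(strong_batch)       # align strong view to pseudo-labels
    per_example = F.cross_entropy(logits_strong, pseudo, reduction="none")
    return (mask * per_example).mean()
```

The unsupervised term is typically added to the standard supervised cross-entropy on the labeled batch; stronger augmentation of the second view increases the divergence the model must bridge, which is the effect the abstract attributes to its augmentation pipelines.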