Three-way decision-based tri-training with entropy minimization

L Pan, C Gao, J Zhou - Information Sciences, 2022 - Elsevier
Abstract
The three-way decision (TWD) theory is an effective methodology and philosophy for thinking in threes and has been successfully applied to knowledge reasoning and decision making. However, limited research has been devoted to using TWD to learn from partially labeled data with both discrete and continuous attributes. In this study, we propose a TWD-based tri-training model for partially labeled data with heterogeneous attributes. First, a measure of semi-supervised neighborhood mutual information is defined, based on which a heuristic algorithm is developed to generate an optimal semi-supervised reduct of partially labeled data. Then, a tri-training model is trained on the original view along with two views transformed by data discretization and principal component analysis. The strategy of TWD with entropy minimization is further introduced to classify unlabeled data into useful, uncertain, and useless samples, and the multiview tri-training model is iteratively retrained on only a certain number of useful samples with low entropy to improve its performance. Finally, the effectiveness of the proposed model is theoretically analyzed from the perspective of noise learning. The experimental results of semi-supervised attribute reduction and semi-supervised classification on UCI datasets show that our method is effective in handling partially labeled data and outperforms supervised models trained on all data with full supervision.
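The three-way split of unlabeled samples by entropy can be illustrated with a minimal sketch. The abstract does not give the decision rule, so everything specific here is an assumption made for illustration: the use of Shannon entropy over a single predicted class distribution per sample, the two thresholds `alpha` and `beta`, and the helper names `three_way_partition` and `select_low_entropy`. It is not the authors' algorithm.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy (natural log) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def three_way_partition(prob_rows, alpha, beta):
    """Split samples into useful / uncertain / useless by prediction entropy.

    Assumed rule (hypothetical, not taken from the paper):
      entropy <= alpha          -> useful
      entropy >= beta           -> useless
      alpha < entropy < beta    -> uncertain
    """
    entropies = [shannon_entropy(row) for row in prob_rows]
    useful, uncertain, useless = [], [], []
    for i, h in enumerate(entropies):
        if h <= alpha:
            useful.append(i)
        elif h >= beta:
            useless.append(i)
        else:
            uncertain.append(i)
    return useful, uncertain, useless, entropies


def select_low_entropy(useful, entropies, k):
    """Keep at most k useful samples, lowest entropy first."""
    return sorted(useful, key=lambda i: entropies[i])[:k]


# Toy predicted distributions for four unlabeled samples (made-up values).
prob_rows = [
    [0.98, 0.02],
    [0.50, 0.50],
    [0.80, 0.20],
    [0.95, 0.05],
]
useful, uncertain, useless, entropies = three_way_partition(
    prob_rows, alpha=0.3, beta=0.6
)
chosen = select_low_entropy(useful, entropies, k=1)
```

In a tri-training loop, only the indices in `chosen` would be pseudo-labeled and added to the training set for the next round; how the three views' predictions are combined into one distribution is left open here because the abstract does not specify it.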