Quantity optimization of virtual sample generation with two kinds of upper bound conditions. (Chinese. English summary) Zbl 1438.68096

Summary: For small sample data sets, virtual sample generation has been shown to effectively improve the performance of machine learning algorithms. However, there is no definite conclusion about the optimal number of virtual samples to generate. First, given an upper bound on the standard variance of a training sample, information entropy theory is used to study the optimal number of virtual samples. Second, the noise introduced by virtual sample generation is taken into account, and a general probability model and an analysis method for the optimal number of virtual samples are established at a given confidence level (0.95). A small sample data set is built from the 2016 historical fault monitoring data of a substation in Huzhou, Zhejiang, and four virtual sample generation experiments are designed. The results show that the two rules for the optimal number of virtual samples are effective, and that the accuracy of the corresponding machine learning predictions is clearly improved.

MSC:

68T05 Learning and adaptive systems in artificial intelligence
Full Text: DOI