
An incremental learning algorithm based on the \( K\)-associated graph for non-stationary data classification. (English) Zbl 1320.68132

Summary: Non-stationary classification problems concern changes in the data distribution over a classifier's lifetime. To face this problem, a learning algorithm must reconcile essential but hard-to-combine attributes: good classification performance, stability, and low associated costs in processing time and memory. This paper presents an extension of the \( K\)-associated optimal graph learning algorithm to cope with classification over non-stationary domains. The algorithm relies on a graph structure consisting of many disconnected components (subgraphs). Such a graph enhances data representation by locally fitting groups of data according to a purity measure, which, in turn, quantifies the overlap between vertices of different classes. As a result, the graph can be used to accurately estimate the probability that an unlabeled instance belongs to a given class. The proposed algorithm benefits from the dynamical evolution of the graph, updating its set of components as new data arrive over time and removing old components as new ones arise. Experimental results on artificial and real domains, together with statistical analysis, show that the proposed algorithm is an effective solution to non-stationary classification problems.
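The summary describes the mechanism only at a high level. As a rough illustrative sketch (not the authors' implementation), the following code builds a static \( K\)-associated graph in the spirit of [4]: each vertex links to its \( K\) nearest neighbours, only same-class edges are kept (so the graph splits into disconnected single-class components), purity is taken as average in+out degree normalised by \( 2K\), and an unlabeled point is scored by purity-weighted connections. The `classify` scoring rule here is a simplified stand-in for the paper's posterior estimate, and the incremental component replacement is omitted.

```python
import math
from collections import defaultdict


def knn(points, i, k):
    """Indices of the k nearest neighbours of points[i] (Euclidean)."""
    order = sorted((j for j in range(len(points)) if j != i),
                   key=lambda j: math.dist(points[i], points[j]))
    return order[:k]


def build_kag(points, labels, k):
    """K-associated graph: each vertex points to its k nearest neighbours,
    but only same-class edges are kept, so the graph decomposes into
    disconnected single-class components (subgraphs)."""
    adj, deg = defaultdict(set), defaultdict(int)
    for i in range(len(points)):
        for j in knn(points, i, k):
            if labels[i] == labels[j]:
                deg[i] += 1          # out-degree of i
                deg[j] += 1          # in-degree of j
                adj[i].add(j)
                adj[j].add(i)
    seen, comps = set(), []
    for s in range(len(points)):     # connected components via DFS
        if s in seen:
            continue
        stack, comp = [s], []
        seen.add(s)
        while stack:
            v = stack.pop()
            comp.append(v)
            for u in adj[v]:
                if u not in seen:
                    seen.add(u)
                    stack.append(u)
        comps.append(comp)
    return comps, deg


def purity(comp, deg, k):
    """Average (in+out) degree over 2k: equals 1.0 when every vertex found
    all k of its neighbours inside its own class (no class overlap)."""
    return sum(deg[v] for v in comp) / (2 * k * len(comp))


def classify(x, points, labels, comps, deg, k):
    """Score each class by purity-weighted hits among x's k nearest
    neighbours -- a simplified stand-in for the paper's posterior."""
    neigh = set(sorted(range(len(points)),
                       key=lambda j: math.dist(x, points[j]))[:k])
    score = defaultdict(float)
    for comp in comps:
        hits = sum(1 for v in comp if v in neigh)
        if hits:
            score[labels[comp[0]]] += hits * purity(comp, deg, k)
    return max(score, key=score.get)


# Two well-separated clusters: each component reaches purity 1.0,
# and new points are assigned to the class of the nearby component.
pts = [(0, 0), (0, 1), (1, 0), (1, 1),
       (10, 10), (10, 11), (11, 10), (11, 11)]
lab = [0, 0, 0, 0, 1, 1, 1, 1]
comps, deg = build_kag(pts, lab, k=2)
```

The non-stationary extension reviewed here would, on top of this, re-run the component construction on incoming data and discard stale components whose region of the input space is superseded by newer, purer components.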

MSC:

68T05 Learning and adaptive systems in artificial intelligence
Full Text: DOI

References:

[2] Belkin, M.; Niyogi, P.; Sindhwani, V., Manifold regularization: a geometric framework for learning from labeled and unlabeled examples, Journal of Machine Learning Research, 1, 1-48 (2006) · Zbl 1222.68144
[3] Bertini, J. R.; Lopes, A. A.; Zhao, L., Partially labeled data stream classification with the semi-supervised k-associated graph, Journal of the Brazilian Computer Society, 18, 299-310 (2012)
[4] Bertini, J. R.; Zhao, L.; Motta, R.; Lopes, A. A., A nonparametric classification method based on k-associated graphs, Information Sciences, 181, 5435-5456 (2011)
[5] (Chapelle, O.; Zien, A.; Schölkopf, B., Semi-Supervised Learning (2006), MIT Press)
[6] Chariatis, A., Very fast online learning of highly non linear problems, Journal of Machine Learning Research, 8, 2017-2045 (2007) · Zbl 1222.68165
[7] Chen, J.; Fang, H.-r.; Saad, Y., Fast approximate KNN graph construction for high dimensional data via recursive Lanczos bisection, Journal of Machine Learning Research, 10, 1989-2012 (2009) · Zbl 1235.68137
[8] Demšar, J., Statistical comparisons of classifiers over multiple data sets, Journal of Machine Learning Research, 7, 1-30 (2006) · Zbl 1222.68184
[9] Deng, X.; Wang, X., Incremental learning of dynamic fuzzy neural networks for accurate system modeling, Fuzzy Sets and Systems, 160, 972-987 (2009) · Zbl 1182.68164
[10] Duan, H.; Shao, X.; Hou, W.; He, G.; Zeng, Q., An incremental learning algorithm for Lagrangian support vector machines, Pattern Recognition Letters, 30, 1384-1391 (2009)
[11] Duda, R.; Hart, P.; Stork, D., Pattern Classification (2001), John Wiley & Sons, Inc. · Zbl 0968.68140
[12] Fdez-Riverola, F.; Iglesias, E. L.; Díaz, F.; Méndez, J.; Corchado, J., Applying lazy learning algorithms to tackle concept drift in spam filtering, Expert Systems with Applications, 33, 36-48 (2007)
[13] Fern, A.; Givan, R., Online ensemble learning: an empirical study, Machine Learning, 53, 71-109 (2003) · Zbl 1076.68548
[14] Friedman, J.; Bentley, J.; Finkel, R., An algorithm for finding best matches in logarithmic expected time, ACM Transactions on Mathematical Software, 3, 209-226 (1977) · Zbl 0364.68037
[15] Gama, J.; Medas, P.; Castillo, G.; Rodrigues, P., Learning with drift detection, (Proceedings of the Brazilian Symposium on Artificial Intelligence, vol. 3171 (2004), Springer), 286-295 · Zbl 1105.68376
[16] Giraud-Carrier, C., A note on the utility of incremental learning, AI Communications, 13, 215-223 (2000) · Zbl 0967.68087
[17] Hastie, T.; Tibshirani, R.; Friedman, J., The Elements of Statistical Learning: Data Mining, Inference and Prediction (2009), Springer-Verlag: Springer-Verlag Canada · Zbl 1273.62005
[19] Hodge, V.; Austin, J., A survey of outlier detection methodologies, Artificial Intelligence Review, 22, 85-126 (2004) · Zbl 1101.68023
[20] Hulten, G.; Spencer, L.; Domingos, P., Mining time-changing data streams, (Proceedings of the International Conference on Knowledge Discovery and Data Mining (2001), ACM Press), 97-106
[21] Kolter, J. Z.; Maloof, M. A., Dynamic weighted majority: an ensemble method for drifting concepts, Journal of Machine Learning Research, 8, 2755-2790 (2007) · Zbl 1222.68237
[23] Last, M., Online classification of nonstationary data streams, Intelligent Data Analysis, 6, 129-147 (2002) · Zbl 1088.68728
[24] Lughofer, E.; Angelov, P., Handling drifts and shifts in on-line data streams with evolving fuzzy systems, Applied Soft Computing, 11, 2057-2068 (2011)
[25] Maloof, M.; Michalski, R., Selecting examples for partial memory learning, Machine Learning, 41, 27-52 (2000)
[26] Maloof, M.; Michalski, R., Incremental learning with partial instance memory, Artificial Intelligence, 154, 95-126 (2004) · Zbl 1085.68641
[27] Minku, L.; White, A.; Yao, X., The impact of diversity on online ensemble learning in the presence of concept drift, IEEE Transactions on Knowledge and Data Engineering, 22, 730-742 (2010)
[28] Murata, N.; Kawanabe, M.; Ziehe, A.; Müller, K. R.; Amari, S., On-line learning in changing environments with applications in supervised and unsupervised learning, Neural Networks, 15, 743-760 (2002)
[30] Núñez, M.; Fidalgo, R.; Morales, R., Learning in environments with unknown dynamics: towards more robust concept learners, Journal of Machine Learning Research, 8, 2595-2628 (2007) · Zbl 1222.68276
[31] Polikar, R.; Udpa, L.; Udpa, S.; Honavar, V., An incremental learning algorithm with confidence estimation for automated identification of NDE signals, IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control, 51, 990-1001 (2004)
[32] Quinlan, J. R., C4.5: Programs for Machine Learning (1993), Morgan Kaufmann Publishers
[33] Schaeffer, S., Graph clustering, Computer Science Review, 1, 27-34 (2007) · Zbl 1302.68237
[34] Schlimmer, J.; Granger, R., Beyond incremental processing: tracking concept drift, (Proceedings of the National Conference on Artificial Intelligence (1986), AAAI Press), 502-507
[35] Skočaj, D.; Leonardis, A., Incremental and robust learning of subspace representations, Image and Vision Computing, 26, 27-38 (2008)
[36] Street, N.; Kim, Y., A streaming ensemble algorithm (SEA) for large-scale classification, (Proc. Int'l Conf. Knowledge Discovery and Data Mining (KDD'01) (2001), ACM: ACM NY), 377-382
[37] Sung, J.; Kim, D., Adaptive active appearance model with incremental learning, Pattern Recognition Letters, 30, 359-367 (2009)
[39] Tsymbal, A.; Pechenizkiy, M.; Cunningham, P.; Puuronen, S., Dynamic integration of classifiers for handling concept drift, Information Fusion, 9, 56-68 (2008)
[40] Utgoff, P.; Berkman, N.; Clouse, J., Decision tree induction based on efficient tree restructuring, Machine Learning, 29, 5-44 (1997) · Zbl 0886.68105
[42] Widmer, G.; Kubat, M., Learning in the presence of concept drift and hidden contexts, Machine Learning, 23, 69-101 (1996)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.