Abstract
Traditional data stream classification techniques cannot recognize new classes that emerge in a data stream. A recently proposed ensemble classification framework addresses this challenge, but its novel class detection technique is limited to numeric data, and its base classifier suffers from slow processing and a large model size. This paper proposes a novel class instance detection technique that handles mixed-attribute data and adopts VFDTc as the base classifier to speed up processing and reduce the model size. Experimental results show that the algorithm outperforms the previous one in both classification accuracy and processing speed.
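To make the notion of "mixed attribute data" concrete, the following is a minimal sketch of a dissimilarity measure over records with both numeric and categorical attributes. It follows the k-prototypes style (squared Euclidean distance on numeric attributes plus a weighted count of categorical mismatches). The abstract does not specify the measure actually used in the paper, so the function name `mixed_dissimilarity`, the weight `gamma`, and the threshold test in `is_outside_cluster` are illustrative assumptions rather than the authors' method.

```python
from typing import Sequence


def mixed_dissimilarity(
    x_num: Sequence[float],
    x_cat: Sequence[str],
    c_num: Sequence[float],
    c_cat: Sequence[str],
    gamma: float = 1.0,
) -> float:
    """k-prototypes-style dissimilarity between an instance and a cluster prototype.

    Assumption: numeric attributes contribute squared Euclidean distance and
    categorical attributes contribute gamma per mismatch. The paper's own
    measure may differ.
    """
    numeric_part = sum((a - b) ** 2 for a, b in zip(x_num, c_num))
    categorical_part = sum(1 for a, b in zip(x_cat, c_cat) if a != b)
    return numeric_part + gamma * categorical_part


def is_outside_cluster(dissimilarity: float, radius: float) -> bool:
    """Hypothetical check: an instance farther from a prototype than the
    cluster radius is a candidate for further novel-class analysis."""
    return dissimilarity > radius


# Example: one numeric attribute differs by 2.0, one categorical attribute mismatches.
d = mixed_dissimilarity([1.0, 2.0], ["a", "b"], [1.0, 4.0], ["a", "c"], gamma=0.5)
print(d, is_outside_cluster(d, radius=3.0))
```

The weight `gamma` balances the two attribute types: a larger value makes categorical mismatches count for more relative to numeric differences.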
The work described in this paper is supported by the Foundation of Guangxi Key Laboratory of Trustworthy Software, China (kx201116) and the Educational Commission of Guangxi Province, China (201204LX122).
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Miao, Y., Qiu, L., Chen, H., Zhang, J., Wen, Y. (2013). Novel Class Detection within Classification for Data Streams. In: Guo, C., Hou, ZG., Zeng, Z. (eds) Advances in Neural Networks – ISNN 2013. ISNN 2013. Lecture Notes in Computer Science, vol 7952. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39068-5_50
DOI: https://doi.org/10.1007/978-3-642-39068-5_50
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-39067-8
Online ISBN: 978-3-642-39068-5
eBook Packages: Computer Science (R0)