Fuzzy clustering in parallel universes. (English) Zbl 1124.68101

Summary: We present an extension of the fuzzy \(c\)-means algorithm, which operates simultaneously on different feature spaces – so-called parallel universes – and also incorporates noise detection. The method assigns membership values of patterns to different universes, which are then adapted throughout the training. This leads to better clustering results since patterns not contributing to clustering in a universe are (completely or partially) ignored. The method also uses an auxiliary universe to capture patterns that do not contribute to any of the clusters in the real universes and therefore are likely to represent noise. The outcome of the algorithm is a set of clusters distributed over different parallel universes, each modeling a particular, potentially overlapping subset of the data, together with a set of patterns detected as noise. One potential target application of the proposed method is biological data analysis, where different descriptors for molecules are available but none of them by itself yields globally satisfactory prediction results.
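The summary gives no update equations, so the following is only a minimal sketch of the idea, not the authors' formulation. It assumes the standard fuzzy \(c\)-means membership rule inside each universe, a fixed noise distance in the style of noise clustering for the auxiliary universe, and a normalization in which the universe memberships and the noise membership of a pattern sum to one. The function names, the parameter `noise_sq_dist`, and the fuzzifiers `m` and `n` are hypothetical.

```python
def fcm_memberships(sq_dists, m=2.0):
    """Standard fuzzy c-means memberships of one pattern to the
    clusters of a single universe, given squared distances."""
    exponent = 1.0 / (m - 1.0)
    # Guard against a zero distance (pattern sits on a prototype).
    inv = [(1.0 / max(d, 1e-12)) ** exponent for d in sq_dists]
    total = sum(inv)
    return [v / total for v in inv]


def universe_memberships(sq_dists_per_universe, noise_sq_dist, m=2.0, n=2.0):
    """Assumed rule: weight each universe by how cheaply it explains the
    pattern, with the noise universe competing at a fixed cost.

    Returns (z, noise), where z[u] is the membership of the pattern to
    universe u and noise is its membership to the auxiliary universe.
    """
    costs = []
    for sq_dists in sq_dists_per_universe:
        v = fcm_memberships(sq_dists, m)
        # Fuzzy within-universe cost of the pattern.
        costs.append(sum((vk ** m) * d for vk, d in zip(v, sq_dists)))
    costs.append(noise_sq_dist)  # auxiliary noise universe

    exponent = 1.0 / (n - 1.0)
    inv = [(1.0 / max(c, 1e-12)) ** exponent for c in costs]
    total = sum(inv)
    weights = [w / total for w in inv]
    return weights[:-1], weights[-1]


# A pattern close to a cluster in universe 0 but far from all clusters in universe 1.
z_good, noise_good = universe_memberships([[0.01, 4.0], [9.0, 16.0]], noise_sq_dist=1.0)

# A pattern far from every cluster in both universes.
z_far, noise_far = universe_memberships([[25.0, 36.0], [49.0, 64.0]], noise_sq_dist=1.0)
```

Under these assumptions the first pattern is assigned almost entirely to universe 0, while the second is assigned mostly to the noise universe, which mirrors the behavior the summary describes: a universe in which a pattern does not fit any cluster is partially or completely ignored for that pattern.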

MSC:

68T10 Pattern recognition, speech recognition
62H30 Classification and discrimination; cluster analysis (statistical aspects)

Software:

COSA
