The fuzzy C-means algorithm with fuzzy P-mode prototypes for clustering objects having mixed features. (English) Zbl 1185.68601

Summary: Frequency-based cluster prototypes have been used to cluster categorical objects under the simple matching dissimilarity measure. This paper introduces a generalization of frequency-based prototypes, called the fuzzy \(p\)-mode prototype. At a categorical feature, a fuzzy \(p\)-mode cluster prototype is expressed as a list of the \(p\) labels that have larger frequencies than the other labels in the cluster. The paper also presents a generalization of the fuzzy C-means clustering algorithm for objects with mixed features. Unlike other clustering algorithms, which use the simple matching dissimilarity, the general fuzzy C-means clustering algorithm allows any dissimilarity measure at the categorical feature level. The convergence of the general fuzzy C-means clustering algorithm is proved under the optimization framework. Experiments on real object sets also show that the size of the fuzzy \(p\)-mode prototypes and the fuzzification coefficients affect clustering performance.
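To make the idea of a fuzzy \(p\)-mode prototype concrete, the following is a minimal sketch for a single categorical feature of one cluster. It is not taken from the paper: the function name `fuzzy_p_mode`, the weighting of each object by its membership raised to the fuzzification coefficient `m`, and the normalization of the retained weights are all assumptions made for illustration. The summary only states that the prototype lists the \(p\) labels with the largest frequencies.

```python
from collections import defaultdict


def fuzzy_p_mode(labels, memberships, p, m=2.0):
    """Sketch of a fuzzy p-mode prototype for one categorical feature.

    labels      -- categorical value of each object at this feature
    memberships -- membership degree of each object in the cluster
    p           -- number of labels kept in the prototype
    m           -- fuzzification coefficient (assumed weighting: u ** m)

    Returns a list of (label, weight) pairs, highest weight first.
    """
    if p < 1:
        raise ValueError("p must be at least 1")
    if len(labels) != len(memberships):
        raise ValueError("labels and memberships must have equal length")

    # Accumulate a membership-weighted frequency for every label (assumption).
    freq = defaultdict(float)
    for label, u in zip(labels, memberships):
        freq[label] += u ** m

    # Keep the p labels with the largest weighted frequency;
    # ties are broken alphabetically so the result is deterministic.
    top = sorted(freq.items(), key=lambda kv: (-kv[1], kv[0]))[:p]

    # Normalize the retained weights so they sum to 1 (assumption).
    total = sum(w for _, w in top)
    if total == 0.0:
        return [(label, 0.0) for label, _ in top]
    return [(label, w / total) for label, w in top]


if __name__ == "__main__":
    colours = ["red", "blue", "red", "green", "blue", "red"]
    degrees = [0.9, 0.8, 0.7, 0.2, 0.1, 0.6]
    print(fuzzy_p_mode(colours, degrees, p=2))
```

With `p=1` this sketch reduces to a single most frequent label, which corresponds to the usual mode-style prototype. Larger `p` keeps more of the label distribution in the prototype.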

MSC:

68T10 Pattern recognition, speech recognition
68T05 Learning and adaptive systems in artificial intelligence

Software:

UCI-ml

References:

[1] Asuncion, A.; Newman, D. J., UCI Machine Learning Repository (2007), School of Information and Computer Sciences, University of California, Irvine
[2] Bezdek, J. C., Pattern Recognition with Fuzzy Objective Function Algorithms (1981), Plenum Press: New York · Zbl 0503.68069
[3] Dave, R. N., Validating fuzzy partitions obtained through C-shells clustering, Pattern Recognition Letters, 17, 613-623 (1996)
[4] Dunn, J. C., A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters, Journal of Cybernetics, 3, 32-57 (1973) · Zbl 0291.68033
[5] He, Z.; Deng, S.; Xu, X., Improving K-modes algorithm considering frequencies of attribute values in mode, (Computational Intelligence and Security (2005)), 157-162
[6] Huang, Z., Extensions to the k-means algorithm for clustering large data sets with categorical values, Data Mining and Knowledge Discovery, 2, 3, 283-304 (1998)
[7] Huang, Z.; Ng, M. K., A fuzzy k-modes algorithm for clustering categorical data, IEEE Transactions on Fuzzy Systems, 7, 4, 446-452 (1999)
[8] Kim, D. W.; Lee, K. H.; Lee, D., Fuzzy clustering of categorical data using fuzzy centroids, Pattern Recognition Letters, 25, 1263-1271 (2004)
[9] Lee, M., Fuzzy cluster validity index based on object proximities defined over fuzzy partition matrices, (IEEE Internat. Conf. on Fuzzy Systems, Hong Kong, China (2008)), 336-340
[10] Lee, M., Mapping of ordinal feature values to numerical values through fuzzy clustering, (IEEE Internat. Conf. on Fuzzy Systems, Hong Kong, China (2008)), 732-737
[11] Lee, M., On Fuzzy Clustering Validity Indices for the Objects of Mixed Features (2008), Thompson Rivers University
[12] MacQueen, J. B., Some methods of classification and analysis of multivariate observations, (Berkeley Symp. on Mathematical Statistics and Probability, Vol. 1 (1967), University of California Press: Berkeley), 281-297 · Zbl 0214.46201
[13] Ng, M. K.; Li, M. J.; Huang, J. Z.; He, Z., On the impact of dissimilarity measure in k-modes clustering algorithm, IEEE Transactions on Pattern Analysis and Machine Intelligence, 29, 3, 503-507 (2007)
[14] San, O. M.; Huynh, V.-N.; Nakamori, Y., An alternative extension of the k-means algorithm for clustering categorical data, International Journal of Applied Mathematics and Computer Science, 14, 2, 241-247 (2004) · Zbl 1061.62091
[15] Tang, Y.; Sun, F.; Sun, Z., Improved validation index for fuzzy clustering, (American Control Conf. (2005)), 1120-1125
[16] Xie, X. L.; Beni, G. A., Validity measures for fuzzy clustering, IEEE Transactions on Pattern Analysis and Machine Intelligence, 3, 841-846 (1991)