Fuzzy C-means clustering algorithms with weighted membership and distance. (English) Zbl 1504.68188

Summary: The fuzzy C-means (FCM) clustering algorithm is an important and popular clustering algorithm used in application domains such as pattern recognition, machine learning, and data mining. Although the algorithm has shown acceptable performance on diverse problems, the current literature lacks studies on how it can improve the clustering quality of partitions with overlapping classes. The better the clustering quality of a partition, the better the interpretation of the data, which is essential for understanding real problems. This work proposes two robust FCM algorithms that prevent ambiguous membership in clusters. To this end, we compute two types of weights: one weight to avoid the problem of overlapping clusters, and another weight to enable the algorithm to identify clusters of different shapes. We perform a study with synthetic datasets, each of which contains classes of different shapes and different degrees of overlap. The study also considers real application datasets. Our results indicate that these weights are effective in reducing the ambiguity of membership assignments, thus yielding a better interpretation of the data.
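For orientation, the baseline that the summary builds on can be sketched as follows. This is a minimal illustration of the standard FCM alternating updates (Bezdek's formulation, reference [21] below), not the weighted algorithms proposed in the paper, whose weight definitions are not given in this record. The function name `fuzzy_c_means`, its parameters, and the random initialization are assumptions made for the example.

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard (unweighted) FCM sketch; names and defaults are illustrative."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Random initial memberships; each row is normalized to sum to 1.
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        Um = U ** m  # fuzzified memberships, m > 1 is the fuzzifier
        # Prototype update: membership-weighted mean of the data.
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        # Squared Euclidean distance from every point to every prototype.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)  # guard against division by zero
        # Membership update: u_ik proportional to d_ik^(-2/(m-1)).
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, centers
```

A point lying between two prototypes receives memberships near 0.5 for both, which is the kind of ambiguous assignment the summary says the proposed weights are meant to reduce.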

MSC:

68T05 Learning and adaptive systems in artificial intelligence
62H30 Classification and discrimination; cluster analysis (statistical aspects)
68T10 Pattern recognition, speech recognition

Software:

DIVFRP; PRMLT; UCI-ml; Scikit
Full Text: DOI

References:

[1] Anzai, Y., Pattern Recognition and Machine Learning (Elsevier, 2012). · Zbl 0756.68088
[2] Mitchell, R., Michalski, J. and Carbonell, T., An Artificial Intelligence Approach (Springer, 2013).
[3] Witten, I. H., Frank, E. and Hall, M. A., Practical machine learning tools and techniques, Morgan Kaufmann (2005) 578. · Zbl 1076.68555
[4] Han, J., Pei, J. and Kamber, M., Data Mining: Concepts and Techniques (Elsevier, 2011). · Zbl 1445.68004
[5] Padhy, N., Mishra, D. and Panigrahi, R., The survey of data mining applications and feature scope, International Journal of Computer Science, Engineering and Information Technology (IJCSEIT) 2 (2012) 43.
[6] Bishop, C. M., Pattern Recognition and Machine Learning (Springer, 2006). · Zbl 1107.68072
[7] MacKay, D. J. C., Information Theory, Inference and Learning Algorithms (Cambridge University Press, 2003). · Zbl 1055.94001
[8] Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V. et al., Scikit-learn: Machine learning in Python, Journal of Machine Learning Research 12 (2011) 2825-2830. · Zbl 1280.68189
[9] Blum, A. and Chawla, S., Learning from labeled and unlabeled data using graph mincuts, Proceedings of International Conference on Machine Learning (ICML-2001).
[10] Breve, F., Zhao, L., Quiles, M., Pedrycz, W. and Liu, J., Particle competition and cooperation in networks for semi-supervised learning, IEEE Transactions on Knowledge and Data Engineering 24 (2011) 1686-1698.
[11] Jain, A. K., Murty, M. N. and Flynn, P. J., Data clustering: a review, ACM Computing Surveys (CSUR) 31 (1999) 264-323.
[12] Kuo, R.-J., Chen, S., Cheng, W. and Tsai, C.-Y., Integration of artificial immune network and k-means for cluster analysis, Knowledge and Information Systems 40 (2014) 541-557.
[13] Cavalcanti, R. B. D. C., Pimentel, B. A., de Almeida, C. W. and de Souza, R. M., A multivariate fuzzy kohonen clustering network, in 2019 International Joint Conference on Neural Networks (IJCNN) (IEEE, 2019), pp. 1-7.
[14] Duda, R. O., Hart, P. E. and Stork, D. G., Pattern Classification (John Wiley & Sons, 2012). · Zbl 0968.68140
[15] Gan, G., Ma, C. and Wu, J., Data Clustering: Theory, Algorithms, and Applications, Vol. 20 (Siam, 2007). · Zbl 1185.68274
[16] Xu, R. and Wunsch, D. C., Survey of clustering algorithms, IEEE Transactions on Neural Networks 16 (2005) 645-678.
[17] Jain, A. K., Data clustering: 50 years beyond k-means, Pattern Recognition Letters 31 (2010) 651-666.
[18] Cohen-Addad, V., Kanade, V., Mallmann-Trenn, F. and Mathieu, C., Hierarchical clustering: Objective functions and algorithms, Journal of the ACM (JACM) 66 (2019) 1-42. · Zbl 1473.62213
[19] Zhong, C., Miao, D., Wang, R. and Zhou, X., Divfrp: An automatic divisive hierarchical clustering method based on the furthest reference points, Pattern Recognition Letters 29 (2008) 2067-2077.
[20] Pal, N. R. and Sarkar, K., What and when can we gain from the kernel versions of c-means algorithm?, IEEE Transactions on Fuzzy Systems 22 (2013) 363-379.
[21] Bezdek, J. C., Pattern Recognition With Fuzzy Objective Function Algorithms (Plenum Press, New York, NY, 1981). · Zbl 0503.68069
[22] Gustafson, D. E. and Kessel, W. C., Fuzzy clustering with a fuzzy covariance matrix, in 1978 IEEE Conference on Decision and Control Including the 17th Symposium on Adaptive Processes (IEEE, 1979), pp. 761-766. · Zbl 0448.62045
[23] Krishnapuram, R. and Keller, J. M., A possibilistic approach to clustering, IEEE Transactions on Fuzzy Systems 1 (1993) 98-110.
[24] Keller, A. and Klawonn, F., Fuzzy clustering with weighting of data variables, International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 8 (2000) 735-746. · Zbl 0986.68943
[25] Zhang, D.-Q. and Chen, S.-C., Kernel-based fuzzy and possibilistic c-means clustering, in Proceedings of the International Conference Artificial Neural Network (2003), Vol. 122, pp. 122-125.
[26] de Carvalho, F. D. A., Tenório, C. P. and Junior, N. L. C., Partitional fuzzy clustering methods based on adaptive quadratic distances, Fuzzy Sets and Systems 157 (2006) 2833-2857. · Zbl 1103.68679
[27] Yang, M.-S. and Lai, C.-Y., A robust automatic merging possibilistic clustering method, IEEE Transactions on Fuzzy Systems 19 (2010) 26-41.
[28] Pimentel, B. A. and de Souza, R. M., A multivariate fuzzy c-means method, Applied Soft Computing 13 (2013) 1592-1607.
[29] Pimentel, B. A. and de Souza, R. M., A weighted multivariate fuzzy c-means method in interval-valued scientific production data, Expert Systems with Applications 41 (2014) 3223-3236.
[30] Pimentel, B. A. and de Souza, R. M., Multivariate fuzzy c-means algorithms with weighting, Neurocomputing 174 (2016) 946-965.
[31] Leski, J. M., Fuzzy c-ordered-means clustering, Fuzzy Sets and Systems 286 (2016) 114-133. · Zbl 06840609
[32] Ding, Y. and Fu, X., Kernel-based fuzzy c-means clustering algorithm based on genetic algorithm, Neurocomputing 188 (2016) 233-238.
[33] Kuo, R., Lin, T., Zulvia, F. E. and Tsai, C., A hybrid metaheuristic and kernel intuitionistic fuzzy c-means algorithm for cluster analysis, Applied Soft Computing 67 (2018) 299-308.
[34] Pimentel, B. A. and de Souza, R. M., A generalized multivariate approach for possibilistic fuzzy c-means clustering, International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 26 (2018) 893-916. · Zbl 1470.62079
[35] Stetco, A., Zeng, X.-J. and Keane, J., Fuzzy c-means++: Fuzzy c-means with effective seeding initialization, Expert Systems with Applications 42 (2015) 7541-7548.
[36] Lei, T., Jia, X., Zhang, Y., He, L., Meng, H. and Nandi, A. K., Significantly fast and robust fuzzy c-means clustering algorithm based on morphological reconstruction and membership filtering, IEEE Transactions on Fuzzy Systems 26 (2018) 3027-3041.
[37] Liu, X.-Y., Fan, J.-C. and Chen, Z.-W., Improved fuzzy c-means algorithm based on density peak, International Journal of Machine Learning and Cybernetics (2019) 1-8.
[38] Zhu, X., Zhang, S., Zhu, Y., Zheng, W. and Yang, Y., Self-weighted multi-view fuzzy clustering, ACM Transactions on Knowledge Discovery from Data (TKDD) 14 (2020) 1-17.
[39] Ahmad, A. and Dey, L., A k-mean clustering algorithm for mixed numeric and categorical data, Data & Knowledge Engineering 63 (2007) 503-527.
[40] Minaei-Bidgoli, B., Asadi, M. and Parvin, H., An ensemble based approach for feature selection, in Engineering Applications of Neural Networks (Springer, 2011), pp. 240-246.
[41] Parvin, H., Minaei-Bidgoli, B. and Alinejad-Rokny, H., A new imbalanced learning and dictions tree method for breast cancer diagnosis, Journal of Bionanoscience 7 (2013) 673-678.
[42] Masoudiasl, I., Vahdat, S., Hessam, S., Shamshirband, S. and Alinejad-Rokny, H., Proposing an integrated method based on fuzzy tuning and ICA techniques to identify the most influencing features in breast cancer, Iranian Red Crescent Medical Journal 21.
[43] Mahmoudi, M. R., Akbarzadeh, H., Parvin, H., Nejatian, S., Rezaie, V. and Alinejad-Rokny, H., Consensus function based on cluster-wise two level clustering, Artificial Intelligence Review (2020) 1-27.
[44] Mokhtari, S. M., Alinejad-Rokny, H. and Jalalifar, H., Selection of the best well control system by using fuzzy multiple-attribute decision-making methods, Journal of Applied Statistics 41 (2014) 1105-1121. · Zbl 1352.93068
[45] Ahmadinia, M., Meybodi, M. R., Esnaashari, M. and Alinejad-Rokny, H., Energy-efficient and multi-stage clustering algorithm in wireless sensor networks using cellular learning automata, IETE Journal of Research 59 (2013) 774-782.
[46] Zhou, J., Lai, Z., Miao, D., Gao, C. and Yue, X., Multigranulation rough-fuzzy clustering based on shadowed sets, Information Sciences 507 (2020) 553-573. · Zbl 1456.62124
[47] Chung, F. and Rhee, H., Uncertain fuzzy clustering: Insights and recommendations, IEEE Computational Intelligence Magazine 2 (2007) 44-56.
[48] Wu, K.-L., Analysis of parameter selections for fuzzy c-means, Pattern Recognition 45 (2012) 407-415. · Zbl 1225.68237
[49] Zhou, K. and Yang, S., Fuzzifier selection in fuzzy c-means from cluster size distribution perspective, Informatica 30 (2019) 613-628.
[50] Diday, E. and Simon, J., Clustering analysis, in Digital Pattern Recognition (Springer, 1976), pp. 47-94. · Zbl 0331.62043
[51] Hubert, L. and Arabie, P., Comparing partitions, Journal of Classification 2 (1985) 193-218. · Zbl 0587.62128
[52] Campello, R. J. and Hruschka, E. R., A fuzzy extension of the silhouette width criterion for cluster analysis, Fuzzy Sets and Systems 157 (2006) 2858-2875. · Zbl 1103.68674
[53] UCI Repository of Machine Learning Databases, University of California, Department of Information and Computer Science, Irvine, CA, https://archive.ics.uci.edu/ml/index.php, 2020, accessed: 2020-06.
[54] Barnes, R., Dhanoa, M. S. and Lister, S. J., Standard normal variate transformation and de-trending of near-infrared diffuse reflectance spectra, Applied Spectroscopy 43 (1989) 772-777.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.