Clustering of the self-organizing map using a clustering validity index based on inter-cluster and intra-cluster density. (English) Zbl 1059.68120

Summary: The Self-Organizing Map (SOM) has been widely used in many industrial applications. Classical clustering methods based on the SOM often fail to deliver satisfactory results, especially when clusters have arbitrary shapes. In this paper, using preprocessing techniques that filter out noise and outliers, we propose a new two-level SOM-based clustering algorithm that uses a clustering validity index based on inter-cluster and intra-cluster density. Experimental results on synthetic and real data sets demonstrate that the proposed algorithm clusters data better than the classical SOM-based clustering algorithms and finds an optimal number of clusters.
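
The two-level idea named in the summary (train a SOM first, then cluster its prototype vectors instead of the raw data) can be illustrated roughly as follows. This is only a minimal sketch under stated assumptions, not the authors' algorithm: the function names `train_som`, `single_linkage` and `two_level_som_clustering` are hypothetical, dropping map units that win no data points is a crude stand-in for the paper's noise and outlier preprocessing, single-linkage merging is a swapped-in second-level method, and the number of clusters `k` is fixed by hand rather than chosen by the density-based validity index, which the summary does not specify.

```python
import numpy as np

def train_som(data, rows=6, cols=6, epochs=20, lr0=0.5, sigma0=2.0, seed=0):
    """Level 1: fit a small rectangular SOM with online updates (illustrative defaults)."""
    rng = np.random.default_rng(seed)
    # Grid coordinates of the map units, used by the neighborhood function.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Initialize prototypes from randomly chosen data points.
    weights = data[rng.choice(len(data), rows * cols, replace=True)].astype(float)
    total, step = epochs * len(data), 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / total
            lr = lr0 * (1.0 - frac)                 # learning rate decays linearly
            sigma = sigma0 * (1.0 - frac) + 0.5     # neighborhood radius shrinks
            bmu = np.argmin(((weights - x) ** 2).sum(axis=1))  # best-matching unit
            grid_d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
            h = np.exp(-grid_d2 / (2.0 * sigma ** 2))          # Gaussian neighborhood
            weights += lr * h[:, None] * (x - weights)
            step += 1
    return weights

def single_linkage(points, k):
    """Level 2 (assumed method): merge the two closest clusters until k remain."""
    labels = np.arange(len(points))
    dist = np.linalg.norm(points[:, None] - points[None, :], axis=2)
    while len(np.unique(labels)) > k:
        different = labels[:, None] != labels[None, :]
        masked = np.where(different, dist, np.inf)
        i, j = np.unravel_index(np.argmin(masked), masked.shape)
        labels[labels == labels[j]] = labels[i]
    return labels

def two_level_som_clustering(data, k, **som_kwargs):
    weights = train_som(data, **som_kwargs)
    # Best-matching unit of every data point.
    d2 = ((data[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    bmus = np.argmin(d2, axis=1)
    # Assumption: units that win no data are treated as noise and dropped.
    hits = np.bincount(bmus, minlength=len(weights))
    active = np.flatnonzero(hits > 0)
    proto_labels = single_linkage(weights[active], k)
    # Each data point inherits the cluster label of its best-matching unit.
    lookup = np.full(len(weights), -1)
    lookup[active] = proto_labels
    return lookup[bmus]

# Synthetic demo: two well-separated Gaussian blobs.
rng = np.random.default_rng(1)
data = np.vstack([rng.normal(0.0, 0.5, size=(100, 2)),
                  rng.normal(10.0, 0.5, size=(100, 2))])
labels = two_level_som_clustering(data, k=2)
```

In a fuller version, `k` would be swept over a range and the partition with the best validity index kept; that selection step is deliberately left out here because the index itself is not defined in this record.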

MSC:

68T10 Pattern recognition, speech recognition

Software:

UCI-ml
Full Text: DOI

References:

[1] Kohonen, T., Self-organized formation of topologically correct feature maps, Biol. Cybern., 43, 59-69 (1982) · Zbl 0466.92002
[2] Kohonen, T., Self-Organizing Maps (1997), Springer: Springer Berlin, Germany · Zbl 0866.68085
[3] Gray, R. M., Vector quantization, IEEE Acoust., Speech, Signal Process. Mag., 1, 2, 4-29 (1984)
[4] Pal, N. R.; Bezdek, J. C.; Tsao, E. C.-K., Generalized clustering networks and Kohonen’s self-organizing scheme, IEEE Trans. Neural Networks, 4, 4, 549-557 (1993)
[5] Jain, A. K.; Murty, M. N.; Flynn, P. J., Data clustering: a review, ACM Comput. Surveys, 31, 3, 264-323 (1999)
[6] Han, J.; Kamber, M., Data mining: concepts and techniques (2000), Morgan Kaufmann: Morgan Kaufmann San Francisco
[7] Huntsberger, T.; Ajjimarangsee, P., Parallel self-organizing feature maps for unsupervised pattern recognition, Int. J. Gen. Systems, 16, 357-372 (1989)
[8] Mao, J.; Jain, A. K., A self-organizing network for hyperellipsoidal clustering (HEC), IEEE Trans. Neural Networks, 7, 1, 16-29 (1996)
[9] Lampinen, J.; Oja, E., Clustering properties of hierarchical self-organizing maps, J. Math. Imag. Vis., 2, 2-3, 261-272 (1992) · Zbl 0790.92002
[10] Murtagh, F., Interpreting the Kohonen self-organizing feature map using contiguity-constrained clustering, Pattern Recognition Lett., 16, 399-408 (1995)
[11] Kiang, M. Y., Extending the Kohonen self-organizing map networks for clustering analysis, Comput. Stat. Data Anal., 38, 161-180 (2001) · Zbl 1095.62509
[12] Vesanto, J.; Alhoniemi, E., Clustering of the self-organizing map, IEEE Trans. Neural Networks, 11, 3, 586-600 (2000)
[13] Lipson, H.; Siegelmann, H. T., Clustering irregular shapes using high-order neurons, Neural Computation, 12, 10, 2331-2353 (2000)
[14] M. Halkidi, M. Vazirgiannis, Clustering validity assessment using multi representatives, Proceedings of SETN Conference, Thessaloniki, Greece, April 2002. · Zbl 1009.68665
[15] A. Ultsch, H.P. Siemon, Kohonen’s self organizing feature maps for exploratory data analysis, Proceedings of the International Neural Network Conference, Dordrecht, Netherlands, 1990, pp. 305-308.
[16] X. Zhang, Y. Li, Self-organizing map as a new method for clustering and data analysis, Proceedings of the International Joint Conference on Neural Networks, Nagoya, Japan, 1993, pp. 2448-2451.
[17] S. Guha, R. Rastogi, K. Shim, CURE: an efficient clustering algorithm for large databases, Proceedings of ACM SIGMOD International Conference on Management of Data, New York, 1998, pp. 73-84. · Zbl 1006.68661
[18] Karypis, G.; Han, E.-H.; Kumar, V., Chameleon: hierarchical clustering using dynamic modeling, IEEE Comput., 32, 8, 68-74 (1999)
[19] Theodoridis, S.; Koutroumbas, K., Pattern Recognition (1999), Academic Press: Academic Press New York
[20] Dunn, J. C., Well separated clusters and optimal fuzzy partitions, J. Cybern., 4, 95-104 (1974) · Zbl 0304.68093
[21] Xie, X. L.; Beni, G., A validity measure for fuzzy clustering, IEEE Trans. Pattern Anal. Mach. Intell., 13, 8, 841-847 (1991)
[22] Milligan, G. W.; Soon, S. C.; Sokol, L. M., The effect of cluster size, dimensionality and number of clusters on recovery of true cluster structure, IEEE Trans. Pattern Anal. Mach. Intell., 5, 40-47 (1983)
[23] Dave, R. N., Validating fuzzy partitions obtained through c-shell clustering, Pattern Recognition Lett., 17, 613-623 (1996)
[24] Fisher, R. A., The use of multiple measurements in taxonomic problems, Ann. Eugenics, 7, Part II, 179-188 (1936)
[25] Bezdek, J. C.; Pal, N. R., Some new indexes of cluster validity, IEEE Trans. System, Man, and Cybern., 28, 3, 301-315 (1998)
[26] C.L. Blake, C.J. Merz, UCI repository of machine learning databases, (http://www.ics.uci.edu/~mlearn/MLRepository.html), Department of Information and Computer Science, University of California at Irvine, CA, 1998.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases these data have been complemented or enhanced by data from zbMATH Open. The list attempts to reflect the references in the original paper as accurately as possible, without claiming completeness or a perfect matching.