Robustness of density-based clustering methods with various neighborhood relations. (English) Zbl 1185.68555

Summary: Cluster analysis is one of the most important techniques in statistical data analysis. Among clustering methods, density-based methods are particularly valuable because they can recognize clusters of arbitrary shape. This paper examines the robustness of density-based clustering methods that use distance-based neighborhood relations between points. In particular, the DBSCAN (density-based spatial clustering of applications with noise) and FN-DBSCAN (fuzzy neighborhood DBSCAN) algorithms are analyzed. FN-DBSCAN uses a fuzzy neighborhood relation, whereas DBSCAN uses a crisp one. The main characteristic of FN-DBSCAN is that it combines the speed of DBSCAN with the robustness of the NRFJP (noise robust fuzzy joint points) algorithm. It is observed that FN-DBSCAN is more robust than DBSCAN on datasets with various shapes and densities.
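
The difference between a crisp and a fuzzy neighborhood relation can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names, the sample points, and in particular the linear membership function max(0, 1 - d/eps) are assumptions chosen for illustration, and the paper may define membership differently. A crisp neighborhood counts every point within distance eps equally, while a fuzzy neighborhood weights each point by how close it is.

```python
import math

def crisp_cardinality(points, i, eps):
    """DBSCAN-style density: number of points within distance eps of points[i]."""
    return sum(1 for q in points if math.dist(points[i], q) <= eps)

def fuzzy_cardinality(points, i, eps):
    """Fuzzy density: sum of membership degrees of all points in the
    neighborhood of points[i]. The linear membership max(0, 1 - d/eps)
    is an assumption made for this sketch."""
    return sum(max(0.0, 1.0 - math.dist(points[i], q) / eps) for q in points)

# Hypothetical one-dimensional layout embedded in the plane.
points = [(0.0, 0.0), (0.1, 0.0), (0.9, 0.0), (2.0, 0.0)]
crisp = crisp_cardinality(points, 0, eps=1.0)
fuzzy = fuzzy_cardinality(points, 0, eps=1.0)
print(crisp, fuzzy)
```

Under these assumptions, the point at distance 0.9 contributes as much as the point at distance 0.1 to the crisp count, but far less to the fuzzy cardinality, which is one intuition for why a fuzzy neighborhood might be less sensitive to the exact choice of eps.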

MSC:

68T05 Learning and adaptive systems in artificial intelligence
68T10 Pattern recognition, speech recognition

References:

[1] Abraham, T.; Roddick, J. F., Survey of spatio-temporal databases, GeoInformatica, 3, 1, 61-99 (1999)
[2] M. Ankerst, M.M. Breunig, H.-P. Kriegel, J. Sander, OPTICS: ordering points to identify the clustering structure, in: Proc. ACM SIGMOD Internat. Conf. on Management of Data, Philadelphia, PA, 1999, pp. 49-60.
[3] Aoying, Z.; Shuigeng, Z., Approaches for scaling DBSCAN algorithm to large spatial database, Journal of Computer Science and Technology, 15, 6, 509-526 (2000) · Zbl 0970.68583
[4] Bensaid, A. M.; Hall, L. O.; Bezdek, J. C.; Clarke, L. P.; Silbiger, M. L.; Arrington, J. A.; Murtagh, R. F., Validity-guided (re)clustering with applications to image segmentation, IEEE Transactions on Fuzzy Systems, 4, 2, 112-123 (1996)
[5] Birant, D.; Kut, A., ST-DBSCAN: an algorithm for clustering spatial-temporal data, Data & Knowledge Engineering, 60, 208-221 (2007)
[6] Dong, Y.; Zhuang, Y.; Chen, K.; Tai, X., A hierarchical clustering algorithm based on fuzzy graph connectedness, Fuzzy Sets and Systems, 157, 1760-1774 (2006) · Zbl 1100.68105
[7] Duan, L.; Xu, L.; Guo, F.; Lee, J.; Yan, B., A local-density based spatial clustering algorithm with noise, Information Systems, 32, 978-986 (2007)
[8] Dunn, J. C., A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters, Journal of Cybernetics, 3, 3, 32-57 (1973) · Zbl 0291.68033
[9] M. Ester, H.P. Kriegel, J. Sander, X. Xu, A density-based algorithm for discovering clusters in large spatial databases with noise, in: Proc. Second Internat. Conf. on Knowledge Discovery and Data Mining, 1996, pp. 226-231.
[10] S. Guha, R. Rastogi, K. Shim, CURE: an efficient clustering algorithm for large databases, in: Proc. ACM SIGMOD Internat. Conf. on Management of Data, Seattle, WA, 1998, pp. 73-84. · Zbl 1006.68661
[11] Hammah, R. E.; Curran, J. H., On distance measures for the fuzzy K-means algorithm for joint data, Rock Mechanics and Rock Engineering, 32, 1, 1-27 (1999)
[12] J. Han, M. Kamber, Data Mining Concepts and Techniques, Morgan Kaufmann Publishers, San Francisco, CA, 2001, pp. 335-391.
[13] Han, J.; Kamber, M.; Tung, A. K.H., Spatial clustering methods in data mining: a survey, in: Miller, H.; Han, J. (eds.), Geographic Data Mining and Knowledge Discovery, Taylor & Francis, London (2001)
[14] H.-P. Kriegel, K. Kailing, A. Pryakin, M. Schubert, Clustering multi-represented objects with noise, PAKDD, 2004, pp. 394-403.
[15] H.-P. Kriegel, M. Pfeifle, Density-based clustering of uncertain data, in: Proc. 11th ACM SIGKDD Internat. Conf. on Knowledge Discovery in Data Mining, 2005, pp. 672-677.
[16] E.N. Nasibov, An alternative fuzzy-hierarchical approach to cluster analysis, in: Proc. Seventh Internat. Conf. on Application of Fuzzy Systems and Soft Computing, Siegen, Germany, 2006, pp. 113-123.
[17] Nasibov, E. N., A robust algorithm for fuzzy clustering problem on the base of fuzzy joint points method, Cybernetics and Systems Analysis, 44, 1 (2008) · Zbl 1181.68253
[18] Nasibov, E. N.; Ulutagay, G., A new unsupervised approach for fuzzy clustering, Fuzzy Sets and Systems, 158, 2118-2133 (2007) · Zbl 1416.62373
[19] Nasibov, E. N.; Ulutagay, G., A new approach to clustering problem using the fuzzy joint points method, Automatic Control and Computer Sciences, 39, 6, 8-17 (2005)
[20] Nasibov, E. N.; Ulutagay, G., On the fuzzy joint points method for fuzzy clustering problem, Automatic Control and Computer Sciences, 40, 5, 33-44 (2006)
[21] Pal, N. R.; Bezdek, J. C., On cluster validity for the fuzzy c-means model, IEEE Transactions on Fuzzy Systems, 3, 3, 370-379 (1995)
[22] Pedrycz, W.; Gomide, F., An Introduction to Fuzzy Sets (1998), Massachusetts Institute
[23] M. Sadaaki, Y. Endo, S. Hayakawa, E. Kataoka, Classification and clustering of information objects based on fuzzy neighborhood system, in: IEEE Internat. Conf. on Systems, Man and Cybernetics, Hawaii, 2005, pp. 3210-3215.
[24] Samet, H., The Design and Analysis of Spatial Data Structures (1990), Addison-Wesley, Reading, MA
[25] Sander, J.; Ester, M.; Kriegel, H. P.; Xu, X., Density-based clustering in spatial databases: the algorithm GDBSCAN and its applications, Data Mining and Knowledge Discovery, 2, 169-194 (1998)
[26] Velthuizen, R. P.; Hall, L. O.; Clarke, L. P.; Silbiger, M. L., An investigation of mountain method clustering for large data sets, Pattern Recognition, 30, 7, 1121-1135 (1997)
[27] Yager, R. R.; Filev, D. P., Approximate clustering via the mountain method, IEEE Transactions on Systems, Man and Cybernetics, 24, 8, 1279-1284 (1994)
[28] Zahid, N.; Abouelala, O.; Limouri, M.; Essaid, A., Fuzzy clustering based on K-nearest-neighbours rule, Fuzzy Sets and Systems, 120, 239-247 (2001) · Zbl 0981.62057
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases, these data have been complemented or enhanced by data from zbMATH Open. The list attempts to reflect the references in the original paper as accurately as possible, without claiming completeness or a perfect matching.