Clustering directional data through depth functions. (English) Zbl 07734167

Summary: A new depth-based clustering procedure for directional data is proposed. The method is fully non-parametric, flexible, and applicable even in high dimensions when a suitable notion of depth is adopted. The technique is evaluated through an extensive simulation study. In addition, a real-data example in text mining illustrates its effectiveness in comparison with existing directional clustering algorithms.

MSC:

62-08 Computational methods for problems pertaining to statistics

References:

[1] Ackermann, H., A note on circular nonparametrical classification, Biom J, 39, 5, 577-587 (1997) · Zbl 0882.62052 · doi:10.1002/bimj.4710390506
[2] Agostinelli, C.; Romanazzi, M., Nonparametric analysis of directional data based on data depth, Environ Ecol Stat, 20, 2, 253-270 (2013) · doi:10.1007/s10651-012-0218-z
[3] Arthur D, Vassilvitskii S (2007) \(k\)-means++: the advantages of careful seeding. In: Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms, Society for Industrial and Applied Mathematics, pp 1027-1035 · Zbl 1302.68273
[4] Banerjee A, Dhillon I, Ghosh J, Sra S (2003) Generative model-based clustering of directional data. In: Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining, pp 19-28
[5] Banerjee, A.; Dhillon, IS; Ghosh, J.; Sra, S., Clustering on the unit hypersphere using von Mises-Fisher distributions, J Mach Learn Res, 6, 1345-1382 (2005) · Zbl 1190.62116
[6] Benjamin BMJ, Hussain I, Yang MS (2019) Possibilistic c-means clustering on directional data. In: 2019 12th international congress on image and signal processing, biomedical engineering and informatics (CISP-BMEI), pp 1-6 · doi:10.1109/CISP-BMEI48845.2019.8965703
[7] Buttarazzi, D.; Pandolfo, G.; Porzio, GC, A boxplot for circular data, Biometrics, 74, 4, 1492-1501 (2018) · doi:10.1111/biom.12889
[8] D’Ambrosio A (2021) ConsRankClass: classification and clustering of preference rankings. R package version 101 https://CRAN.R-project.org/package=ConsRankClass
[9] D’Ambrosio, A.; Amodio, S.; Iorio, C.; Pandolfo, G.; Siciliano, R., Adjusted concordance index: an extension of the adjusted rand index to fuzzy partitions, J Classif, 38, 1, 112-128 (2021) · Zbl 07370655 · doi:10.1007/s00357-020-09367-0
[10] Demni H, Porzio GC (2021) Directional DD-classifiers under non-rotational symmetry. In: 2021 IEEE international conference on multisensor fusion and integration for intelligent systems (MFI), pp 1-6 · doi:10.1109/MFI52462.2021.9591189
[11] Demni, H.; Messaoud, A.; Porzio, GC, The cosine depth distribution classifier for directional data, Applications in statistical computing, 49-60 (2019), Cham: Springer, Cham · doi:10.1007/978-3-030-25147-5_4
[12] Dhillon, IS; Modha, DS, Concept decompositions for large sparse text data using clustering, Mach Learn, 42, 1, 143-175 (2001) · Zbl 0970.68167 · doi:10.1023/A:1007612920971
[13] Dhillon, IS; Modha, DS, Concept decompositions for large sparse text data using clustering, Mach Learn, 42, 1-2, 143-175 (2001) · Zbl 0970.68167 · doi:10.1023/A:1007612920971
[14] Dhillon, IS; Sra, S., Modeling data using directional distributions (2003), Citeseer: Tech. rep, Citeseer
[15] Di Marzio, M.; Fensore, S.; Panzera, A.; Taylor, CC, Kernel density classification for spherical data, Stat Probab Lett, 144, 23-29 (2019) · Zbl 1407.62184 · doi:10.1016/j.spl.2018.07.018
[16] Ding Y, Dang X, Peng H, Wilkins D (2007) Robust clustering in high dimensional data using statistical depths. In: BMC bioinformatics, BioMed Central
[17] Fernandes, K.; Cardoso, JS, Discriminative directional classifiers, Neurocomputing, 207, 141-149 (2016) · doi:10.1016/j.neucom.2016.03.076
[18] Franke, J.; Redenbach, C.; Zhang, N., On a mixture model for directional data on the sphere, Scand J Stat, 43, 1, 139-155 (2016) · Zbl 1364.62123 · doi:10.1111/sjos.12169
[19] Hoberg, R., Cluster analysis based on data depth, Data analysis, classification, and related methods, 17-22 (2000), Cham: Springer, Cham · doi:10.1007/978-3-642-59789-3_2
[20] Hornik, K.; Grün, B., movMF: an R package for fitting mixtures of von Mises-Fisher distributions, J Stat Softw, 58, 10, 1-31 (2014) · doi:10.18637/jss.v058.i10
[21] Hornik K, Feinerer I, Kober M, Buchta C (2017) skmeans: spherical k-means clustering. R package version 02-11 https://CRAN.R-project.org/package=skmeans
[22] Hubert, L.; Arabie, P., Comparing partitions, J Classif, 2, 1, 193-218 (1985) · Zbl 0587.62128 · doi:10.1007/BF01908075
[23] Hüllermeier, E.; Rifqi, M.; Henzgen, S.; Senge, R., Comparing fuzzy partitions: a generalization of the Rand index and related measures, Fuzzy Syst IEEE Trans, 20, 3, 546-556 (2012) · doi:10.1109/TFUZZ.2011.2179303
[24] Jeong MH, Cai Y, Sullivan CJ, Wang S (2016) Data depth based clustering analysis. In: Proceedings of the 24th ACM SIGSPATIAL international conference on advances in geographic information systems. ACM
[25] Jörnsten, R., Clustering and classification based on the \({L}_1\) data depth, J Multivar Anal, 90, 1, 67-89 (2004) · Zbl 1047.62064 · doi:10.1016/j.jmva.2004.02.013
[26] Kaufman, L.; Rousseeuw, P.; Dodge, Y., Clustering by means of medoids, Statistical data analysis based on the \(L_1\)-norm and related methods, 405-416 (1987), Amsterdam: North-Holland Publishing Co., Amsterdam
[27] Kesemen, O.; Tezel, Ö.; Özkul, E., Fuzzy c-means clustering algorithm for directional data (fcm4dd), Expert Syst Appl, 58, 76-82 (2016) · doi:10.1016/j.eswa.2016.03.034
[28] Ley, C.; Verdebout, T., Modern directional statistics (2017), Florida: Chapman and Hall/CRC, Florida · Zbl 1448.62005 · doi:10.1201/9781315119472
[29] Ley, C.; Verdebout, T., Applied directional statistics: modern methods and case studies (2018), Florida: CRC Press, Florida · Zbl 1397.62004 · doi:10.1201/9781315228570
[30] Ley, C.; Sabbah, C.; Verdebout, T., A new concept of quantiles for directional data and the angular Mahalanobis depth, Electron J Stat, 8, 1, 795-816 (2014) · Zbl 1349.62197 · doi:10.1214/14-EJS904
[31] Liu, R.; Singh, K., Ordering directional data: concepts of data depth on circles and spheres, Ann Stat, 20, 3, 1468-1484 (1992) · Zbl 0766.62027
[32] López-Cruz, PL; Bielza, C.; Larranaga, P., Directional naive Bayes classifiers, Pattern Anal Appl, 18, 2, 225-246 (2015) · Zbl 1428.62283 · doi:10.1007/s10044-013-0340-z
[33] Mardia, KV; Jupp, P., Directional statistics (2000), Chichester: Wiley, Chichester · Zbl 0935.62065
[34] Pandolfo G (2022) The GLD-plot: a depth-based graphical tool to investigate unimodality of directional data. J Stat Comput Simul 1-14 · Zbl 07551359
[35] Pandolfo, G.; D’Ambrosio, A., Depth-based classification of directional data, Expert Syst Appl, 169, 114433 (2021) · doi:10.1016/j.eswa.2020.114433
[36] Pandolfo, G.; Paindaveine, D.; Porzio, GC, Distance-based depths for directional data, Can J Stat, 46, 4, 593-609 (2018) · Zbl 1492.62095 · doi:10.1002/cjs.11479
[37] Romanazzi, M., Data depth, random simplices and multivariate dispersion, Stat Probab Lett, 79, 12, 1473-1479 (2009) · Zbl 1165.62040 · doi:10.1016/j.spl.2009.03.022
[38] Rousseeuw, PJ, Silhouettes: a graphical aid to the interpretation and validation of cluster analysis, J Comput Appl Math, 20, 53-65 (1987) · Zbl 0636.62059 · doi:10.1016/0377-0427(87)90125-7
[39] SenGupta, A.; Roy, S., A simple classification rule for directional data, Advances in ranking and selection, multiple comparisons, and reliability, 81-90 (2005), Cham: Springer, Cham · doi:10.1007/0-8176-4422-9_5
[40] Taghia, J.; Ma, Z.; Leijon, A., Bayesian estimation of the von-Mises Fisher mixture model with variational inference, IEEE Trans Pattern Anal Mach Intell, 36, 9, 1701-1715 (2014) · doi:10.1109/TPAMI.2014.2306426
[41] Torrente, A.; Romo, J., Initializing k-means clustering by bootstrap and data depth, J Classif, 38, 2, 232-256 (2021) · Zbl 07413946 · doi:10.1007/s00357-020-09372-3
[42] Tsagris, M.; Alenazi, A., Comparison of discriminant analysis methods on the sphere, Commun Stat Case Stud Data Anal Appl, 5, 4, 467-491 (2019)
[43] Yang, MS; Pan, JA, On fuzzy clustering of directional data, Fuzzy Sets Syst, 91, 3, 319-326 (1997) · Zbl 0921.62077 · doi:10.1016/S0165-0114(96)00157-1
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.