Self-updating clustering algorithm for estimating the parameters in mixtures of von Mises distributions. (English) Zbl 1514.62109

Summary: The EM algorithm is the standard method for estimating the parameters in finite mixture models. M.-S. Yang and J.-A. Pan [Fuzzy Sets Syst. 91, No. 3, 319–326 (1997; Zbl 0921.62077)] proposed a generalized classification maximum likelihood procedure, called the fuzzy \(c\)-directions (FCD) clustering algorithm, for estimating the parameters in mixtures of von Mises distributions. Two main drawbacks of the EM algorithm are its slow convergence and the dependence of the solution on the initial values used. The choice of initial values is of great importance in the literature on such algorithms, as it can heavily influence both the speed of convergence and the ability to locate the global maximum. Moreover, the algorithmic frameworks of EM and FCD are closely related, so FCD shares the drawbacks of the EM algorithm. To resolve these problems, this paper proposes another clustering algorithm, which can self-organize locally optimal cluster numbers without using cluster validity functions. Numerical results clearly indicate that the proposed algorithm is superior in performance to the EM and FCD algorithms. Finally, we apply the proposed algorithm to two real data sets.
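
For orientation, the EM baseline mentioned in the summary can be sketched as follows. This is a minimal illustration of standard EM for a mixture of von Mises distributions, not the self-updating algorithm proposed in the paper. The function names, the `init_mu` parameter, and the cap on \(\kappa\) are assumptions made for this sketch, and the concentration update uses the closed-form approximation of Banerjee et al. [1] (with dimension 2) in place of an exact solution of \(I_1(\kappa)/I_0(\kappa) = \bar{R}\).

```python
import numpy as np


def vm_logpdf(theta, mu, kappa):
    """Log-density of the von Mises distribution VM(mu, kappa); angles in radians."""
    return kappa * np.cos(theta - mu) - np.log(2.0 * np.pi * np.i0(kappa))


def em_von_mises_mixture(theta, n_components, init_mu=None,
                         n_iter=200, tol=1e-8, seed=0):
    """Plain EM for a von Mises mixture (hypothetical helper, illustration only).

    Returns (weights, mu, kappa, log_likelihood).
    """
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta, dtype=float)

    # Initial values: EM is sensitive to these, which is the drawback
    # the summary points out. By default, pick random data points.
    if init_mu is None:
        mu = rng.choice(theta, size=n_components, replace=False)
    else:
        mu = np.asarray(init_mu, dtype=float)
    kappa = np.ones(n_components)
    weights = np.full(n_components, 1.0 / n_components)

    ll = -np.inf
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: posterior membership probabilities, computed in log space.
        log_joint = np.log(weights) + vm_logpdf(theta[:, None], mu[None, :], kappa[None, :])
        log_norm = np.logaddexp.reduce(log_joint, axis=1)
        resp = np.exp(log_joint - log_norm[:, None])

        # M-step: weighted mixing proportions, mean directions, concentrations.
        nk = resp.sum(axis=0) + 1e-12  # small floor avoids division by zero
        weights = nk / nk.sum()
        c = resp.T @ np.cos(theta)
        s = resp.T @ np.sin(theta)
        mu = np.arctan2(s, c)
        rbar = np.clip(np.hypot(c, s) / nk, 1e-6, 1.0 - 1e-6)
        # Approximate concentration update (assumption: Banerjee et al. formula, d = 2),
        # capped at 500 so that np.i0 does not overflow.
        kappa = np.minimum(rbar * (2.0 - rbar**2) / (1.0 - rbar**2), 500.0)

        # Log-likelihood under the parameters used in this E-step.
        ll = log_norm.sum()
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    return weights, mu, kappa, ll
```

Running this sketch from several different `init_mu` values and comparing the returned log-likelihoods is one simple way to observe the dependence on initial values described above.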

MSC:

62H30 Classification and discrimination; cluster analysis (statistical aspects)
62H11 Directional data; spatial statistics
62H86 Multivariate analysis and fuzziness

Citations:

Zbl 0921.62077

Software:

circular; CircStats

References:

[1] Banerjee, A., Dhillon, I. S., Ghosh, J. and Sra, S. 2005. Clustering on the unit hypersphere using von Mises-Fisher distributions. J. Mach. Learn. Res., 6: 1345-1382. · Zbl 1190.62116
[2] Bartels, R. 1984. Estimation in a bidirectional mixture of von Mises distributions. Biometrics, 40: 777-784. · doi:10.2307/2530921
[3] Batschelet, E. 1981. “Circular Statistics in Biology”. London: Academic Press. · Zbl 0524.62104
[4] Chang-Chien, S. J., Hung, W. L. and Yang, M. S. 2012. On mean shift-based clustering for circular data. Soft Comput., 16(6): 1043-1060. · doi:10.1007/s00500-012-0802-z
[5] Chang-Chien, S. J., Yang, M. S. and Hung, W. L. Mean shift-based clustering for directional data. Proceedings of Third International Workshop on Advanced Computational Intelligence. August 25-27, Suzhou, China. pp. 367-372.
[6] Chen, T. L. 2009. “Image segmentation by SUP clustering algorithm”. Washington, DC: Section on Statistical Learning and Data Mining - JSM 2009.
[7] Chen, T. L. and Shiu, S. Y. A new clustering algorithm based on self-updating process. Proceedings of the American Statistical Association, Statistical Computing Section [CD-ROM]. Salt Lake City, Utah.
[8] Dempster, A. P., Laird, N. M. and Rubin, D. B. 1977. Maximum likelihood estimation from incomplete data via the EM algorithm with discussion. J. R. Stat. Soc. B, 39: 1-38. · Zbl 0364.62022
[9] Dortet-Bernadet, J. L. and Wicker, N. 2008. Model-based clustering on the unit sphere with an illustration using gene expression profiles. Biostatistics, 9: 66-80. · Zbl 1274.62761 · doi:10.1093/biostatistics/kxm012
[10] Fisher, N. I. 1993. “Statistical Analysis of Circular Data”. Cambridge: Cambridge University Press. · Zbl 0788.62047 · doi:10.1017/CBO9780511564345
[11] Fisher, N. I., Lewis, T. and Embleton, B. J. J. 1987. “Statistical Analysis of Spherical Data”. Cambridge: Cambridge University Press. · Zbl 0651.62045 · doi:10.1017/CBO9780511623059
[12] Jammalamadaka, S. R. and SenGupta, A. 2001. “Topics in Circular Statistics”. Singapore: World Scientific. · Zbl 1006.62050 · doi:10.1142/4031
[13] Lee, A. 2010. Circular data. WIREs Comput. Stat., 2: 477-486. · doi:10.1002/wics.98
[14] Mardia, K. V. 1972. “Statistics of Directional Data”. London: Academic Press. · Zbl 0244.62005
[15] Mardia, K. V. and Jupp, P. E. 2000. “Directional Statistics”. Chichester: Wiley. · Zbl 0935.62065
[16] McGraw, T., Vemuri, B. C., Yezierski, B. and Mareci, T. 2006. von Mises-Fisher mixture model of the diffusion ODF. Proceedings of the 3rd IEEE International Symposium on Biomedical Imaging: Macro to Nano (ISBI 2006). April 6-9, 2006, Arlington, VA, USA. pp. 65-68.
[17] McLachlan, G. J. and Peel, D. 2000. “Finite Mixture Models”. New York: Wiley. · Zbl 0963.62061 · doi:10.1002/0471721182
[18] von Mises, R. 1918. Über die “Ganzzahligkeit” der Atomgewichte und verwandte Fragen. Physikal. Z., 19: 490-500. · JFM 46.1493.01
[19] Mooney, J. A., Helms, P. J. and Jolliffe, I. T. 2003. Fitting mixtures of von Mises distributions: a case study involving sudden infant death syndrome. Comput. Stat. Data Anal., 41: 505-513. · Zbl 1430.62235 · doi:10.1016/S0167-9473(02)00181-0
[20] Ruspini, E. H. 1969. A new approach to clustering. Inform. Control, 15: 22-32. · Zbl 0192.57101 · doi:10.1016/S0019-9958(69)90591-9
[21] Spurr, B. D. and Koutbeiy, M. A. 1991. A comparison of various methods for estimating the parameters in mixtures of von Mises distributions. Commun. Stat. Simul. Comput., 20: 725-741. · Zbl 0850.62249
[22] Stephens, M. A. 1969. “Techniques for directional data”. In Tech. Report no. 150, Stanford, CA: Department of Statistics, Stanford University.
[23] Thang, N. D., Chen, L. and Chan, C. K. 2008. Feature reduction using mixture model of directional distributions. Proceedings of the 10th International Conference on Control, Automation, Robotics and Vision, ICARCV. December 17-20, 2008, Vietnam. pp. 2208-2212.
[24] Wu, K. L. and Yang, M. S. 2007. Mean shift-based clustering. Pattern Recognit., 40: 3035-3052. · Zbl 1118.68645 · doi:10.1016/j.patcog.2007.02.006
[25] Yang, M. S. and Pan, J. A. 1997. On fuzzy clustering of directional data. Fuzzy Sets Syst., 91: 319-326. · Zbl 0921.62077 · doi:10.1016/S0165-0114(96)00157-1
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases these data have been complemented or enhanced by data from zbMATH Open. The list attempts to reflect the references in the original paper as accurately as possible, without claiming completeness or a perfect matching.