×

A classification EM algorithm for clustering and two stochastic versions. (English) Zbl 0937.62605

Summary: Setting the optimization-based clustering methods under the classification maximum likelihood approach, we define and study a general Classification EM algorithm. Then, we derive from this algorithm two stochastic algorithms, incorporating random perturbations, to reduce the initial-position dependence of the classical optimization clustering algorithms. Numerical experiments, reported for the variance criterion, show that both stochastic algorithms perform well compared with the standard \(k\)-means algorithm which is a particular version of the Classification EM algorithm.

MSC:

62H30 Classification and discrimination; cluster analysis (statistical aspects)
Full Text: DOI

References:

[1] Arthanary, T. S.; Dodge, Y.: Mathematical programming in statistics. (1981) · Zbl 0549.62002
[2] Bezdek, J. C.; Hathaway, R. J.; Howard, R. E.; Wilson, C. A.; Windham, M. P.: Local convergence analysis of a grouped variable version of coordinate descent. Journal of optimization theory and application 54, 471-477 (1987) · Zbl 0597.90076
[3] Bryant, P.: Large-sample results for optimization based clustering methods. Journal of classification 8, 31-44 (1991) · Zbl 0747.62057
[4] Bryant, P.; Williamson, J. A.: Asymptotic behaviour of classification maximum likelihood estimates. Biometrika 65, 273-281 (1978) · Zbl 0393.62011
[5] Bryant, P.; Williamson, J. P.: Maximum likelihood and classification: A comparison of three approaches. Classification as a tool of research, 33-45 (1986) · Zbl 0587.62103
[6] Celeux, G.: Classification et modèles. Revue de statistique appliquée 36, No. 4, 43-58 (1988) · Zbl 0972.62527
[7] Celeux, G.; Diebolt, J.: The SEM algorithm: A probabilistic teacher algorithm derived from the EM algorithm for the mixture problem. Computational statistics quarterly 2, 73-82 (1985)
[8] Celeux, G.; Diebolt, J.: Comportement asymptotique d’un algorithme d’apprentissage probabiliste pour LES mélanges de lois de probabilité. Rapport de recherche INRIA (1986) · Zbl 0607.62037
[9] Celeux, G.; Diebolt, J.: Une version de type recuit simulé de l’algorithme EM. C. R. Acad. sci. Série 1 310, 119-124 (1990) · Zbl 0693.62035
[10] Celeux, G.; Govaert, G.: Clustering criteria for discrete data and latent class, models. Journal of classification 8, 157-176 (1991) · Zbl 0775.62150
[11] Dempster, A. P.; Laird, N. M.; Rubin, D. B.: Maximum likelihood from incomplete data via the EM algorithm (with discussion). Journal of the royal statistical society serie B 39, 1-38 (1977) · Zbl 0364.62022
[12] Ganesalingam, S.: Classification and mixture approach to clustering via maximum likelihood. Applied statistics 38, 455-466 (1989) · Zbl 0707.62121
[13] Hathaway, R. J.: Another interpretation of the EM algorithm for mixture distributions. Statistics & probability letters 4, 53-56 (1986) · Zbl 0585.62052
[14] Jain, A. K.; Dubes, R. C.: Algorithms for clustering data. (1988) · Zbl 0665.62061
[15] Klein, R. W.; Dubes, R. C.: Experiments in projection and clustering by simulated annealing. Pattern recognition 22, 213-220 (1989) · Zbl 0709.62613
[16] Marriott, F. H. C.: Separating mixtures of normal distributions. Biometrics 31, 767-769 (1975) · Zbl 0308.62050
[17] Redner, R. A.; Walker, H. F.: Mixtures densities, maximum likelihood and the EM algorithm. SIAM review 26, 195-239 (1984) · Zbl 0536.62021
[18] Sandor, G.; Lechevallier, Y.: Découpage optimal de variables quantitatives et application à la définition d’une grille de diagnostic tirée de l’études des protéines sériques. First international symposium on data analysis and informatics, 665-674 (1977)
[19] Scott, A. J.; Symons, M. J.: Clustering methods based on likelihood ratio criteria. Biometrics 27, 387-397 (1971)
[20] Späth, H.: Cluster dissection and analysis. (1985) · Zbl 0584.62094
[21] Symons, M. J.: Clustering criteria and multivariate normal mixture. Biometrics 37, 35-43 (1981) · Zbl 0473.62048
[22] Titterington, D. M.; Smith, A. F. M.; Makov, U. E.: Statistical analysis of finite mixture distribution. (1985) · Zbl 0646.62013
[23] Van Laarhoven, P. J. M.; Aarts, E. H. L.: Simulated annealing: theory and applications. (1987) · Zbl 0643.65028
[24] Windham, M. P.: Parameter modification for clustering criteria. Journal of classification 4, 191-214 (1987) · Zbl 0662.62066
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.