×

Improved initialisation of model-based clustering using Gaussian hierarchical partitions. (English) Zbl 1414.62272

Summary: Initialisation of the EM algorithm in model-based clustering is often crucial. Various starting points in the parameter space often lead to different local maxima of the likelihood function and, so to different clustering partitions. Among the several approaches available in the literature, model-based agglomerative hierarchical clustering is used to provide initial partitions in the popular mclust R package. This choice is computationally convenient and often yields good clustering partitions. However, in certain circumstances, poor initial partitions may cause the EM algorithm to converge to a local maximum of the likelihood function. We propose several simple and fast refinements based on data transformations and illustrate them through data examples.

MSC:

62H30 Classification and discrimination; cluster analysis (statistical aspects)

Software:

Mixmod; clusfind; PGMM; mclust; Flury; R

References:

[1] Auder B, Lebret R, Lovleff S, Langrognet F (2014) Rmixmod: an interface for MIXMOD. http://CRAN.R-project.org/package=Rmixmod, R package version 2.0.2
[2] Banfield J, Raftery AE (1993) Model-based Gaussian and non-Gaussian clustering. Biometrics 49:803-821 · Zbl 0794.62034 · doi:10.2307/2532201
[3] Biernacki C, Celeux G, Govaert G (2000) Assessing a mixture model for clustering with the integrated completed likelihood. IEEE Trans Pattern Anal Mach Intell 22(7):719-725 · doi:10.1109/34.865189
[4] Biernacki C, Celeux G, Govaert G (2003) Choosing starting values for the EM algorithm for getting the highest likelihood in multivariate Gaussian mixture models. Comput Stat Data Anal 41(3):561-575 · Zbl 1429.62235 · doi:10.1016/S0167-9473(02)00163-9
[5] Biernacki C, Celeux G, Govaert G, Langrognet F (2006) Model-based cluster and discriminant analysis with the MIXMOD software. Comput Stat Data Anal 51:587-600 · Zbl 1157.62431 · doi:10.1016/j.csda.2005.12.015
[6] Celeux G, Govaert G (1995) Gaussian parsimonious clustering models. Pattern Recognit 28:781-793 · doi:10.1016/0031-3203(94)00125-6
[7] Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm (with discussion). J R Stat Soc Series B Stat Methodol 39:1-38 · Zbl 0364.62022
[8] Everitt B, Landau S, Leese M, Stahl D (2011) Cluster analysis, 5th edn. Wiley, Chichester, UK · Zbl 1274.62003 · doi:10.1002/9780470977811
[9] Flury B (1997) A first course in multivariate statistics. Springer, New York · Zbl 0879.62052 · doi:10.1007/978-1-4757-2765-4
[10] Forina M, Armanino C, Castino M, Ubigli M (1986) Multivariate data analysis as a discriminating method of the origin of wines. Vitis 25:189-201
[11] Fraley C (1998) Algorithms for model-based Gaussian hierarchical clustering. SIAM J Sci Compu 20(1):270-281 · Zbl 0911.62052 · doi:10.1137/S1064827596311451
[12] Fraley C, Raftery AE (1998) How many clusters? Which clustering method? Answers via model-based cluster analysis. Comput J 41:578-588 · Zbl 0920.68038 · doi:10.1093/comjnl/41.8.578
[13] Fraley C, Raftery AE (2002) Model-based clustering, discriminant analysis, and density estimation. J Am Stat Assoc 97(458):611-631 · Zbl 1073.62545 · doi:10.1198/016214502760047131
[14] Fraley C, Raftery AE, Murphy TB, Scrucca L (2012) MCLUST version 4 for R: normal mixture modeling for model-based clustering, classification, and density estimation. Technical Report 597, Department of Statistics, University of Washington · Zbl 1520.62002
[15] Fraley C, Raftery AE, Scrucca L (2015) mclust: normal mixture modelling for model-based clustering, classification, and density estimation. http://CRAN.R-project.org/package=mclust, R package version 5.0.1 · Zbl 1520.62002
[16] Gordon AD (1999) Classification, 2nd edn. Chapman & Hall/CRC · Zbl 0929.62068
[17] Hubert L, Arabie P (1985) Comparing partitions. J Classif 2:193-218 · doi:10.1007/BF01908075
[18] Jain AK, Dubes RC (1988) Algorithms for clustering data. Prentice-Hall, Inc · Zbl 0665.62061
[19] Kaufman L, Rousseeuw PJ (1990) Finding groups in data: an introduction to cluster analysis. Wiley, UK · Zbl 1345.62009
[20] Maitra R (2009) Initializing partition-optimization algorithms. IEEE/ACM Trans Comput Biol Bioinform 6(1):144-157 · doi:10.1109/TCBB.2007.70244
[21] McLachlan G, Krishnan T (2008) The EM algorithm and extensions, 2nd edn. Wiley-Interscience, Hoboken, New Jersey · Zbl 1165.62019 · doi:10.1002/9780470191613
[22] McLachlan G, Peel D (2000) Finite mixture models. Wiley, New York · Zbl 0963.62061 · doi:10.1002/0471721182
[23] McLachlan GJ (1988) On the choice of starting values for the EM algorithm in fitting mixture models. Statistician 37(4/5):417 · doi:10.2307/2348768
[24] McNicholas PD, ElSherbiny A, McDaid AF, Murphy TB (2015) pgmm: Parsimonious Gaussian Mixture Models. http://CRAN.R-project.org/package=pgmm, R package version 1.2
[25] Melnykov V, Maitra R (2010) Finite mixture models and model-based clustering. Stat Surv 4:80-116 · Zbl 1190.62121 · doi:10.1214/09-SS053
[26] Melnykov V, Melnykov I (2012) Initializing the EM algorithm in Gaussian mixture models with an unknown number of components. Comput Stat Data Anal 56(6):1381-1395 · Zbl 1246.65025 · doi:10.1016/j.csda.2011.11.002
[27] Milligan GW, Cooper MC (1986) A study of the comparability of external criteria for hierarchical cluster analysis. Multivar Behav Res 21(4):441-458 · doi:10.1207/s15327906mbr2104_5
[28] Raftery AE, Dean N (2006) Variable selection for model-based clustering. J Am Stat Assoc 101(473):168-178 · Zbl 1118.62339 · doi:10.1198/016214506000000113
[29] Schwartz G (1978) Estimating the dimension of a model. Ann Stat 6:31-38
[30] Wu CJ (1983) On the convergence properties of the EM algorithm. Ann Stat 11(1):95-103 · Zbl 0517.62035 · doi:10.1214/aos/1176346060
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.