
Robust clustering by deterministic agglomeration EM of mixtures of multivariate \(t\)-distributions. (English) Zbl 1005.68051

Summary: This paper presents new robust clustering algorithms that are significantly less sensitive to noise and initialization than traditional mixture decomposition algorithms, and that simplify determining the optimal number of clusters in the data set. The algorithms implement maximum likelihood mixture decomposition of multivariate \(t\)-distributions, a robust parametric extension of Gaussian mixture decomposition. We achieve improved convergence relative to the Expectation-Maximization (EM) approach by deriving Deterministic Annealing EM (DAEM) algorithms for this mixture model and turning them into agglomerative algorithms (passing through a monotonically decreasing number of components), an approach we term deterministic agglomeration EM. Two versions are derived, based on two variants of DAEM for mixture models. Simulation studies demonstrate the algorithms’ performance for mixtures with isotropic and non-isotropic covariances in two and ten dimensions, with known or unknown levels of outlier contamination.
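The two ingredients the summary describes can be sketched in code: a DAEM E-step that tempers the responsibilities with an inverse temperature \(\beta\) (annealed from below 1 up to 1), and a t-mixture M-step whose latent scale weights down-weight outlying points. This is a minimal illustration under stated simplifications, not the paper's algorithm: the degrees of freedom `nu` are held fixed rather than estimated, the annealing schedule is arbitrary, and the agglomeration step (removing redundant components as the temperature drops) is omitted.

```python
import numpy as np
from scipy.special import gammaln


def t_logpdf(X, mu, Sigma, nu):
    """Log-density of a d-variate t-distribution with location mu,
    scale matrix Sigma, and nu degrees of freedom."""
    d = X.shape[1]
    L = np.linalg.cholesky(Sigma)
    z = np.linalg.solve(L, (X - mu).T)            # whitened residuals, (d, n)
    maha = np.sum(z ** 2, axis=0)                 # squared Mahalanobis distances
    logdet = 2.0 * np.sum(np.log(np.diag(L)))
    return (gammaln((nu + d) / 2.0) - gammaln(nu / 2.0)
            - 0.5 * d * np.log(nu * np.pi) - 0.5 * logdet
            - 0.5 * (nu + d) * np.log1p(maha / nu))


def daem_t_mixture(X, K, nu=4.0, betas=(0.5, 0.75, 1.0), iters=30, seed=0):
    """Simplified DAEM for a K-component t-mixture (fixed nu, no agglomeration)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    mu = X[rng.choice(n, K, replace=False)].astype(float)
    Sigma = np.array([np.cov(X.T) + 1e-6 * np.eye(d) for _ in range(K)])
    pi = np.full(K, 1.0 / K)
    for beta in betas:                            # anneal beta toward 1 (standard EM)
        for _ in range(iters):
            # Tempered E-step: responsibilities proportional to (pi_k p_k)^beta.
            logp = np.stack([np.log(pi[k]) + t_logpdf(X, mu[k], Sigma[k], nu)
                             for k in range(K)])  # shape (K, n)
            logp *= beta
            logp -= logp.max(axis=0)              # stabilize before exponentiating
            r = np.exp(logp)
            r /= r.sum(axis=0)
            # M-step: the latent scale weights u shrink the influence of outliers.
            for k in range(K):
                z = np.linalg.solve(np.linalg.cholesky(Sigma[k]), (X - mu[k]).T)
                maha = np.sum(z ** 2, axis=0)
                u = (nu + d) / (nu + maha)        # small u for atypical points
                w = r[k] * u
                mu[k] = w @ X / w.sum()
                diff = X - mu[k]
                Sigma[k] = (diff.T * w) @ diff / r[k].sum() + 1e-6 * np.eye(d)
            pi = r.sum(axis=1) / n
    return pi, mu, Sigma
```

Early iterations at small `beta` flatten the responsibilities, which is what gives DAEM its reduced sensitivity to initialization; the heavy tails of the \(t\)-model (via the weights `u`) provide the robustness to outlier contamination.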

MSC:

68P10 Searching and sorting
68W05 Nonnumerical algorithms
Full Text: DOI
