The robust EM-type algorithms for log-concave mixtures of regression models. (English) Zbl 1464.62092

Summary: Finite mixture of regression (FMR) models can be reformulated as incomplete-data problems and estimated via the expectation-maximization (EM) algorithm. The main drawback is the strong parametric assumption, for example that the residuals of each component are normally distributed. The estimates may be biased if the model is misspecified. To relax the parametric assumption on the component error densities, a new method is proposed that estimates the mixture regression parameters assuming only that the components have log-concave error densities, without specifying a parametric family.
Two EM-type algorithms for mixtures of regression models with log-concave error densities are proposed. Numerical studies compare the performance of the proposed algorithms with the normal mixture EM algorithm. When the component error densities are not normal, the new methods have much smaller mean squared errors (MSEs) than the standard normal mixture EM algorithm. When the underlying component error densities are normal, the new methods perform comparably to the normal mixture EM algorithm.
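For orientation, the normal mixture EM baseline that the summary compares against can be sketched as follows. This is a minimal illustration of the textbook EM iteration for a mixture of linear regressions with normal errors, not the log-concave EM-type algorithms proposed in the paper. The function name `normal_fmr_em`, the random initialization, and the stopping rule are assumptions made for this sketch.

```python
import numpy as np

def normal_fmr_em(X, y, k=2, n_iter=200, tol=1e-8, seed=0):
    """Sketch of EM for a k-component mixture of linear regressions
    with normal errors (hypothetical helper, not from the paper)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    # Assumed starting point: random soft assignments of points to components.
    resp = rng.dirichlet(np.ones(k), size=n)
    beta = np.empty((k, p))
    sigma = np.empty(k)
    loglik = []
    for _ in range(n_iter):
        # M-step: mixing proportions, then weighted least squares per component.
        pi = resp.mean(axis=0)
        for j in range(k):
            w = resp[:, j]
            Xw = X * w[:, None]
            beta[j] = np.linalg.solve(Xw.T @ X, Xw.T @ y)
            r = y - X @ beta[j]
            # Small floor guards against a degenerate zero-variance component.
            sigma[j] = max(np.sqrt((w * r**2).sum() / w.sum()), 1e-8)
        # E-step: posterior component probabilities under normal error densities.
        resid = y[:, None] - X @ beta.T  # shape (n, k)
        log_dens = -0.5 * np.log(2 * np.pi) - np.log(sigma) - 0.5 * (resid / sigma) ** 2
        log_joint = np.log(pi) + log_dens
        m = log_joint.max(axis=1, keepdims=True)
        log_mix = m[:, 0] + np.log(np.exp(log_joint - m).sum(axis=1))
        resp = np.exp(log_joint - log_mix[:, None])
        # Observed-data log-likelihood at the current parameters.
        loglik.append(log_mix.sum())
        if len(loglik) > 1 and abs(loglik[-1] - loglik[-2]) < tol:
            break
    return pi, beta, sigma, resp, np.array(loglik)
```

Calling `normal_fmr_em(X, y, k=2)` with a design matrix `X` that includes an intercept column returns the mixing proportions, the per-component coefficients and error scales, the posterior membership probabilities, and the log-likelihood trace. The approach described in the summary would replace the normal density in the E-step with an estimated log-concave error density; that step is not shown here.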

MSC:

62-08 Computational methods for problems pertaining to statistics
62H30 Classification and discrimination; cluster analysis (statistical aspects)
62G07 Density estimation
