Clusterwise elastic-net regression based on a combined information criterion. (English) Zbl 07702473

Summary: Many research questions pertain to a regression problem in which the population under study is assumed not to be homogeneous with respect to the underlying model. In this setting, we propose an original method called Combined Information criterion CLUSterwise elastic-net regression (Ciclus). This method addresses several methodological and application-related challenges. It is derived from both information theory and microeconomic utility theory, and it maximizes a well-defined criterion combining three weighted sub-criteria, each related to a specific aim: a parsimonious partition, compact clusters for better prediction of cluster membership, and a good within-cluster regression fit. Under mild assumptions, the solving algorithm is monotonically convergent. The Ciclus principle provides an innovative solution to two key issues: (i) the automatic selection of the number of clusters, and (ii) the construction of a prediction model. We apply it to elastic-net regression so as to handle high-dimensional data involving redundant explanatory variables. Ciclus is illustrated through both a simulation study and a real example in the field of omics data, showing how it improves prediction quality and facilitates interpretation. It should therefore prove useful whenever the data involve a population mixture, as for example in biology, the social sciences, economics or marketing.
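To make the clusterwise idea concrete, the sketch below illustrates only the generic alternating scheme that methods of this family build on: fit one elastic-net model per cluster, reassign each observation to the cluster whose model gives the smallest squared residual, and iterate until the partition stabilizes. It is a minimal illustration under assumed inputs (a design matrix X, a response y, a fixed number of clusters and fixed penalty parameters); it does not implement the Ciclus criterion, its weighted sub-criteria, or the automatic choice of the number of clusters described above.

import numpy as np
from sklearn.linear_model import ElasticNet

def clusterwise_elastic_net(X, y, n_clusters=3, alpha=0.1, l1_ratio=0.5, n_iter=20, seed=0):
    # Balanced random initial partition so every cluster starts non-empty.
    rng = np.random.default_rng(seed)
    labels = rng.permutation(np.arange(len(y)) % n_clusters)
    models = [ElasticNet(alpha=alpha, l1_ratio=l1_ratio) for _ in range(n_clusters)]
    for _ in range(n_iter):
        # Refit one elastic-net regression on each current cluster.
        for k in range(n_clusters):
            mask = labels == k
            if mask.sum() > 1:
                models[k].fit(X[mask], y[mask])
        # Reassign each observation to the cluster with the smallest squared residual.
        residuals = np.column_stack([(y - m.predict(X)) ** 2 for m in models])
        new_labels = residuals.argmin(axis=1)
        if np.array_equal(new_labels, labels):  # partition is stable: stop
            break
        labels = new_labels
    return labels, models

Predicting the response of a new observation additionally requires a cluster-membership rule (for instance a classifier trained on the final partition), which is precisely one of the two issues the Ciclus principle addresses.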

MSC:

62H30 Classification and discrimination; cluster analysis (statistical aspects)
62H25 Factor analysis and principal components; correspondence analysis
91C20 Clustering in the social and behavioral sciences
Full Text: DOI
