Modeling differences in the dimensionality of multiblock data by means of clusterwise simultaneous component analysis. (English) Zbl 1288.62179

Summary: Given multivariate multiblock data (e.g., subjects nested in groups are measured on multiple variables), one may be interested in the nature and number of dimensions that underlie the variables, and in differences in dimensional structure across data blocks. To this end, clusterwise simultaneous component analysis (SCA) was proposed, which simultaneously clusters blocks with a similar structure and performs an SCA per cluster. However, the number of components was restricted to be the same across clusters, which is often unrealistic. In this paper, this restriction is removed. The resulting challenges with respect to model estimation and selection are resolved.

MSC:

62P15 Applications of statistics to psychology
