Multiblock canonical correlation analysis for categorical variables: application to epidemiological data. (English) Zbl 1126.62045

Greenacre, Michael (ed.) et al., Multiple correspondence analysis and related methods. Selected papers based on the presentations at the international conference (CARME 2003), Barcelona, Spain, 29 June to 2 July 2003. Boca Raton, FL: Chapman & Hall/CRC (ISBN 1-58488-628-5/hbk). Chapman & Hall/CRC Statistics in the Social and Behavioral Sciences Series, 393-404 (2006).
From the text: We have discussed a method for the analysis of several categorical variables with the purpose of predicting some outcome. The multiblock canonical correlation analysis is similar to performing multiple correspondence analysis (MCA) on the explanatory variables and then superimposing the variable to be predicted as a supplementary variable. However, in the method that we propose, the variable to be predicted plays an active role, ensuring that the major principal axes will be related to this variable.
We have also introduced a tuning parameter that offers a continuum approach ranging from MCA performed only on the explanatory variables to MCA performed on all variables. When \(\alpha\) increases, the role of the variable to be predicted is increasingly taken into account and, as a consequence, the first principal axes are likely to be related to this variable. Another feature of the method is that it can be extended to the case where it is desirable to predict more than one categorical variable from other categorical variables. Regarding the tuning parameter \(\alpha\), it is clear that, for a particular situation, it can be fixed by means of cross-validation by seeking to maximize the number of correctly classified individuals. However, further research is needed to investigate its impact on the prediction ability. This research is currently underway in the form of a simulation study.
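The cross-validation step for fixing \(\alpha\) can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `select_alpha_by_cv`, the callable `fit_predict` (a hypothetical stand-in for fitting the multiblock method at a given \(\alpha\) and predicting the outcome category for held-out individuals), the number of folds, and the candidate grid are all assumptions made for illustration. Only the selection criterion, the number of correctly classified individuals, comes from the text.

```python
import numpy as np

def select_alpha_by_cv(X, y, alphas, fit_predict, n_folds=5, seed=0):
    """Pick the alpha with the most correctly classified held-out individuals.

    X           : (n, p) array of coded explanatory categorical variables
    y           : (n,) array of outcome categories
    alphas      : iterable of candidate tuning-parameter values (assumed grid)
    fit_predict : hypothetical callable (X_train, y_train, X_test, alpha)
                  returning predicted categories for X_test
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    # Shuffle the individuals once, then split them into roughly equal folds.
    folds = np.array_split(rng.permutation(n), n_folds)

    scores = {}
    for alpha in alphas:
        correct = 0
        for k in range(n_folds):
            test_idx = folds[k]
            train_idx = np.concatenate(
                [folds[j] for j in range(n_folds) if j != k]
            )
            y_pred = fit_predict(X[train_idx], y[train_idx], X[test_idx], alpha)
            # Count correctly classified held-out individuals.
            correct += int(np.sum(np.asarray(y_pred) == y[test_idx]))
        scores[alpha] = correct

    # On ties, the first candidate in the grid with the top score is kept.
    best_alpha = max(scores, key=scores.get)
    return best_alpha, scores
```

Every candidate value is scored on the same folds, so differences in the counts reflect \(\alpha\) rather than the random split. Any model that maps a training set and a value of \(\alpha\) to predicted categories can be passed as `fit_predict`.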
For the entire collection see [Zbl 1198.62062].

MSC:

62H20 Measures of association (correlation, canonical correlation, etc.)
62P10 Applications of statistics to biology and medical sciences; meta analysis