The intra-cluster correlation coefficient in cluster randomized trials: a review of definitions. (English) Zbl 07882319

Summary: The intra-cluster correlation coefficient (ICC) of the primary outcome plays a key role in the design and analysis of cluster randomized trials (CRTs), but the precise definition of this parameter is somewhat elusive, especially in the context of non-normally distributed outcomes. In this paper, we provide a unified treatment of ICC as used in CRTs. We present a general definition of the ICC that may be expressed in different ways depending on the modelling approach used to describe the data, illustrating how this general definition is applied to continuous and dichotomous outcomes. Greater complexity arises for dichotomous outcomes; in particular, the usual definition of the ICC cannot be related directly to the parameters of the logistic-normal model that is commonly used for dichotomous outcomes. We show how the definition of the ICC is different when covariates are introduced. Finally, we use our framework and definition of the ICC to draw out implications for those interpreting and choosing values of the ICC when planning CRTs.
© 2009 The Authors. Journal compilation © 2009 International Statistical Institute
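
As an illustration of the quantities the summary refers to, the following is a minimal Python sketch, not taken from the paper. It assumes the standard one-way random effects definition of the ICC (between-cluster variance over total variance), the proportions-scale version for a dichotomous outcome, and the usual design effect for equal cluster sizes. All function and parameter names are hypothetical.

```python
def icc_from_variances(between_var: float, within_var: float) -> float:
    """ICC as the share of total outcome variance that lies between clusters."""
    total = between_var + within_var
    if total <= 0:
        raise ValueError("total variance must be positive")
    return between_var / total


def icc_binary(between_var_of_proportions: float, prevalence: float) -> float:
    """ICC for a 0/1 outcome on the proportions scale.

    Assumes the total variance of the outcome is p * (1 - p), where p is
    the overall prevalence, and that the between-cluster variance is the
    variance of the true cluster-level proportions.
    """
    if not 0 < prevalence < 1:
        raise ValueError("prevalence must lie strictly between 0 and 1")
    return between_var_of_proportions / (prevalence * (1 - prevalence))


def design_effect(cluster_size: float, icc: float) -> float:
    """Variance inflation for equal cluster sizes: 1 + (m - 1) * ICC."""
    return 1 + (cluster_size - 1) * icc


if __name__ == "__main__":
    rho = icc_from_variances(between_var=0.5, within_var=9.5)
    print("ICC:", rho)
    print("Design effect, 20 per cluster:", design_effect(20, rho))
```

The binary version works on the proportions scale, so it is not a function of the random-effect variance in a logistic-normal model; this matches the summary's point that the usual definition cannot be related directly to that model's parameters.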

MSC:

62Pxx Applications of statistics
62Hxx Multivariate analysis
62Jxx Linear inference, regression

Software:

MLwiN; WinBUGS; Stata
Full Text: DOI

References:

[1] Adams, G., Gulliford, M.C., Ukoumunne, O.C., Eldridge, S., Chinn, S. & Campbell, M.J. (2004). Patterns of intra‐cluster correlation from primary care research to inform study design and analysis. J. Clin. Epidemiol., 57(8), 785-794.
[2] Bodian, C.A. (1994). Intraclass correlation for 2‐By‐2 tables under 3 sampling designs. Biometrics, 50(1), 183-193. · Zbl 0825.62762
[3] Campbell, M.K., Fayers, P.M. & Grimshaw, J.M. (2005). Determinants of the intracluster correlation coefficient in cluster randomized trials: The case of implementation research. Clin. Trials, 2(2), 99-107.
[4] Commenges, D. & Jacquin, H. (1994). The intraclass correlation‐coefficient - Distribution‐free definition and test. Biometrics, 50(2), 517-526. · Zbl 0821.62029
[5] Cosby, R.H., Howard, M., Kaczorowski, J., Willan, A.R. & Sellors, J.W. (2003). Randomizing patients by family practice: sample size estimation, intracluster correlation and data analysis. Fam. Pract., 20(1), 77-82.
[6] Crump, J.A., Otieno, P.O., Slutsker, L., Keswick, B.H., Rosen, D.H., Hoekstra, R.M., Vulule, J.B. & Luby, S.P. (2005). Household based treatment of drinking water with flocculant‐disinfectant for preventing diarrhoea in areas with turbid source water in rural western Kenya: Cluster randomised controlled trial. British Med. J., 331(7515), 478.
[7] Donald, A. & Donner, A. (1987). Adjustments to the Mantel‐Haenszel chi‐square statistic and odds ratio variance estimator when the data are clustered. Stat. Med., 6(4), 491-499.
[8] Donner, A. (1986). A review of inference procedures for the intraclass correlation‐coefficient in the one‐way random effects model. Internat. Statist. Rev., 54(1), 67-82. · Zbl 0587.62141
[9] Donner, A. (1987). Adjustments to the Mantel-Haenszel chi‐square statistic and odds ratio variance estimator when the data are clustered. Stat. Med., 6(4), 491-499.
[10] Donner, A. & Klar, N. (1994). Cluster randomization trials in epidemiology - Theory and application. J. Statist. Plann. Inference, 42(1-2), 37-56. · Zbl 0825.62897
[11] Donner, A. & Klar, N. (2000). Design and Analysis of Cluster Randomised Trials in Health Research. London : Arnold.
[12] Donner, A. & Koval, J.J. (1980). Estimation of intra‐class correlation in the analysis of family data. Biometrics, 36(1), 19-25. · Zbl 0422.62092
[13] Donner, A. & Wells, G. (1986). A comparison of confidence‐interval methods for the intraclass correlation‐coefficient. Biometrics, 42(2), 401-412. · Zbl 0654.62089
[14] Eldridge, S.M., Ashby, D. & Kerry, S. (2006). Sample size for cluster randomized trials: effect of coefficient of variation of cluster size and analysis method. Int. J. Epidemiol., 35(5), 1292-1300.
[15] Elley, C.R., Kerse, N., Chondros, P. & Robinson, E. (2005). Intraclass correlation coefficients from three cluster randomised controlled trials in primary and residential health care. Aust. N. Z. J. Pub. Health, 29(5), 461-467.
[16] Evans, B.A., Feng, Z. & Peterson, A.V. (2001). A comparison of generalized linear mixed model procedures with estimating equations for variance and covariance parameter estimation in longitudinal studies and group randomized trials. Stat. Med., 20(22), 3353-3373.
[17] Ferrinho, P., Valli, A., Groeneveld, T., Buch, E. & Coetzee, D. (1992). The effects of cluster sampling in an African urban setting. J. Med., 38, 324-330.
[18] Fleiss, J.L. (1981). Statistical Methods for Rates and Proportions, 2nd ed. New York : Wiley. · Zbl 0544.62002
[19] Giraudeau, B. (2006). Model mis‐specification and overestimation of the intraclass correlation coefficient in cluster randomized trials. Stat. Med., 25(6), 957-964.
[20] Goldstein, H. (1995). Multilevel Statistical Models. London : Arnold.
[21] Goldstein, H., Browne, W. & Rasbash, J. (2002). Partitioning variation in multilevel models. Underst. Stat., 1, 223-232.
[22] Griffiths, C., Foster, G., Barnes, N., Eldridge, S., Tate, H., Begum, S., Wiggins, M., Dawson, C., Livingstone, A.E., Chambers, M., Coats, T., Harris, R. & Feder, G.S. (2004). Specialist nurse intervention to reduce unscheduled asthma care in a deprived multiethnic area: the east London randomised controlled trial for high risk asthma (ELECTRA). British Med. J., 328(7432), 144.
[23] Gulliford, M.C., Adams, G., Ukoumunne, O.C., Latinovic, R., Chinn, S. & Campbell, M.J. (2005). Intraclass correlation coefficient and outcome prevalence are associated in clustered binary data. J. Clin. Epidemiol., 58(3), 246-251.
[24] Gulliford, M.C., Ukoumunne, O.C. & Chinn, S. (1999). Components of variance and intraclass correlations for the design of community‐based surveys and intervention studies: data from the Health Survey for England 1994. Amer. J. Epidemiol., 149(9), 876-883.
[25] Harris, J.A. (1913). On the calculation of intra‐class and inter‐class coefficients of correlation from class moments when the number of possible combinations is large. Biometrika, 9(3/4), 446-472.
[26] Harville, D.A. (1997). Matrix Algebra from a Statistician’s Perspective. New York : Springer. · Zbl 0881.15001
[27] Kang, D.W., Schwartz, J.B. & Verotta, D. (2004). A sample size computation method for non‐linear mixed effects models with applications to pharmacokinetics models. Stat. Med., 23(16), 2551-2566.
[28] Karlin, S., Cameron, E.C. & Williams, P.T. (1981). Sibling and parent‐offspring correlation estimation with variable family size. Proc. Nat. Acad. Sci. U.S.A., 78(5), 2664-2668. · Zbl 0489.62095
[29] Katz, J., Carey, V.J., Zeger, S.L. & Sommer, A. (1993). Estimation of design effects and diarrhea clustering within households and villages. Amer. J. Epidemiol., 138(11), 994-1006.
[30] Kish, L. (1965). Survey Sampling. New York : John Wiley. · Zbl 0151.23403
[31] Lee, D. & Dubin, N. (1994). Estimation and sample size considerations for clustered binary responses. Stat. Med., 13, 1241-1252.
[32] Liang, K.Y. & Zeger, S.L. (1986). Longitudinal data analysis using generalized linear models. Biometrika, 73, 13-22. · Zbl 0595.62110
[33] Mak, T.K. (1988). Analyzing intraclass correlation for dichotomous‐variables. Appl. Stat. J. Roy. Stat. Soc. Ser. C, 37(3), 344-352.
[34] Molenberghs, G., Fitzmaurice, G.M. & Lipsitz, S.R. (1996). Efficient estimation of the intraclass correlation for a binary trait. J. Agric. Biol. Environ. Stat., 1(1), 78-96.
[35] Muller, R. & Buttner, P. (1994). A critical discussion of intraclass correlation‐coefficients. Stat. Med., 13(23-24), 2465-2476.
[36] Neuhaus, J.M. (1993). Estimation efficiency and tests of covariate effects with clustered binary data. Biometrics, 49(4), 989-996.
[37] Neuhaus, J.M. & Jewell, N.P. (1990). Some comments on Rosner’s multiple logistic model for clustered data. Biometrics, 46(2), 523-531.
[38] Neuhaus, J.M., Kalbfleisch, J.D. & Hauck, W.W. (1991). A comparison of cluster‐specific and population‐averaged approaches for analyzing correlated binary data. Internat. Statist. Rev., 59(1), 25-35.
[39] Pal, N. & Lim, W.K. (2004). On intra‐class correlation coefficient estimation. Statist. Papers, 45(3), 369-392. · Zbl 1048.62030
[40] Parker, D.R., Evangelou, E. & Eaton, C.B. (2005). Intraclass correlation coefficients for cluster randomized trials in primary care: The cholesterol education and research trial (CEART). Contemp. Clin. Trials, 26(2), 260-267.
[41] Prentice, R.L. (1988). Correlated binary regression with covariates specific to each binary observation. Biometrics, 44(4), 1033-1048. · Zbl 0715.62145
[42] Rasbash, J., Browne, W., Goldstein, H., Yang, M., Plewis, I., Healy, M., Woodhouse, G., Draper, D., Langford, I. & Lewis, T. (2000). A User’s Guide to MLwiN. London : Institute of Education.
[43] Ridout, M.S., Demetrio, C.G.B. & Firth, D. (1999). Estimating intraclass correlation for binary data. Biometrics, 55(1), 137-148. · Zbl 1059.62601
[44] Schemper, M. (1986). General derivation of intraclass correlation‐coefficients. Biom. J., 28(4), 485-489. · Zbl 0607.62063
[45] Searle, S.R. (1971). Topics in variance component estimation. Biometrics, 27, 1-76.
[46] Searle, S.R., Casella, G. & McCulloch, C.E. (1992). Variance Components. New York : Wiley. · Zbl 0850.62007
[47] Sham, P. (1998). Statistics in Human Genetics. London : Arnold. · Zbl 0895.62109
[48] Siddiqui, O., Hedeker, D., Flay, B.R. & Hu, F.B. (1996). Intraclass correlation estimates in a school‐based smoking prevention study - Outcome and mediating variables, by sex and ethnicity. Amer. J. Epidemiol., 144(4), 425-433.
[49] Snijders, T. & Bosker, R. (1999). Multilevel Analysis; an Introduction to Basic and Advanced Multilevel Modelling. London : Sage. · Zbl 0953.62127
[50] Spiegelhalter, D., Thomas, A. & Best, N. WinBUGS Version 1.4. Available at http://www.mrc-bsu.cam.ac.uk.
[51] StataCorp. (2007). Stata Statistical Software: Release 10. College Station, TX : StataCorp LLP.
[52] Streiner, D.L. & Norman, G.R. (2003). Health Measurement Scales, 3rd ed. Oxford : Oxford University Press.
[53] Turner, R.M., Omar, R.Z. & Thompson, S.G. (2001). Bayesian methods of analysis for cluster randomized trials with binary outcome data. Stat. Med., 20(3), 453-472.
[54] Turner, R.M., Omar, R.Z. & Thompson, S.G. (2006). Constructing intervals for the intracluster correlation coefficient using Bayesian modelling, and application in cluster randomized trials. Stat. Med., 25(9), 1443-1456.
[55] Werner, J. (1977). Computer‐program for calculating different kinds of intraclass correlation‐coefficients. Comput. Programs Biomed., 7(2), 125-127.
[56] Zou, K.H. & Normand, S.L. (2001). On determination of sample size in hierarchical binomial models. Stat. Med., 20(14), 2163-2182.