Abstract
In a nested case–control study, controls are selected for each case from the individuals who are at risk at the time at which the case occurs. We say that the controls are matched on study time. To adjust for possible confounding, it is common to match on other variables as well. The standard analysis of nested case–control data is based on a partial likelihood that compares the covariates of each case with those of its matched controls. It has been suggested that one may break the matching of nested case–control data and analyse them as case–cohort data using an inverse probability weighted (IPW) pseudo-likelihood. Further, when some covariates are available for all individuals in the cohort, multiple imputation (MI) makes it possible to use all available data in the cohort. In this paper we review the standard method and the IPW and MI approaches, and compare their performance using simulations that cover a range of scenarios, including one and two endpoints.
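To make the sampling design concrete, the following is a minimal Python sketch of risk-set sampling and of one possible set of inclusion probabilities for an IPW analysis. It is an illustration under stated assumptions, not the authors' implementation: the function names, the toy data, and the absence of left truncation, tied event times and additional matching variables are all assumptions made here. The inclusion probability used is the product form commonly associated with Samuelsen-type weights, where a non-case escapes sampling at each case time at which it is at risk with probability 1 - m/(n - 1), n being the number at risk.

```python
import random

def sample_controls(times, events, m, seed=0):
    """For each case, draw up to m controls from those still at risk.

    times  : follow-up time per subject (assumed: everyone enters at time 0)
    events : 1 if the subject is a case at its time, 0 if censored
    m      : number of controls per case (hypothetical design parameter)
    """
    rng = random.Random(seed)
    sampled_sets = []
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if d_i != 1:
            continue
        # Risk set at the case's event time, excluding the case itself.
        at_risk = [k for k, t_k in enumerate(times) if t_k >= t_i and k != i]
        controls = rng.sample(at_risk, min(m, len(at_risk)))
        sampled_sets.append((i, controls))
    return sampled_sets

def inclusion_probabilities(times, events, m):
    """Probability that each subject is ever included in the sample.

    Cases are included with probability 1. A non-case is missed at a case
    time t_j (while at risk) with probability 1 - m / (n(t_j) - 1).
    """
    case_times = [t for t, d in zip(times, events) if d == 1]
    probs = []
    for t_i, d_i in zip(times, events):
        if d_i == 1:
            probs.append(1.0)
            continue
        p_never = 1.0
        for t_j in case_times:
            if t_j <= t_i:  # subject is at risk when this case occurs
                n_at_risk = sum(1 for t_k in times if t_k >= t_j)
                p_never *= 1.0 - min(1.0, m / (n_at_risk - 1))
        probs.append(1.0 - p_never)
    return probs

# Toy cohort (made-up numbers): five subjects, two cases, one control per case.
toy_times = [1, 2, 3, 4, 5]
toy_events = [1, 0, 1, 0, 0]
toy_sets = sample_controls(toy_times, toy_events, m=1)
toy_probs = inclusion_probabilities(toy_times, toy_events, m=1)
# toy_probs is [1.0, 0.25, 1.0, 0.625, 0.625]
```

In an IPW analysis that breaks the matching, each sampled control would be weighted by the inverse of its inclusion probability, whereas the standard partial likelihood uses the matched sets returned by `sample_controls` directly.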
Acknowledgments
Most of this research was done while Ørnulf Borgan was visiting the Department of Medical Statistics at the London School of Hygiene and Tropical Medicine in the spring of 2014. The department is acknowledged for its hospitality and for providing the best working facilities. We also want to thank Nathalie Støer for letting us use her new R package multipleNCC before it was made publicly available.
Cite this article
Borgan, Ø., Keogh, R. Nested case–control studies: should one break the matching? Lifetime Data Anal 21, 517–541 (2015). https://doi.org/10.1007/s10985-015-9319-y
DOI: https://doi.org/10.1007/s10985-015-9319-y