Connections between survey calibration estimators and semiparametric models for incomplete data. (English. French summary) Zbl 1422.62048

Summary: Survey calibration (or generalized raking) estimators are a standard approach to the use of auxiliary information in survey sampling, improving on the simple Horvitz-Thompson estimator. In this paper we relate the survey calibration estimators to the semiparametric incomplete-data estimators of Robins and coworkers, and to adjustment for baseline variables in a randomized trial. The development based on calibration estimators explains the “estimated weights” paradox and provides useful heuristics for constructing practical estimators. We present some examples of using calibration to gain precision without making additional modelling assumptions in a variety of regression models.
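
Illustration (not part of the original summary): a minimal sketch of how linear calibration adjusts Horvitz-Thompson design weights, assuming a simple random sample, a single auxiliary variable x whose population total is known, and the chi-square distance form of calibration. The function names, the simulated population, and the choice of Python are assumptions made for this sketch; the paper's own examples are not reproduced here.

```python
import random


def horvitz_thompson_total(y, d):
    """Weighted estimate of a population total: sum of d_i * y_i."""
    return sum(di * yi for di, yi in zip(d, y))


def linear_calibration_weights(x, d, pop_size, pop_total_x):
    """Linear (chi-square distance) calibration to two known totals.

    The auxiliary vector is (1, x_i), so the calibrated weights
    w_i = d_i * (1 + lam0 + lam1 * x_i) reproduce both the known
    population size and the known population total of x.
    """
    s0 = sum(d)
    s1 = sum(di * xi for di, xi in zip(d, x))
    s2 = sum(di * xi * xi for di, xi in zip(d, x))
    # Gap between the known totals and their design-weighted estimates.
    r0 = pop_size - s0
    r1 = pop_total_x - s1
    # Solve the 2x2 system [[s0, s1], [s1, s2]] @ (lam0, lam1) = (r0, r1).
    det = s0 * s2 - s1 * s1
    lam0 = (s2 * r0 - s1 * r1) / det
    lam1 = (s0 * r1 - s1 * r0) / det
    return [di * (1.0 + lam0 + lam1 * xi) for di, xi in zip(d, x)]


# Hypothetical simulated population in which y is roughly linear in x.
random.seed(1)
N, n = 1000, 100
pop_x = [random.uniform(0, 10) for _ in range(N)]
pop_y = [2.0 + 3.0 * xi + random.gauss(0, 1) for xi in pop_x]

# Simple random sample without replacement; design weight N/n for every unit.
idx = random.sample(range(N), n)
x = [pop_x[i] for i in idx]
y = [pop_y[i] for i in idx]
d = [N / n] * n

w = linear_calibration_weights(x, d, N, sum(pop_x))
print("true total      ", sum(pop_y))
print("Horvitz-Thompson", horvitz_thompson_total(y, d))
print("calibrated      ", horvitz_thompson_total(y, w))
```

In this sketch the calibrated weights reproduce the known totals of the auxiliary variable exactly, so any part of y that is linear in x is estimated without sampling error. That is one way to read the summary's claim that calibration gains precision without additional modelling assumptions.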

MSC:

62D05 Sampling theory, sample surveys
62N02 Estimation in survival analysis and censored data
62P10 Applications of statistics to biology and medical sciences; meta analysis

Software:

Survey
