
Local model uncertainty and incomplete-data bias. (English) Zbl 1095.62035

Summary: Problems of the analysis of data with incomplete observations are all too familiar in statistics. They are doubly difficult if we are also uncertain about the choice of a model. We propose a general formulation for the discussion of such problems and develop approximations to the resulting bias of maximum likelihood estimates on the assumption that model departures are small. Loss of efficiency in parameter estimation due to incompleteness in the data has a dual interpretation: the increase in variance when an assumed model is correct; the bias in estimation when the model is incorrect. Examples include non-ignorable missing data, hidden confounders in observational studies and publication bias in meta-analysis. Doubling variances before calculating confidence intervals or test statistics is suggested as a crude way of addressing the possibility of undetectably small departures from the model. The problem of assessing the risk of lung cancer from passive smoking is used as a motivating example.
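The variance-doubling suggestion in the summary can be made concrete with a minimal sketch. It assumes a normal-approximation (Wald) interval. The function name, the 1.96 critical value and the numerical inputs are illustrative assumptions and are not taken from the paper.

```python
import math


def confidence_interval(estimate, variance, z=1.96, inflate=1.0):
    """Wald-type interval; `inflate` multiplies the variance before the interval is formed.

    `inflate=2.0` corresponds to the crude variance-doubling rule mentioned in the summary.
    """
    half_width = z * math.sqrt(inflate * variance)
    return estimate - half_width, estimate + half_width


# Hypothetical estimate and variance, chosen only for illustration.
estimate, variance = 0.20, 0.01

standard = confidence_interval(estimate, variance)
doubled = confidence_interval(estimate, variance, inflate=2.0)

# Doubling the variance widens the interval by a factor of sqrt(2).
ratio = (doubled[1] - doubled[0]) / (standard[1] - standard[0])
print(standard, doubled, ratio)
```

Under these assumptions the interval widens by a factor of about 1.41, so an estimate that is borderline significant under the assumed model may no longer be so once small model departures are allowed for.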

MSC:

62F99 Parametric inference
62P10 Applications of statistics to biology and medical sciences; meta analysis
