Outlier-resistant estimators for average treatment effect in causal inference. (English) Zbl 07796624

Summary: The inverse probability weighting (IPW) and doubly robust (DR) estimators are often used to estimate the average treatment effect (ATE), but are vulnerable to outliers. The IPW/DR median provides outlier-resistant estimation of the ATE, but its resistance is limited and is insufficient under heavy contamination. We propose extending the IPW/DR estimators using density power weighting, which eliminates the effects of outliers almost completely. The resistance of the proposed estimators to outliers is evaluated using the unbiasedness of the estimating equations. Unlike the median-based methods, our estimators are resistant to outliers, even under heavy contamination. Interestingly, the naive extension of the DR estimator requires a bias correction to maintain its double robustness, even under the most tractable form of contamination. In addition, the proposed estimators are found to be highly resistant to outliers in more difficult settings in which the contamination ratio depends on the covariates. The resistance of our estimators to outliers is also favorable from the viewpoint of the influence function. We verify our theoretical results using Monte Carlo simulations and a real-data analysis. The proposed methods are shown to have greater resistance to outliers than the median-based methods, and they estimate the potential mean with a smaller error.
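
The contrast drawn in the summary can be illustrated with a small Python sketch that compares a normalized IPW mean, an IPW-weighted median, and a density-power-weighted IPW location estimate on contaminated data. This is not the authors' implementation. The function names, the Gaussian working model, the fixed MAD-based scale, the tuning value gamma = 0.5, the known propensity scores, and the simulated data are all assumptions made here for illustration; the paper's estimators (in particular the bias-corrected DR version) may differ in detail.

```python
import numpy as np

def ipw_mean(y, t, ps):
    """Normalized IPW estimate of the potential mean E[Y(1)]."""
    m = t == 1
    w = 1.0 / ps[m]
    return np.sum(w * y[m]) / np.sum(w)

def ipw_median(y, t, ps):
    """Weighted median of treated outcomes with weights 1/ps."""
    m = t == 1
    order = np.argsort(y[m])
    yy, ww = y[m][order], (1.0 / ps[m])[order]
    cum = np.cumsum(ww) / np.sum(ww)
    return yy[np.searchsorted(cum, 0.5)]

def dp_ipw_mean(y, t, ps, gamma=0.5, n_iter=200, tol=1e-10):
    """Hypothetical density-power-weighted IPW location estimate.

    Assumes a Gaussian working model with a fixed scale, so each treated
    outcome is down-weighted by the Gaussian density raised to the power
    gamma. The weighted estimating equation is solved by fixed-point
    iteration, starting from the IPW median.
    """
    m = t == 1
    yy, w = y[m], 1.0 / ps[m]
    # Robust scale (normalized MAD of treated outcomes); an assumption.
    sigma = 1.4826 * np.median(np.abs(yy - np.median(yy)))
    mu = ipw_median(y, t, ps)
    for _ in range(n_iter):
        k = np.exp(-gamma * (yy - mu) ** 2 / (2.0 * sigma ** 2))
        mu_new = np.sum(w * k * yy) / np.sum(w * k)
        if abs(mu_new - mu) < tol:
            mu = mu_new
            break
        mu = mu_new
    return mu

# Simulated example: true E[Y(1)] = 1, with about 10% outliers near 20.
rng = np.random.default_rng(0)
n = 4000
x = rng.normal(size=n)
ps = 1.0 / (1.0 + np.exp(-0.5 * x))       # propensity score, taken as known
t = rng.binomial(1, ps)
y_clean = 1.0 + x + rng.normal(size=n)    # potential outcome Y(1)
outlier = rng.random(n) < 0.10
y = np.where(outlier, rng.normal(20.0, 1.0, size=n), y_clean)

est_ipw = ipw_mean(y, t, ps)
est_med = ipw_median(y, t, ps)
est_dp = dp_ipw_mean(y, t, ps)
print(est_ipw, est_med, est_dp)
```

In this toy setting the plain IPW mean is pulled strongly toward the outliers, the weighted median is shifted less, and the density-power-weighted estimate stays close to the uncontaminated mean, which is the qualitative behaviour the summary describes.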

MSC:

62D20 Causal inference from observational studies
62D10 Missing data
62F35 Robustness and adaptive procedures (parametric inference)

References:

[1] Bang, H. and Robins, J. M. (2005). Doubly robust estimation in missing data and causal inference models. Biometrics 61, 962-973. · Zbl 1087.62121
[2] Basu, A., Harris, I. R., Hjort, N. L. and Jones, M. (1998). Robust and efficient estimation by minimising a density power divergence. Biometrika 85, 549-559. · Zbl 0926.62021
[3] Canavire-Bacarreza, G., Castro Peñarrieta, L. and Ugarte Ontiveros, D. (2021). Outliers in Semi-Parametric estimation of treatment effects. Econometrics 9, 19.
[4] Díaz, I. (2017). Efficient estimation of quantiles in missing data models. Journal of Statistical Planning and Inference 190, 39-51. · Zbl 1376.62017
[5] Firpo, S. (2007). Efficient semiparametric estimation of quantile treatment effects. Econometrica 75, 259-276. · Zbl 1201.62043
[6] Fujisawa, H. (2013). Normalized estimating equation for robust parameter estimation. Electronic Journal of Statistics 7, 1587-1606. · Zbl 1327.62182
[7] Fujisawa, H. and Eguchi, S. (2008). Robust parameter estimation with a small bias against heavy contamination. Journal of Multivariate Analysis 99, 2053-2081. · Zbl 1169.62010
[8] Hampel, F. R., Ronchetti, E. M., Rousseeuw, P. J. and Stahel, W. A. (2011). Robust Statistics: The Approach Based on Influence Functions. John Wiley & Sons.
[9] Hernán, M. A. and Robins, J. M. (2020). Causal Inference: What If. Chapman & Hall/CRC, Boca Raton.
[10] Hoshino, T. (2007). Doubly robust-type estimation for covariate adjustment in latent variable modeling. Psychometrika 72, 535-549. · Zbl 1291.62213
[11] Huber, P. J. (2004). Robust Statistics. John Wiley & Sons.
[12] Imbens, G. W. and Rubin, D. B. (2015). Causal Inference in Statistics, Social, and Biomedical Sciences. Cambridge University Press. · Zbl 1355.62002
[13] Jones, M., Hjort, N. L., Harris, I. R. and Basu, A. (2001). A comparison of related density-based minimum divergence estimators. Biometrika 88, 865-873. · Zbl 1180.62047
[14] Kanamori, T. and Fujisawa, H. (2015). Robust estimation under heavy contamination using unnormalized models. Biometrika 102, 559-572. · Zbl 1452.62242
[15] Kawashima, T. and Fujisawa, H. (2017). Robust and sparse regression via γ-divergence. Entropy 19, 608.
[16] Lunceford, J. K. and Davidian, M. (2004). Stratification and weighting via the propensity score in estimation of causal treatment effects: A comparative study. Statistics in Medicine 23, 2937-2960.
[17] Maechler, M., Rousseeuw, P., Croux, C., Todorov, V., Ruckstuhl, A., Salibian-Barrera, M. et al. (2021). robustbase: Basic Robust Statistics, R package (R package version 0.93.9). Web: https://robustbase.R-forge.R-project.org/.
[18] Maronna, R. A., Martin, R. D., Yohai, V. J. and Salibián-Barrera, M. (2019). Robust Statistics: Theory and Methods (with R). John Wiley & Sons. · Zbl 1409.62009
[19] Robins, J. M. and Rotnitzky, A. (1995). Semiparametric efficiency in multivariate regression models with missing data. Journal of the American Statistical Association 90, 122-129. · Zbl 0818.62043
[20] Robins, J. M., Rotnitzky, A. and Zhao, L. P. (1994). Estimation of regression coefficients when some regressors are not always observed. Journal of the American Statistical Association 89, 846-866. · Zbl 0815.62043
[21] Rosenbaum, P. R. and Rubin, D. B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika 70, 41-55. · Zbl 0522.62091
[22] Rousseeuw, P. J. and van Zomeren, B. C. (1990). Unmasking multivariate outliers and leverage points. Journal of the American Statistical Association 85, 633-639.
[23] Scharfstein, D. O., Rotnitzky, A. and Robins, J. M. (1999). Adjusting for nonignorable drop-out using semiparametric nonresponse models. Journal of the American Statistical Association 94, 1096-1120. · Zbl 1072.62644
[24] Sued, M., Valdora, M. and Yohai, V. (2020). Robust doubly protected estimators for quantiles with missing data. TEST 29, 819-843. · Zbl 1458.62092
[25] Tsiatis, A. (2006). Semiparametric Theory and Missing Data. Springer New York. · Zbl 1105.62002
[26] Van der Laan, M. J. and Rubin, D. (2006). Targeted maximum likelihood learning. The International Journal of Biostatistics 2.
[27] Van der Vaart, A. W. (2000). Asymptotic Statistics. Cambridge University Press. · Zbl 0943.62002
[28] Windham, M. P. (1995). Robustifying model fitting. Journal of the Royal Statistical Society. Series B (Methodological), 599-609. · Zbl 0827.62030
[29] Yohai, V. J. (1987). High breakdown-point and high efficiency robust estimates for regression. The Annals of Statistics, 642-656. · Zbl 0624.62037
[30] Zhang, Z., Chen, Z., Troendle, J. F. and Zhang, J. (2012). Causal inference on quantiles with an obstetric application. Biometrics 68, 697-706. · Zbl 1272.62102

Kazuharu Harada, Department of Health Data Science, Tokyo Medical University, Shinjuku-ku, Tokyo 160-8402, Japan. E-mail: haradak@tokyo-med.ac.jp
Hironori Fujisawa, Department of Statistical Inference and Mathematics, The Institute of Statistical Mathematics, Tachikawa, Tokyo 190-8562, Japan. E-mail: fujisawa@ism.ac.jp
(Received July 2021; accepted April 2022)