Matrix completion methods for causal panel data models. (English) Zbl 1506.15030

Summary: In this article, we study methods for estimating causal effects in settings with panel data, where some units are exposed to a treatment during some periods and the goal is to estimate counterfactual (untreated) outcomes for the treated unit/period combinations. We propose a class of matrix completion estimators that uses the observed elements of the matrix of control outcomes, corresponding to untreated unit/period combinations, to impute the “missing” elements of the control outcome matrix, corresponding to treated unit/period combinations. This leads to a matrix that approximates the original (incomplete) matrix well, but has lower complexity according to the nuclear norm for matrices. We generalize results from the matrix completion literature by allowing the patterns of missing data to have a time series dependency structure that is common in social science applications. We present novel insights concerning the connections between the matrix completion literature, the literature on interactive fixed effects models, and the literatures on program evaluation under unconfoundedness and synthetic control methods. We show that all these estimators can be viewed as focusing on the same objective function. They differ solely in the way they deal with identification, in some cases solely through regularization (our proposed nuclear norm matrix completion estimator) and in other cases primarily through imposing hard restrictions (the unconfoundedness and synthetic control approaches). The proposed method outperforms unconfoundedness-based or synthetic control estimators in simulations based on real data.
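
To make the imputation idea in the summary concrete, the following is a minimal sketch of nuclear-norm-regularized matrix completion by iterative soft-thresholding of singular values (the generic soft-impute iteration), written with NumPy. It is an illustration under stated assumptions, not the authors' estimator: the function name `soft_impute`, the penalty level `lam`, the absence of unit and time fixed effects, the scaling of the objective, and the simulated block-shaped treatment pattern are all choices made here for the example.

```python
import numpy as np


def soft_impute(Y, observed, lam, n_iter=500, tol=1e-7):
    """Approximately minimize 0.5 * ||P_O(Y - L)||_F^2 + lam * ||L||_* over L.

    Y        : (N, T) array of outcomes; entries where `observed` is False are ignored.
    observed : (N, T) boolean mask, True for untreated (control) unit/period cells.
    lam      : nuclear norm penalty (hypothetical tuning value, chosen by the user).
    """
    L = np.zeros(Y.shape, dtype=float)
    for _ in range(n_iter):
        # Keep observed control outcomes, fill treated cells with the current estimate.
        filled = np.where(observed, Y, L)
        # Soft-threshold the singular values of the filled-in matrix.
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        L_new = (U * np.maximum(s - lam, 0.0)) @ Vt
        converged = np.linalg.norm(L_new - L) <= tol * max(1.0, np.linalg.norm(L))
        L = L_new
        if converged:
            break
    return L


# Simulated example (all numbers are assumptions for illustration only).
rng = np.random.default_rng(0)
N, T, rank = 30, 20, 2
truth = rng.normal(size=(N, rank)) @ rng.normal(size=(T, rank)).T
Y = truth + 0.1 * rng.normal(size=(N, T))

# Block pattern: the last 10 units are treated in the last 5 periods,
# so their control outcomes are "missing" there.
observed = np.ones((N, T), dtype=bool)
observed[20:, 15:] = False

L_hat = soft_impute(Y, observed, lam=0.5)

# Error of imputed control outcomes on treated cells, versus imputing zeros.
rmse_imputed = np.sqrt(np.mean((L_hat[~observed] - truth[~observed]) ** 2))
rmse_zero = np.sqrt(np.mean(truth[~observed] ** 2))
print(f"RMSE on treated cells: {rmse_imputed:.3f} (zero baseline: {rmse_zero:.3f})")
```

Given imputed control outcomes, a treatment effect estimate for a treated cell would be the observed treated outcome minus the corresponding entry of `L_hat`. In practice the penalty would typically be chosen by cross-validation over observed cells rather than fixed in advance as it is here.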

MSC:

15A83 Matrix completion problems

Software:

softImpute
