Design and analysis of bipartite experiments under a linear exposure-response model. (English) Zbl 07650533

Summary: A bipartite experiment consists of one set of units being assigned treatments and another set of units for which we measure outcomes. The two sets of units are connected by a bipartite graph, governing how the treated units can affect the outcome units. In this paper, we consider estimation of the average total treatment effect in the bipartite experimental framework under a linear exposure-response model. We introduce the Exposure Reweighted Linear (ERL) estimator, and show that the estimator is unbiased, consistent and asymptotically normal, provided that the bipartite graph is sufficiently sparse. To facilitate inference, we introduce an unbiased and consistent estimator of the variance of the ERL point estimator. Finally, we introduce a cluster-based design, Exposure-Design, that uses heuristics to increase the precision of the ERL estimator by realizing a desirable exposure distribution.
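The unbiasedness claim can be illustrated with a small sketch. The summary does not state the model or the estimator, so everything here is an assumption made for illustration. The exposure of outcome unit i is taken to be x_i = sum_j w_ij z_j, with non-negative weights summing to one in each row. Treatments z_j are independent Bernoulli(p). Outcomes follow y_i = alpha_i + beta_i x_i. The estimator averages y_i (x_i - E[x_i]) / Var(x_i) over outcome units. The function names and this particular form are hypothetical and should not be read as the paper's definition of the ERL estimator. Under these assumptions the average total treatment effect equals the mean of the beta_i, and enumerating every assignment shows that the estimator's expectation matches it.

```python
from itertools import product


def exposures(weights, z):
    """Exposure of each outcome unit: weighted sum of the treatments it is linked to."""
    return [sum(w * zj for w, zj in zip(row, z)) for row in weights]


def erl_style_estimate(weights, z, y, p):
    """Illustrative exposure-reweighted estimate (assumed form, not the paper's).

    Assumes independent Bernoulli(p) treatments with 0 < p < 1, so that
    E[x_i] = p * sum_j w_ij and Var(x_i) = p * (1 - p) * sum_j w_ij**2.
    """
    x = exposures(weights, z)
    total = 0.0
    for row, xi, yi in zip(weights, x, y):
        mean_x = p * sum(row)
        var_x = p * (1 - p) * sum(w * w for w in row)
        total += yi * (xi - mean_x) / var_x
    return total / len(y)


def expected_estimate(weights, alpha, beta, p):
    """Exact expectation of the estimate, enumerating all 2^m assignments."""
    m = len(weights[0])
    expectation = 0.0
    for z in product([0, 1], repeat=m):
        prob = 1.0
        for zj in z:
            prob *= p if zj else 1 - p
        x = exposures(weights, z)
        # Linear exposure-response outcomes (assumed model).
        y = [a + b * xi for a, b, xi in zip(alpha, beta, x)]
        expectation += prob * erl_style_estimate(weights, z, y, p)
    return expectation


# Toy bipartite graph: 3 outcome units (rows), 4 treated units (columns).
weights = [
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.25, 0.75, 0.0],
    [0.25, 0.25, 0.25, 0.25],
]
alpha = [1.0, -2.0, 0.5]
beta = [2.0, 1.0, -0.5]
p = 0.3

# All-treated minus none-treated outcome is beta_i because each row sums to one.
true_effect = sum(beta) / len(beta)
print(true_effect, expected_estimate(weights, alpha, beta, p))
```

The enumeration is only feasible for a handful of treated units; it is used here because it checks unbiasedness exactly rather than by simulation. The sparsity condition mentioned in the summary concerns consistency and asymptotic normality, which this toy check does not address.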

MSC:

62K99 Design of statistical experiments
62D10 Missing data
62G99 Nonparametric inference
