×

Maximum likelihood estimation for incomplete multinomial data via the weaver algorithm. (English) Zbl 1406.62059

Summary: In a multinomial model, the sample space is partitioned into a disjoint union of cells. The partition is usually immutable during sampling of the cell counts. In this paper, we extend the multinomial model to the incomplete multinomial model by relaxing the constant partition assumption to allow the cells to be variable and the counts collected from non-disjoint cells to be modeled in an integrated manner for inference on the common underlying probability. The incomplete multinomial likelihood is parameterized by the complete-cell probabilities from the most refined partition. Its sufficient statistics include the variable-cell formation observed as an indicator matrix and all cell counts. With externally imposed structures on the cell formation process, it reduces to special models including the Bradley-Terry model, the Plackett-Luce model, etc. Since the conventional method, which solves for the zeros of the score functions, is unfruitful, we develop a new approach to establishing a simpler set of estimating equations to obtain the maximum likelihood estimate (MLE), which seeks the simultaneous maximization of all multiplicative components of the likelihood by fitting each component into an inequality. As a consequence, our estimation amounts to solving a system of the equality attainment conditions to the inequalities. The resultant MLE equations are simple and immediately invite a fixed-point iteration algorithm for solution, which is referred to as the weaver algorithm. The weaver algorithm is short and amenable to parallel implementation. We also derive the asymptotic covariance of the MLE, verify main results with simulations, and compare the weaver algorithm with an MM/EM algorithm based on fitting a Plackett-Luce model to a benchmark data set.

MSC:

62H17 Contingency tables
62F07 Statistical ranking and selection procedures
65K10 Numerical optimization and variational techniques
62G07 Density estimation
Full Text: DOI

References:

[1] Agresti, A.: Categorical Data Analysis, 2nd edn. Wiley, New York (2003) · Zbl 0716.62001
[2] Bradley, RA; Terry, ME, Rank analysis of incomplete block designs: I. the method of paired comparisons, Biometrika, 39, 324-345, (1952) · Zbl 0047.12903 · doi:10.2307/2334029
[3] Caron, F; Doucet, A, Efficient Bayesian inference for generalized Bradley-Terry models, J. Comput. Graph. Stat., 21, 174-196, (2012) · doi:10.1080/10618600.2012.638220
[4] Chen, T; Fienberg, SE, The analysis of contingency tables with incompletely classified data, Biometrics, 32, 133-144, (1976) · Zbl 0328.62039 · doi:10.2307/2529344
[5] Connor, RJ; Mosimann, JE, Concepts of independence for proportions with a generalization of the Dirichlet distribution, J. Am. Stat. Assoc., 64, 194-206, (1969) · Zbl 0179.24101 · doi:10.1080/01621459.1969.10500963
[6] Cox, D.A., Little, J., O’Shea, D.: Ideals, Varieties, and Algorithm: An Introduction to Computational Algebraic Geometry and Commutative Algebra, 3rd edn. Springer, New York (2007) · Zbl 1118.13001 · doi:10.1007/978-0-387-35651-8
[7] David, H.A.: The Method of Paired Comparisons, 2nd edn. Oxford University Press, Oxford (1988) · Zbl 0665.62075
[8] Davidson, R; Farquhar, P, A bibliography on the method of paired comparisons, Biometrics, 32, 241-252, (1976) · Zbl 0336.62067
[9] Dempster, AP; Laird, NM; Rubin, DB, Maximum likelihood from incomplete data via the EM algorithm, J. R. Stat. Soc. Ser. B (Methodol.), 39, 1-38, (1977) · Zbl 0364.62022
[10] Diaconis, P.: In: Gupta, S.S. (ed.) Group Representations in Probability and Statistics, Lecture Notes-Monograph Series, vol. 11. Institute of Mathematical Statistics Hayward, CA. https://projecteuclid.org/euclid.lnms/1215467407 (1988) · Zbl 0695.60012
[11] Dickey, JM; Jiang, JM; Kadane, JB, Bayesian methods for censored categorical data, J. Am. Stat. Assoc., 82, 773-781, (1987) · Zbl 0637.62029 · doi:10.1080/01621459.1987.10478498
[12] Dwork, C., Kumar, R., Naor, M., Sivakumar, D.: Rank aggregation methods for the web. In: Proceedings of the 10th International Conference on World Wide Web, pp. 613-622. ACM (2001)
[13] Ford, LRJ, Solution of a ranking problem from binary comparisons, Am. Math. Mon., 64, 28-33, (1957) · Zbl 0089.15304 · doi:10.2307/2308513
[14] Gordon, L, Successive sampling in large finite populations, Ann. Stat., 11, 702-706, (1983) · Zbl 0521.62008 · doi:10.1214/aos/1176346175
[15] Gormley, IC; Murphy, TB, Exploring voting blocs within the irish electorate: a mixture modeling approach, J. Am. Stat. Assoc., 103, 1014-1027, (2008) · Zbl 1205.62198 · doi:10.1198/016214507000001049
[16] Guiver, J., Snelson, E.: Bayesian inference for Plackett-Luce ranking models. In: Proceedings of the 23rd International Conference on Machine Learning, pp. 377-384. ACM, Pittsburgh (2009)
[17] Haberman, SJ, Product models for frequency tables involving indirect observation, Ann. Stat., 5, 1124-1147, (1977) · Zbl 0371.62150 · doi:10.1214/aos/1176344000
[18] Hankin, RKS, A generalization of the Dirichlet distribution, J. Stat. Softw., 33, 1-18, (2010) · doi:10.18637/jss.v033.i11
[19] Hartley, HO; Hocking, RR, The analysis of incomplete data, Biometrics, 27, 783-823, (1971) · doi:10.2307/2528820
[20] Hastie, T; Tibshirani, R, Classification by pairwise coupling, Ann. Stat., 26, 451-471, (1998) · Zbl 0932.62071 · doi:10.1214/aos/1028144844
[21] Heiser, WJ; Krzanowski, WJ (ed.), Convergent computing by iterative majorization: theory and applications in multidimensional data analysis, 157-189, (1995), Oxford
[22] Huang, TK; Weng, RC; Lin, CJ, Generalized Bradley-Terry models and multi-class probability estimates, J. Mach. Learn. Res., 7, 85-115, (2006) · Zbl 1222.68219
[23] Hunter, DR, MM algorithms for generalized Bradley-Terry models, Ann. Stat., 32, 384-406, (2004) · Zbl 1105.62359 · doi:10.1214/aos/1079120141
[24] Hunter, DR; Lange, K, A tutorial on MM algorithms, Am. Stat., 58, 30-37, (2004) · doi:10.1198/0003130042836
[25] Jech, T, The ranking of incomplete tournaments: a mathematician’s guide to popular sports, Am. Math. Mon., 90, 246-266, (1983) · Zbl 0519.05034 · doi:10.1080/00029890.1983.11971205
[26] Kernighan, B.W., Ritchie, D.M.: In: Ritchie, D.M. (ed.) The C Programming Language, 2nd edn. Prentice Hall Professional Technical Reference, Upper Saddle River (1988)
[27] Lagarias, J; Reeds, J; Wright, M; Wright, P, Convergence properties of the Nelder-Mead simplex method in low dimensions, SIAM J. Optim., 9, 112-147, (1998) · Zbl 1005.90056 · doi:10.1137/S1052623496303470
[28] Laird, N, Nonparametric maximum likelihood estimation of a mixing distribution, J. Am. Stat. Assoc., 73, 805-811, (1978) · Zbl 0391.62029 · doi:10.1080/01621459.1978.10480103
[29] Lange, K.: Optimization, 2nd edn. Springer, New York (2013) · Zbl 1273.90002 · doi:10.1007/978-1-4614-5838-8
[30] Lange, K; Zhou, H, MM algorithms for geometric and signomial programming, Math. Program., 143, 339-356, (2014) · Zbl 1286.90110 · doi:10.1007/s10107-012-0612-1
[31] Lange, K; Hunter, DR; Yang, I, Optimization transfer using surrogate objective functions, J. Comput. Graph. Stat., 9, 1-59, (2000)
[32] Loève, M.: Probability Theory I, 4th edn. Springer, New York (1977) · Zbl 0359.60001
[33] Loève, M.: Probability Theory II, 4th edn. Springer, New York (1978) · Zbl 0385.60001 · doi:10.1007/978-1-4612-6257-2
[34] Luce, R.D.: Individual Choice Behavior: A Theoretical Analysis. Wiley, New York (1959) · Zbl 0093.31708
[35] Luce, RD, The choice axiom after twenty years, J. Math. Psychol., 15, 215-223, (1977) · Zbl 0357.92033 · doi:10.1016/0022-2496(77)90032-3
[36] Marden, J.I.: Analyzing and Modeling Rank Data. Chapman & Hall/CRC, Boca Raton (1996) · Zbl 0853.62006
[37] MathWorks: Matlab documentation. URL https://www.mathworks.com/help/matlab/ref/profile.html (2017)
[38] McLachlan, G., Krishnan, T.: The EM Algorithm and Extensions, 2nd edn. Wiley, New York (2008) · Zbl 1165.62019 · doi:10.1002/9780470191613
[39] Nelder, JA; Mead, R, A simplex method for function minimization, Comput. J., 7, 308-313, (1965) · Zbl 0229.65053 · doi:10.1093/comjnl/7.4.308
[40] Ng, K.W., Tian, G.L., Tang, M.L.: Dirichlet and Related Distributions: Theory, Methods and Applications. Wiley, New York (2011) · Zbl 1234.60006 · doi:10.1002/9781119995784
[41] NVIDIA: CUDA Toolkit Documentation v8.0. URL http://docs.nvidia.com/cuda/index.html (2017)
[42] Pistone, G., Riccomagno, E., Wynn, H.P.: Algebraic Statistics: Computational Commutative Algebra in Statistics. Chapman & Hall/CRC, Boca Raton (2000) · Zbl 0960.62003 · doi:10.1201/9781420035766
[43] Plackett, RL, The analysis of permutations, Appl. Stat., 24, 193-202, (1975) · doi:10.2307/2346567
[44] Sattath, S; Tversky, A, Unite and conquer: a multiplicative inequality for choice probabilities, Econometrica, 44, 79-89, (1976) · Zbl 0319.62073 · doi:10.2307/1911382
[45] Suppes, P., Krantz, D.H., Luce, R.D., Tversky, A.: Foundations of Measurement: Geometrical, Threshold, and Probabilistic Representations. Academic Press, New York (1971) · Zbl 0719.03003
[46] Tanner, M.A.: Tools for Statistical Inference: Methods for the Exploration of Posterior Distributions and Likelihood Functions. Springer, New York (1996) · Zbl 0846.62001 · doi:10.1007/978-1-4612-4024-2
[47] Thurstone, LL, Psychophysical analysis, Am. J. Psychol., 38, 368-389, (1927) · doi:10.2307/1415006
[48] Turnbull, BW, The empirical distribution function with arbitrarily grouped, censored and truncated data, J. R. Stat. Soc. Ser. B (Methodol.), 38, 290-295, (1976) · Zbl 0343.62033
[49] Tversky, A, Elimination by aspects: a theory of choice, Psychol. Rev., 79, 281-299, (1972) · doi:10.1037/h0032955
[50] Wu, CFJ, On the convergence properties of the EM algorithm, Ann. Stat., 11, 95-103, (1983) · Zbl 0517.62035 · doi:10.1214/aos/1176346060
[51] Yan, T; Yang, Y; Xu, J, Sparse paired comparisons in the Bradley-Terry model, Statistica Sinica, 22, 1305-1318, (2012) · Zbl 1534.62104 · doi:10.5705/ss.2010.299
[52] Zermelo, E, Die berechnung der turnier-ergebnisse als ein maximumproblem der wahrscheinlichkeitsrechnung, Mathematische Zeitschrift, 29, 436-460, (1929) · JFM 54.0543.01 · doi:10.1007/BF01180541
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.