×

Sampling and learning Mallows and generalized Mallows models under the Cayley distance. (English) Zbl 1433.62047

Summary: The Mallows and Generalized Mallows models are compact yet powerful and natural ways of representing a probability distribution over the space of permutations. In this paper, we deal with the problems of sampling and learning such distributions when the metric on permutations is the Cayley distance. We propose new methods for both operations, and their performance is shown through several experiments. An application in the field of biology is given to motivate the interest of this model.

MSC:

62E10 Characterization and structure theory of statistical distributions
62P10 Applications of statistics to biology and medical sciences; meta analysis
62D05 Sampling theory, sample surveys
92B15 General biostatistics

Software:

PerMallows

References:

[1] Arora R, Meila M (2013) Consensus ranking with signed permutations. In: Artificial Intelligence and Statistics (AISTATS), volume 31 of JMLR Proceedings, pp 117-125 · Zbl 1225.62067
[2] Arratia R, Barbour AD, Tavaré S (2003) Logarithmic combinatorial structures: a probabilistic approach. EMS Monographs in Mathematics. European Mathematical Society (EMS) Zürich · Zbl 1040.60001
[3] Bayer D, Diaconis P (1992) Trailing the dovetail shuffle to its lair. Ann Appl Probab 2(2):294-313 · Zbl 0757.60003 · doi:10.1214/aoap/1177005705
[4] Betzler N, Guo J, Komusiewicz C, Niedermeier R (2011) Average parameterization and partial kernelization for computing medians. J Comput Syst Sci 77(4):774-789 · Zbl 1215.68107 · doi:10.1016/j.jcss.2010.07.005
[5] Blei DM, Thomas LG, Jordan MI (2010) The nested chinese restaurant process and bayesian nonparametric inference of topic hierarchies. J ACM 57(2) · Zbl 1327.68187
[6] Bourque G, Pevzner PA (2002) Genome-scale evolution: Reconstructing gene orders in the ancestral species. Genome Res 1(12):26-36
[7] Caragiannis I, Procaccia AD, Shah N (2013) When do noisy votes reveal the Truth? In: Proceedings of the Fourteenth ACM Conference on Electronic Commerce, EC ’13, pages 143-160, New York, NY, USA. ACM · Zbl 0757.60003
[8] Ceberio J, Mendiburu A, Lozano JA (2011) Introducing The Mallows Model on Estimation of Distribution Algorithms. In: International Conference on Neural Information Processing (ICONIP), number 23-25, Shanghai
[9] Cheng W, Hüllermeier E (2009) A New Instance-Based Label Ranking Approach Using the Mallows Model. In: Advances in Neural Networks (ISNN), volume 5551 of Lecture Notes in Computer Science. Springer-Verlag, pp 707-716 · Zbl 1215.68107
[10] Cheng W, Hullermeier E (2009) A Simple Instance-Based Approach to Multilabel Classification Using the Mallows Model. In: Workshop Proceedings of Learning from Multi-Label Data. Bled, Slovenia, pp 28-38
[11] Critchlow DE (1988) Ulam’s metric. In: Encyclopedia of Statistical Sciences, vol 9, pp 379-380 · Zbl 1429.62258
[12] Critchlow DE, Fligner MA, Verducci JS (1991) Probability models on rankings. J Math Psychol 35:294-318 · Zbl 0741.62024 · doi:10.1016/0022-2496(91)90050-4
[13] D’Elia A, Piccolo D (2005) A mixture model for preferences data analysis. Comput Stat Data Anal 49(3):917-934 · Zbl 1429.62077 · doi:10.1016/j.csda.2004.06.012
[14] Deza M, Huang T (1998) Metrics on permutations, a survey. J Combinatorics Inf Syst Sci 23:173-185 · Zbl 1217.05038
[15] Diaconis P (1988) Group representations in probability and statistics Institute of Mathematical Statistics · Zbl 0695.60012
[16] Diaconis P (2008) The Markov chain Monte Carlo revolution. Bull Am Math Soc 46(2):179-205 · Zbl 1168.60032 · doi:10.1090/S0273-0979-08-01238-X
[17] Diaconis P, Holmes S (2002) A Bayesian Peek into Feller Volume I. Sankhya: The Indian Journal of Statistics, Series A (1961-2002), 64(3) · Zbl 1192.60003
[18] Diaconis P, Ram A (2000) Analysis of systematic scan Metropolis algorithms using Iwahori-Hecke algebra techniques. Mich Math J 48(1):157-190 · Zbl 0998.60069 · doi:10.1307/mmj/1030132713
[19] Donnelly P (1986) Partition structures, Polya urns, the Ewens sampling formula, and the ages of alleles. Theor Popul Biol 30(2):271-288 · Zbl 0608.92005 · doi:10.1016/0040-5809(86)90037-7
[20] Evans SN, Grubel R, Wakolbinger A (2012) Trickle-down processes and their boundaries. Electronic Journal of Probability, 17 · Zbl 1246.60100
[21] Warren JE (1972) The sampling theory of selectively neutral alleles. Theor Popul Biol 3(1):87-112 · Zbl 0245.92009 · doi:10.1016/0040-5809(72)90035-4
[22] Feller W (1968) An Introduction to Probability Theory and Its Applications, vol 1. Wiley, New York · Zbl 0155.23101
[23] Féray V (2013) Asymptotics of some statistics in Ewens random permutations. Electron J Probab 18(76):1-32 · Zbl 1314.05016
[24] Fligner MA, Verducci JS (1986) Distance based ranking models. J R Stat Soc 48(3):359-369 · Zbl 0658.62031
[25] Fligner MA, Verducci JS (1988) Multistage ranking models. J Am Stat Assoc 83(403):892-901 · Zbl 0719.62036 · doi:10.1080/01621459.1988.10478679
[26] Fligner MA, Verducci JS (1993) Probability models and statistical analyses for ranking data Springer-Verlag · Zbl 0754.00011
[27] Gnedin A, Olshanski G (2012) The two-sided infinite extension of the Mallows model for random permutations. Adv Appl Math 48(5):615-639 · Zbl 1242.60010 · doi:10.1016/j.aam.2012.01.001
[28] Gupta J, Damien P (2002) Conjugacy class prior distributions on metric-based ranking models. J R Stat Soc B 64(3):433-445 · Zbl 1090.62021 · doi:10.1111/1467-9868.00343
[29] Huang J, Guestrin C, Guibas L (2007) Efficient Inference for Distributions on Permutations. Representations, pages 1-8
[30] Lebanon G, Cranking JL (2002) Combining rankings using conditional probability models on permutations. In: International Conference on Machine Learning (ICML), pp 363-370
[31] Tyler L, Boutilier C (2011) Learning mallows models with pairwise preferences · Zbl 1312.68171
[32] Duncan Luce R (1959) Individual choice behavior. Wiley, New York · Zbl 1191.91003
[33] Mallows CL (1957) Non-null ranking models. Biometrika 44(1-2):114-130 · Zbl 0087.34001 · doi:10.1093/biomet/44.1-2.114
[34] Mandhani B, Meila M (2009) Tractable search for learning exponential models of rankings. J Mach Learn Res 5:392-399
[35] Yi M, Lebanon G (2008) Non-Parametric Modeling of partially ranked data. J Mach Learn Res 9:2401-2429 · Zbl 1225.62067
[36] McCullagh P (2011) Random Permutations and Partition Models. In: International Encyclopedia of Statistical Science. Springer-Verlag, pp 1170-1177 · Zbl 1124.68021
[37] Meila M, Bao L (2008) Estimation and Clustering with Infinite Rankings. In: Uncertainty in Artificial Intelligence (UAI). AUAI Press, Corvallis, Oregon, pp 393-402 · Zbl 0658.62031
[38] Meila M, Chen H (2010) Dirichlet process mixtures of generalized mallows models. In: Uncertainty in Artificial Intelligence (UAI), pp 285-294 · Zbl 1429.62258
[39] Mladenović N, Hansen P (1997) Variable neighborhood search. Comput Oper Res 24(11):1097-1100 · Zbl 0889.90119 · doi:10.1016/S0305-0548(97)00031-2
[40] Murphy TB, Martin D (2003) Mixtures of distance-based models for ranking data. Comput Stat Data Anal 41(3-4):645-655 · Zbl 1429.62258 · doi:10.1016/S0167-9473(02)00165-2
[41] Plackett RL (1975) The analysis of permutations. J R Stat Soc 24(10):193-202
[42] Popov V (2007) Multiple genome rearrangement by swaps and by element duplications. Theor Comput Sci 385(1-3):115-126 · Zbl 1124.68021 · doi:10.1016/j.tcs.2007.05.029
[43] Paige ER, Rockmore DN (2011) A Mallows model for Coxeter groups and buildings. PhD thesis, Datmouth college. Hanover, New Hampshire
[44] Schadrm M (ed.) (1991) Analyzing and modeling data and knowledge Chapman & Hall
[45] Starr S (2009) Thermodynamic limit for the Mallows model on S_n. J Math Phys 50(9):95208 · Zbl 1241.82039 · doi:10.1063/1.3156746
[46] Tavaré S, Ewens WJ (1997) Multivariate Ewens distribution, chapter 41, pages 232-246 John Wiley & Sons
[47] Thurstone LL (1927) A law of comparative judgment. Psychol Rev 34(4):273-286 · doi:10.1037/h0070288
[48] Ziegler A, Christiansen E, Kriegman D, Belongie S (2012) Locally uniform comparison image descriptor. In: Advances in Neural Information Processing Systems (NIPS), pp 1-9
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.