×

Using extension sets to aggregate partial rankings in a flexible setting. (English) Zbl 1410.91165

Summary: This paper deals with the rank aggregation problem in a general setting; in particular, we approach the problem for any kind of ranking: complete or incomplete and with or without ties. The underlying idea behind our approach is to take into account the so-called extension set of a ranking, that is, the set of permutations that are compatible with the given ranking. In this way, we aim to manage the uncertainty inherent to this problem, when not all the items are ranked and/or when some items are equally preferred. We develop two approaches: a general one based on this idea, and a constrained version in which the extension set is limited by not allowing non-ranked items to be placed in an existent bucket of tied items. We test our proposal by coupling it with two different algorithms. We formalize our approaches mathematically and also carry out an extensive experimental evaluation by using 22 datasets. The results show that the use of our extension sets-based approaches to compute the precedence matrix, clearly outperforms the standard way of computing the preference matrix by only using the information explicitly provided by the rankings in the sample.

MSC:

91B06 Decision theory
62F07 Statistical ranking and selection procedures
68T05 Learning and adaptive systems in artificial intelligence
Full Text: DOI

References:

[1] Ailon, N., Aggregation of partial rankings, \(p\)-ratings and top-\(m\) lists, Algorithmica, 57, 2, 284-300 (2010) · Zbl 1184.68627
[2] Ailon, N.; Charikar, M.; Newman, A., Aggregating inconsistent information: ranking and clustering, J. ACM, 55, 5, 23:1-23:27 (2008) · Zbl 1325.68102
[3] Aledo, J. A.; Gámez, J. A.; Molina, D., Tackling the rank aggregation problem with evolutionary algorithms, Appl. Math. Comput., 222, 632-644 (2013) · Zbl 1346.68151
[4] Aledo, J. A.; Gámez, J. A.; Molina, D., Using metaheuristic algorithms for parameter estimation in generalized mallows models., Appl. Soft Comput., 38, 308-320 (2016)
[5] Ali, A.; Meila, M., Experiments with Kemeny ranking: What works when?, Math. Soc. Sci., 64, 1, 28-40 (2012) · Zbl 1247.91041
[6] Arias, J.; Cózar, J., Exreport: fast, reliable and elegant reproducible research (2016), (http://exreport.jarias.es/)
[7] Bartholdi, J.; Tovey, C. A.; Trick, M. A., Voting schemes for which it can be difficult to tell who won the election, Soc. Choice Welf., 6, 2, 157-165 (1989) · Zbl 0672.90004
[8] Betzler, N.; Bredereck, R.; Niedermeier, R., Theoretical and empirical evaluation of data reduction for exact Kemeny rank aggregation, Auton. Agents Multiagent Syst., 28, 5, 721-748 (2014)
[9] Borda, J., Memoire sur les Elections au Scrutin (1781), Histoire de l’Academie Royal des Sciences
[10] Ceberio, J.; Irurozki, E.; Mendiburu, A.; Lozano, J. A., A distance-based ranking model estimation of distribution algorithm for the flowshop scheduling problem, IEEE Trans. Evol. Comput., 18, 2, 286-300 (2014)
[11] Cohen, W. W.; Schapire, R. E.; Singer, Y., Learning to order things, J. Artif. Intell. Res., 10, 243-270 (1999) · Zbl 0915.68031
[12] Coppersmith, D.; Fleischer, L.; Rudra, A., Ordering by weighted number of wins gives a good ranking for weighted tournaments, ACM Trans. Algorithms, 6, 3, 55:1-55:13 (2010) · Zbl 1300.05297
[13] Cheng, W.; Hühn, J.; Hüllermeier, E., Decision tree and instance-based learning for label ranking, in: Proceedings of the Twenty Sixth Annual International Conference on Machine Learning. in: Proceedings of the Twenty Sixth Annual International Conference on Machine Learning, ICML ’09, 161-168 (2009), ACM
[14] Conitzer, V.; Davenport, A.; Kalagnanam, J., Improved bounds for computing Kemeny rankings, Proceedings of the Twenty First National Conference on Artificial Intelligence, 620-626 (2006)
[15] Critchlow, D. E., Metric Methods for Analyzing Partially Ranked Data (1985), Springer · Zbl 0589.62041
[18] Emerson, P., The original Borda count and partial voting, Soc. Choice Welf., 40, 2, 353-358 (2013) · Zbl 1287.91050
[19] Fagin, R.; Kumar, R.; Sivakumar, D., Efficient similarity search and classification via rank aggregation, in: Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data. in: Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data, SIGMOD’03, 301-312 (2003), ACM
[20] Fagin, R.; Kumar, R.; Sivakumar, D., Comparing top k lists, SIAM J. Discret. Math., 17, 1, 134-160 (2003) · Zbl 1057.68075
[21] Fagin, R.; Kumar, R.; Mahdian, M.; Sivakumar, D.; Vee, E., Comparing and aggregating rankings with ties, Proceedings of the Symposium on Principles of Database Systems PODS, 47-58 (2004)
[22] Fligner, M. A.; Verducci, J. S., Distance based ranking models, J. R. Stat. Soc., 48, 3, 359-369 (1986) · Zbl 0658.62031
[23] Friedman, M., A comparison of alternative tests of significance for the problem of m rankings, Ann. Math. Stat., 11, 1, 86-92 (1940) · JFM 66.1305.08
[24] García, S.; Herrera, F., An extension on “statistical comparisons of classifiers over multiple data sets” for all pairwise comparisons, J. Mach. Learn. Res., 9, 2677-2694 (2008) · Zbl 1225.68178
[25] Hemaspaandra, E.; Spakowski, H.; Vogel, J., The complexity of Kemeny elections, Theor. Comput. Sci., 349, 3, 382-391 (2005) · Zbl 1086.68046
[26] Holm, S., A simple sequentially rejective multiple test procedure, Scand. J. Stat., 6, 65-70 (1979) · Zbl 0402.62058
[27] Huang, J., Probabilistic Reasoning and Learning on Permutations: Exploiting Structural Decompositions of the Symmetric Group (2011), Carnegie Mellon University, (Ph.D. thesis)
[28] Jackson, B. N.; Schnable, P. S.; Aluru, S., Consensus genetic maps as median orders from inconsistent sources, IEEE/ACM Trans. Comput. Biol. Bioinf., 5, 2, 161-171 (2008)
[29] Kamishima, T., Nantonac collaborative filtering: recommendation based on order responses, Proceedings of the Ninth International Conference on Knowledge Discovery and Data Mining, KDD-03, 583-588 (2003)
[30] Kaur, P.; Singh, M.; Josan, G. S., Comparative analysis of rank aggregation techniques for metasearch using genetic algorithm, Educ. Inf. Technol., 1-19 (2016), In press
[31] Kemeny, J. L.; Snell, J. G., Mathematical Models in the Social Sciences (1962), Blaisdell: Blaisdell New York · Zbl 0256.92003
[32] Kendall, M. G., A new measure of rank correlation, Biometrika, 30, 1/2, 81-93 (1938) · Zbl 0019.13001
[33] Kidwell, P.; Lebanon, G.; Cleveland, W. S., Visualizing incomplete and partially ranked data, IEEE Trans. Vis. Comput. Graph., 14, 6, 1356-1363 (2008)
[35] Meila, M.; Phadnis, K.; Patterson, A.; Bilmes, J. A., Consensus ranking under the exponential model, Proceedings of the Twenty-Third Conference on Uncertainty in Artificial Intelligence, UAI-07, 285-294 (2007)
[36] Napoles, G.; Dikopoulou, Z.; Papageorgiou, E.; Bello, R.; Vanhoof, K., Aggregation of partial rankings ? An approach based on the Kemeny ranking problem, Proceedings of Thirteenth International Work-Conference on Artificial Neural Networks, IWANN 2015, 343-355 (2015)
[37] Sarkar, C.; Cooley, S.; Srivastava, J., Robust feature selection technique using rank aggregation, Appl. Artif. Intell., 28, 3, 243-257 (2014)
[39] Schalekamp, F.; van Zuylen, A., Rank aggregation: together we’re strong, Proceedings of the Workshop on Algorithm Engineering and Experiments, ALENEX, 38-51 (2009) · Zbl 1430.68476
[40] Ukkonen, A., Visualizing sets of partial rankings, Proceedings of the Seventh International Conference on Intelligent Data Analysis, IDA 2007, 40-51 (2007)
[41] Ukkonen, A.; Puolamäki, K.; Gionis, A.; Mannila, H., A randomized approximation algorithm for computing bucket orders, Inf. Process. Lett., 109, 7, 356-359 (2009) · Zbl 1191.68874
[42] Ukkonen, A., Clustering algorithms for chains, J. Mach. Learn. Res., 12, 1389-1423 (2011) · Zbl 1280.68209
[43] Wilcoxon, F., Individual comparisons by ranking methods, Biometrics, 1, 80-83 (1945)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.