×

Learning context-dependent choice functions. (English) Zbl 07460573

Summary: Choice functions accept a set of alternatives as input and produce a preferred subset of these alternatives as output. We study the problem of learning such functions under conditions of context-dependence of preferences, which means that the preference in favor of a certain choice alternative may depend on what other options are also available. In spite of its practical relevance, this kind of context-dependence has received little attention in preference learning so far. We propose a suitable model based on context-dependent (latent) utility functions, thereby reducing the problem to the task of learning such utility functions. Practically, this comes with a number of challenges. For example, the set of alternatives provided as input to a choice function can be of any size, and the output of the function should not depend on the order in which the alternatives are presented. To meet these requirements, we propose two general approaches based on two representations of context-dependent utility functions, as well as instantiations in the form of appropriate end-to-end trainable neural network architectures. Moreover, to demonstrate the performance of both networks, we present extensive empirical evaluations on both synthetic and real-world datasets.

MSC:

68T37 Reasoning under uncertainty in the context of artificial intelligence

References:

[1] Aggarwal, C. C.; Yu, P. S., Outlier detection for high dimensional data, (ACM Sigmod Record (2001), ACM), 37-46
[2] Ai, Q.; Bi, K.; Guo, J.; Croft, W. B., Learning a deep listwise context model for ranking refinement, (SIGIR (2018), ACM), 135-144
[3] Ai, Q.; Wang, X.; Bruch, S.; Golbandi, N.; Bendersky, M.; Najork, M., Learning groupwise multivariate scoring functions using deep neural networks, (ICTIR (2019), ACM), 85-92
[4] Alvarez, P. A.; Ishizaka, A.; Martínez, L., Multiple-criteria decision-making sorting methods: a survey, Expert Syst. Appl., 183, Article 115368 pp. (2021)
[5] Ambrus, A.; Rozen, K., Rationalising choice with multi-self models, Econ. J., 125, 1136-1156 (2014)
[6] Arrow, K. J., Social Choice and Individual Values (1951), John Wiley & Sons · Zbl 0984.91513
[7] Batsell, R. R.; Polking, J. C., A new class of market share models, Mark. Sci., 4, 177-198 (1985)
[8] Battaglia, P. W.; Hamrick, J. B.; Bapst, V.; Sanchez-Gonzalez, A.; Zambaldi, V. F.; Malinowski, M.; Tacchetti, A.; Raposo, D.; Santoro, A.; Faulkner, R.; Gülçehre, Ç.; Song, F.; Ballard, A. J.; Gilmer, J.; Dahl, G. E.; Vaswani, A.; Allen, K.; Nash, C.; Langston, V.; Dyer, C.; Heess, N.; Wierstra, D.; Kohli, P.; Botvinick, M.; Vinyals, O.; Li, Y.; Pascanu, R., Relational inductive biases, deep learning, and graph networks (2018), CoRR
[9] Ben-Akiva, M.; Lerman, S. R., Discrete Choice Analysis: Theory and Application to Travel Demand, Vol. 9 (1985), MIT Press
[10] Benson, A. R.; Kumar, R.; Tomkins, A., A discrete choice model for subset selection, (Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining (2018), ACM), 37-45
[11] Bentham, J., An Introduction to the Principles of Morals and Legislation (1789), T. Payne, and Son: T. Payne, and Son London
[12] Berkson, J., Application of the logistic function to bio-assay, J. Am. Stat. Assoc., 39, 357-365 (1944)
[13] Bettman, J. R.; Luce, M. F.; Payne, J. W., Constructive consumer choice processes, J. Consum. Res., 25, 187-217 (1998)
[14] Bischl, B.; Kerschke, P.; Kotthoff, L.; Lindauer, M.; Malitsky, Y.; Fréchette, A.; Hoos, H.; Hutter, F.; Leyton-Brown, K.; Tierney, K.; Vanschoren, J., ASlib: a benchmark library for algorithm selection, Artif. Intell., 237, 41-58 (2016) · Zbl 1357.68202
[15] Bower, A.; Balzano, L., Preference modeling with context-dependent salient features, (ICML (2020), PMLR), 1067-1077
[16] Bradley, R. A.; Terry, M. E., Rank analysis of incomplete block designs: I. The method of paired comparisons, Biometrika, 39, 324-345 (1952) · Zbl 0047.12903
[17] Bringmann, K.; Friedrich, T., Approximating the volume of unions and intersections of high-dimensional geometric objects, Comput. Geom., 43, 601-610 (2010) · Zbl 1206.65072
[18] Burges, C. J.C.; Shaked, T.; Renshaw, E.; Lazier, A.; Deeds, M.; Hamilton, N.; Hullender, G. N., Learning to rank using gradient descent, (ICML (2005), ACM), 89-96
[19] Chakravarti, D.; Lynch, J. G., A framework for exploring context effects on consumer judgment and choice, Adv. Consum. Res., 10, 289-297 (1983)
[20] Chandola, V.; Banerjee, A.; Kumar, V., Anomaly detection: a survey, ACM Comput. Surv., 41, 15 (2009)
[21] Chen, S.; Joachims, T., Modeling intransitivity in matchup and comparison data, (WSDM (2016), ACM), 227-236
[22] Chollet, F., Keras (2017)
[23] Cox, D. R., Some procedures connected with the logistic qualitative response curve, (Research Papers in Statistics Festschrift for J. Neuman (1966)), 55-71 · Zbl 0158.17808
[24] Debreu, G., Representation of a preference ordering by a numerical function, Organ. Behav. Hum. Decis. Process., 3, 159-165 (1954) · Zbl 0058.13803
[25] Debreu, G., Theory of Value: An Axiomatic Analysis of Economic Equilibrium (1959), Yale University Press · Zbl 0193.20205
[26] Debreu, G., Review of R. D. Luce, individual choice behavior: a theoretical analysis, Am. Econ. Rev., 50, 186-188 (1960)
[27] Dembczynski, K.; Waegeman, W.; Cheng, W.; Hüllermeier, E., On label dependence and loss minimization in multi-label classification, Mach. Learn., 88, 5-45 (2012) · Zbl 1243.68237
[28] Diamond, J.; Evans, W., The correction for guessing, Rev. Educ. Res., 43, 181-191 (1973)
[29] Dogan, Ü.; Glasmachers, T.; Igel, C., A unified view on multi-class support vector classification, J. Mach. Learn. Res., 17, 1-32 (2016) · Zbl 1360.68669
[30] Domshlak, C.; Hüllermeier, E.; Kaci, S.; Prade, H., Preferences in AI: an overview, Artif. Intell., 175, 1037-1052 (2011)
[31] Doyle, J. R.; O’Connor, D. J.; Reynolds, G. M.; Bottomley, P. A., The robustness of the asymmetrically dominated effect: buying frames, phantom alternatives, and in-store purchases, Psychol. Mark., 16, 225-243 (1999)
[32] Duchi, J.; Hazan, E.; Singer, Y., Adaptive subgradient methods for online learning and stochastic optimization, J. Mach. Learn. Res., 12, 2121-2159 (2011) · Zbl 1280.68164
[33] Evgeniou, T.; Boussios, C.; Zacharia, G., Generalized robust conjoint estimation, Mark. Sci., 24, 415-429 (2005)
[34] Expedia, Expedia hotel recommendations (2016)
[35] Fahandar, M. A.; Hüllermeier, E.; Couso, I., Statistical inference for incomplete ranking data: the case of rank-dependent coarsening, (ICMLICML (2017), PMLR), 1078-1087
[36] Fawcett, T., An introduction to ROC analysis, Pattern Recognit. Lett., 27, 861-874 (2006)
[37] Fechner, G. T., Elemente der Psychophysik, Vol. 2 (1860), Breitkopf u. Härtel
[38] Fligner, M. A.; Verducci, J. S., Distance based ranking models, J. R. Stat. Soc., Ser. B, Methodol., 48, 359-369 (1986) · Zbl 0658.62031
[39] Fligner, M. A.; Verducci, J. S., Multistage ranking models, J. Am. Stat. Assoc., 83, 892-901 (1988) · Zbl 0719.62036
[40] Fudenberg, D.; Levine, D. K., A dual-self model of impulse control, Am. Econ. Rev., 96, 1449-1476 (2006)
[41] (Fürnkranz, J.; Hüllermeier, E., Preference Learning (2010), Springer) · Zbl 1201.68006
[42] Geilen, M.; Basten, T.; Theelen, B.; Otten, R., An algebra of Pareto points, Fundam. Inform., 78, 35-74 (2007) · Zbl 1135.90397
[43] Glorot, X.; Bengio, Y., Understanding the difficulty of training deep feedforward neural networks, (AISTATS (2010), JMLR), 249-256
[44] Goodfellow, I.; Bengio, Y.; Courville, A., Deep Learning (2016), MIT Press · Zbl 1373.68009
[45] Grabisch, M.; Marichal, J.; Mesiar, R.; Pap, E., Aggregation Functions (2009), Cambridge University Press · Zbl 1196.00002
[46] Green, J.; Hojman, D., Choice, Rationality and Welfare Measurement (2007), Harvard University, KSG Faculty Research Working Paper Series RWP07-054
[47] Gurland, J.; Lee, I.; Dahm, P. A., Polychotomous quantal response in biological assay, Biometrics, 16, 382-398 (1960) · Zbl 0119.15703
[48] Harper, F. M.; Konstan, J. A., The MovieLens datasets: history and context, ACM Trans. Interact. Intell. Syst., 5 (2015)
[49] He, K.; Zhang, X.; Ren, S.; Sun, J., Delving deep into rectifiers: surpassing human-level performance on ImageNet classification, (ICCV (2015), IEEE Computer Society), 1026-1034
[50] Head, T.; MechCoder; Louppe, G.; Shcherbatyi, I.; fcharras; Vinícius, Z.; cmmalone; Schröder, C.; nel215; Campos, N.; Young, T.; Cereda, S.; Fan, T.; rene-rex; Shi, K. K.; Schwabedal, J.; carlosdanielcsantos; Hvass-Labs; Pak, M.; SoManyUsernamesTaken; Callaway, F.; Estève, L.; Besson, L.; Cherti, M.; Pfannschmidt, K.; Linzberger, F.; Cauet, C.; Gut, A.; Mueller, A.; Fabisch, A., scikit-optimize/scikit-optimize, v0.5.2 (2018)
[51] Houthakker, H. S., Revealed preference and the utility function, Economica, 17, 159-174 (1950)
[52] Huber, J.; Payne, J. W.; Puto, C., Adding asymmetrically dominated alternatives: violations of regularity and the similarity hypothesis, J. Consum. Res., 9, 90-98 (1982)
[53] Huber, J.; Puto, C., Market boundaries and product choice: illustrating attraction and substitution effects, J. Consum. Res., 10, 31-44 (1983)
[54] Ioffe, S.; Szegedy, C., Batch normalization: accelerating deep network training by reducing internal covariate shift, (ICML (2015), JMLR), 448-456
[55] Kalai, G.; Rubinstein, A.; Spiegler, R., Rationalizing choice functions by multiple rationales, Econometrica, 70, 2481-2488 (2002) · Zbl 1130.91326
[56] Kamakura, W. A.; Srivastava, R. K., Predicting choice shares under conditions of brand interdependence, J. Mark. Res., 21, 420-434 (1984)
[57] Kamishima, T., Nantonac collaborative filtering: recommendation based on order responses, (KDD (2003), ACM), 583-588
[58] Kamishima, T.; Kazawa, H.; Akaho, S., A survey and empirical comparison of object ranking methods, (Preference Learning (2010), Springer), 181-201 · Zbl 1213.68495
[59] Kelly, J. S.; Hall, M., Impossibility results with resoluteness, Econ. Lett., 34, 15-19 (1990) · Zbl 0715.90012
[60] Kelman, M.; Rottenstreich, Y.; Tversky, A., Context-dependence in legal decision making, J. Leg. Stud., 25, 287-318 (1996)
[61] Kivetz, R.; Netzer, O.; Srinivasan, V., Alternative models for capturing the compromise effect, J. Mark. Res., 41, 237-257 (2004)
[62] Klambauer, G.; Unterthiner, T.; Mayr, A.; Hochreiter, S., Self-normalizing neural networks, (NIPS, Curran Associates Inc (2017)), 972-981
[63] Kleinberg, J. M.; Mullainathan, S.; Ugander, J., Comparison-based choices, (EC (2017), ACM), 127-144
[64] Koyejo, O.; Natarajan, N.; Ravikumar, P.; Dhillon, I. S., Consistent multilabel classification, (NIPS (2015), MIT Press), 3321-3329
[65] Van der Laan, M.; Pollard, K.; Bryan, J., A new partitioning around medoids algorithm, J. Stat. Comput. Simul., 73, 575-584 (2003) · Zbl 1054.62075
[66] LeCun, Y.; Cortes, C.; Burges, C. J.C., The MNIST database of handwritten digits (2010)
[67] Lewis, D. D., Evaluating and optimizing autonomous text classification systems, (SIGIR (1995), ACM Press), 246-254
[68] Luce, R. D., Individual Choice Behavior (1959), John Wiley · Zbl 0093.31708
[69] Maldonado, S.; Montoya, R.; Weber, R., Advanced conjoint analysis using feature selection via support vector machines, Eur. J. Oper. Res., 241, 564-574 (2015) · Zbl 1339.91077
[70] Mallows, C. L., Non-null ranking models. I, Biometrika, 44, 114-130 (1957) · Zbl 0087.34001
[71] Mantel, N., Models for complex contingency tables and polychotomous dosage response curves, Biometrics, 22, 83-95 (1966)
[72] Manzini, P.; Mariotti, M., Sequentially rationalizable choice, Am. Econ. Rev., 97, 1824-1839 (2007)
[73] Markowitz, H., Portfolio selection, J. Finance, 7, 77-91 (1952)
[74] May, K. O., Intransitivity, utility, and the aggregation of preference patterns, Econometrica, 22, 1-13 (1954)
[75] McClish, D. K., Analyzing a portion of the ROC curve, Med. Decis. Mak., 9, 190-195 (1989)
[76] McFadden, D., Conditional logit analysis of qualitative choice behavior, (Frontiers in Econometrics (1974), Academic Press), 105-142
[77] McFadden, D.; Train, K., Mixed MNL models for discrete response, J. Appl. Econom., 15, 447-470 (2000)
[78] Mellers, B. A.; Birnbaum, M. H., Contextual effects in social judgment, J. Exp. Soc. Psychol., 19, 157-171 (1983)
[79] Moore, R.; DeNero, J., L1 and L2 regularization for multiclass hinge loss models, (MLSLP (2011)), 1-5
[80] Nesterov, Y., A method of solving a convex programming problem with convergence rate \(\mathcal{O}(1 / k^2)\), (Soviet Mathematics Doklady (1983)), 372-376 · Zbl 0535.90071
[81] von Neumann, J.; Morgenstern, O., Theory of Games and Economic Behavior (1944), Princeton University Press · Zbl 0063.05930
[82] Orhun, A. Y., Optimal product line design when consumers exhibit choice set-dependent preferences, Mark. Sci., 28, 868-886 (2009)
[83] Ozkes, A. I.; Sanver, M. R., Anonymous, neutral, and resolute social choice revisited, Soc. Choice Welf., 57, 97-113 (2021) · Zbl 1479.91115
[84] Payne, J. W.; Bettman, J. R.; Johnson, E. J., Behavioral decision research: a constructive processing perspective, Annu. Rev. Psychol., 43, 87-131 (1992)
[85] Payne, J. W.; Bettman, J. R.; Schkade, D. A.; Schwarz, N.; Gregory, R., Measuring constructed preferences: towards a building code, (Elicitation of Preferences (1999), Springer), 243-275 · Zbl 0942.91026
[86] Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; Vanderplas, J.; Passos, A.; Cournapeau, D.; Brucher, M.; Perrot, M.; Duchesnay, E., Scikit-learn: machine learning in python, J. Mach. Learn. Res., 12, 2825-2830 (2011) · Zbl 1280.68189
[87] Pfannschmidt, K.; Gupta, P.; Hüllermeier, E., Deep architectures for learning context-dependent ranking functions (2018), CoRR
[88] Plackett, R. L., The analysis of permutations, J. R. Stat. Soc., Ser. C, Appl. Stat., 24, 193-202 (1975)
[89] Powers, D. M., Recall & precision versus the bookmaker, (ICCS (2003), University of New South Wales), 529-534
[90] Powers, D. M., Evaluation: from precision, recall and f-measure to ROC, informedness, markedness & correlation, J. Mach. Learn. Technol., 2, 37-63 (2011)
[91] Qin, T.; Liu, T., Introducing LETOR 4.0 datasets (2013), CoRR
[92] Ravanbakhsh, S.; Schneider, J. G.; Póczos, B., Equivariance through parameter-sharing, (ICML (2017), PMLR), 2892-2901
[93] Rice, J. R., The algorithm selection problem, (Advances in Computers. Advances in Computers, Advances in Computers, vol. 15 (1976), Elsevier), 65-118
[94] Rieskamp, J.; Busemeyer, J. R.; Mellers, B. A., Extending the bounds of rationality: evidence and theories of preferential choice, J. Econ. Lit., 44, 631-661 (2006)
[95] Rigutini, L.; Papini, T.; Maggini, M.; Scarselli, F., SortNet: learning to rank by a neural preference function, IEEE Trans. Neural Netw., 22, 1368-1380 (2011)
[96] Rooderkerk, R. P.; Van Heerde, H. J.; Bijmolt, T. H., Incorporating context effects into a choice model, J. Mark. Res., 48, 767-780 (2011)
[97] Rosenfeld, N.; Oshiba, K.; Singer, Y., Predicting choice with set-dependent aggregation, (ICML (2020), PMLR), 8220-8229
[98] Russell, S. J.; Norvig, P., Artificial Intelligence: A Modern Approach (2020), Pearson
[99] Salvatier, J.; Wiecki, T. V.; Fonnesbeck, C., Probabilistic programming in Python using PyMC3, PeerJ. Comput. Sci., 2, e55 (2016)
[100] Samuelson, P. A., A note on the pure theory of consumer’s behaviour, Economica, 5, 61-71 (1938)
[101] Sedikides, C.; Ariely, D.; Olsen, N., Contextual and procedural determinants of partner selection: of asymmetric dominance and prominence, Social Cogn., 17, 118-139 (1999)
[102] Sen, A. K., Choice functions and revealed preference, Rev. Econ. Stud., 38, 307-317 (1971) · Zbl 0237.90004
[103] Seshadri, A.; Peysakhovich, A.; Ugander, J., Discovering context effects from raw choice data, (ICML (2019), PMLR), 5660-5669
[104] Shafir, E.; Simonson, I.; Tversky, A., Reason-based choice, Cognition, 49, 11-36 (1993)
[105] Simonson, I., Choice based on reasons: the case of attraction and compromise effects, J. Consum. Res., 16, 158-174 (1989)
[106] Simonson, I.; Tversky, A., Choice in context: tradeoff contrast and extremeness aversion, J. Mark. Res., 29, 281-295 (1992)
[107] Smith, G., Tagging: People-Powered Metadata for the Social Web. New Riders (2007)
[108] Stanley, R. P., Enumerative Combinatorics, Vol. 1 (2011), Cambridge University Press · Zbl 1247.05003
[109] Tesauro, G., Connectionist learning of expert preferences by comparison training, (NIPS (1989), Morgan Kaufmann Publishers Inc.), 99-106
[110] Theil, H., A multinomial extension of the linear logit model, Int. Econ. Rev., 10, 251-259 (1969)
[111] Tomlinson, K.; Benson, A., Choice set optimization under discrete choice models of group decisions, (ICML (2020), PMLR), 9514-9525
[112] Train, K. E., Discrete Choice Methods with Simulation (2009), Cambridge University Press · Zbl 1269.62073
[113] TREC, TREC 2007 million query track (2007)
[114] TREC, TREC 2008 million query track (2008)
[115] Tversky, A., Intransitivity of preferences, Psychol. Rev., 76, 31 (1969)
[116] Tversky, A., Elimination by aspects: a theory of choice, Psychol. Rev., 79, 281 (1972)
[117] Tversky, A.; Simonson, I., Context-dependent preferences, Manag. Sci., 39, 1179-1189 (1993) · Zbl 0800.90037
[118] Vig, J.; Sen, S.; Riedl, J., Navigating the tag genome, (IUI (2011), ACM), 93-102
[119] Vig, J.; Sen, S.; Riedl, J., The tag genome: encoding community knowledge to support novel interaction, ACM Trans. Interact. Intell. Syst., 2, 13 (2012)
[120] Vojnovic, M.; Yun, S. Y., On the Team Selection Problem (2016), Microsoft Research, Technical Report MSR-TR-2016-7
[121] Waegeman, W.; Dembczynski, K.; Jachnik, A.; Cheng, W.; Hüllermeier, E., On the Bayes-optimality of F-measure maximizers, J. Mach. Learn. Res., 15, 3333-3388 (2014)
[122] Wen, C. H.; Koppelman, F. S., The generalized nested logit model, Transp. Res., Part B, Methodol., 35, 627-641 (2001)
[123] Williams, H. C.W. L., On the formation of travel demand models and economic evaluation measures of user benefit, Environ. Plan. A, Econ. Space, 9, 285-344 (1977)
[124] Ye, N.; Chai, K. M.A.; Lee, W. S.; Chieu, H. L., Optimizing F-measure: a tale of two approaches, (ICML (2012), icml.cc / Omnipress), 1555-1562
[125] Yu, L.; Sun, B., Four types of typical discrete choice models: which are you using?, (Proceedings of 2012 IEEE International Conference on Service Operations and Logistics, and Informatics (2012)), 298-301
[126] Zaheer, M.; Kottur, S.; Ravanbakhsh, S.; Póczos, B.; Salakhutdinov, R. R.; Smola, A. J., Deep sets, (NIPS (2017), Curran Associates, Inc.), 3394-3404
[127] Zhang, Q.; Couloigner, I., A new and efficient K-medoid algorithm for spatial clustering, (ICCSA (2005), Springer-Verlag), 181-189
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.