
Making sense of economics datasets with evolutionary coresets. (English) Zbl 1442.62729

Bucciarelli, Edgardo (ed.) et al., Decision economics: complexity of decisions and decisions for complexity. Papers based on the presentations at the international conference on decision economics, DECON 2019, Ávila, Spain, June 26–28, 2019. Cham: Springer. Adv. Intell. Syst. Comput. 1009, 162-170 (2020).
Summary: Machine learning agents learn to make decisions by extracting information from training data. When similar inferences can be obtained from a small subset of the training samples, that subset is called a coreset. Coreset discovery is an active line of research, since coresets can reduce training time and allow human experts to gain a better understanding of both the phenomenon and the resulting decisions by reducing the number of samples to be examined. For classification problems, the state of the art in coreset discovery is EvoCore, a multi-objective evolutionary algorithm. In this work, EvoCore is applied to both synthetic and real data sets, showing how coresets can help explain decisions taken by machine learning classifiers.
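The notion of evolutionary coreset discovery summarized above can be illustrated with a minimal sketch. Note that this is not the authors' EvoCore, which uses the multi-objective NSGA-II to trade off coreset size against accuracy; the toy loop below is a simplified single-objective (1+1)-style variant on made-up data, scalarizing the two objectives into one fitness value. It searches for a small subset of training points whose 1-nearest-neighbour predictions still match the full training set.

```python
# Simplified sketch of evolutionary coreset discovery (NOT the authors' EvoCore;
# the real method uses NSGA-II and standard classifiers as fitness evaluators).
import random

random.seed(0)

# Tiny synthetic 2-class dataset: two well-separated clusters on a line.
X = [(float(i), 0.0) for i in range(10)] + [(float(i) + 20.0, 0.0) for i in range(10)]
y = [0] * 10 + [1] * 10

def predict_1nn(coreset_idx, point):
    """Classify `point` by its nearest neighbour among the coreset samples."""
    best_j = min(coreset_idx,
                 key=lambda j: (X[j][0] - point[0]) ** 2 + (X[j][1] - point[1]) ** 2)
    return y[best_j]

def fitness(mask):
    """Scalarized objective: accuracy on the full set minus a size penalty."""
    idx = [i for i, bit in enumerate(mask) if bit]
    if not idx:
        return -1.0
    acc = sum(predict_1nn(idx, X[i]) == y[i] for i in range(len(X))) / len(X)
    return acc - 0.01 * len(idx)

def mutate(mask):
    """Flip one random bit: add or remove one sample from the candidate coreset."""
    child = mask[:]
    j = random.randrange(len(child))
    child[j] = 1 - child[j]
    return child

# (1+1)-style evolutionary loop: keep the mutated child if it is not worse.
best = [1] * len(X)  # start from the full training set
for _ in range(2000):
    child = mutate(best)
    if fitness(child) >= fitness(best):
        best = child

coreset = [i for i, bit in enumerate(best) if bit]
acc = sum(predict_1nn(coreset, X[i]) == y[i] for i in range(len(X))) / len(X)
print(len(coreset), acc)
```

On this toy problem the loop shrinks the coreset to roughly one representative per cluster while the 1-NN classifier trained on the coreset still reproduces the labels of the full training set; a true multi-objective approach would instead return the whole Pareto front of size/accuracy trade-offs.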
For the entire collection see [Zbl 1444.91005].

MSC:

62P20 Applications of statistics to economics
62H30 Classification and discrimination; cluster analysis (statistical aspects)
68T05 Learning and adaptive systems in artificial intelligence
Full Text: DOI
