-
Metaheuristics "In the Large"
Authors:
Jerry Swan,
Steven Adriaensen,
Alexander E. I. Brownlee,
Kevin Hammond,
Colin G. Johnson,
Ahmed Kheiri,
Faustyna Krawiec,
J. J. Merelo,
Leandro L. Minku,
Ender Özcan,
Gisele L. Pappa,
Pablo García-Sánchez,
Kenneth Sörensen,
Stefan Voß,
Markus Wagner,
David R. White
Abstract:
Following decades of sustained improvement, metaheuristics are one of the great success stories of optimization research. However, in order for research in metaheuristics to avoid fragmentation and a lack of reproducibility, there is a pressing need for stronger scientific and computational infrastructure to support the development, analysis and comparison of new approaches. We argue that, via pri…
▽ More
Following decades of sustained improvement, metaheuristics are one of the great success stories of optimization research. However, in order for research in metaheuristics to avoid fragmentation and a lack of reproducibility, there is a pressing need for stronger scientific and computational infrastructure to support the development, analysis and comparison of new approaches. We argue that, via principled choice of infrastructure support, the field can pursue a higher level of scientific enquiry. We describe our vision and report on progress, showing how the adoption of common protocols for all metaheuristics can help liberate the potential of the field, easing the exploration of the design space of metaheuristics.
△ Less
Submitted 3 June, 2021; v1 submitted 19 November, 2020;
originally announced November 2020.
-
An Extensive Experimental Evaluation of Automated Machine Learning Methods for Recommending Classification Algorithms (Extended Version)
Authors:
Márcio P. Basgalupp,
Rodrigo C. Barros,
Alex G. C. de Sá,
Gisele L. Pappa,
Rafael G. Mantovani,
André C. P. L. F. de Carvalho,
Alex A. Freitas
Abstract:
This paper presents an experimental comparison among four Automated Machine Learning (AutoML) methods for recommending the best classification algorithm for a given input dataset. Three of these methods are based on Evolutionary Algorithms (EAs), and the other is Auto-WEKA, a well-known AutoML method based on the Combined Algorithm Selection and Hyper-parameter optimisation (CASH) approach. The EA…
▽ More
This paper presents an experimental comparison among four Automated Machine Learning (AutoML) methods for recommending the best classification algorithm for a given input dataset. Three of these methods are based on Evolutionary Algorithms (EAs), and the other is Auto-WEKA, a well-known AutoML method based on the Combined Algorithm Selection and Hyper-parameter optimisation (CASH) approach. The EA-based methods build classification algorithms from a single machine learning paradigm: either decision-tree induction, rule induction, or Bayesian network classification. Auto-WEKA combines algorithm selection and hyper-parameter optimisation to recommend classification algorithms from multiple paradigms. We performed controlled experiments where these four AutoML methods were given the same runtime limit for different values of this limit. In general, the difference in predictive accuracy of the three best AutoML methods was not statistically significant. However, the EA evolving decision-tree induction algorithms has the advantage of producing algorithms that generate interpretable classification models and that are more scalable to large datasets, by comparison with many algorithms from other learning paradigms that can be recommended by Auto-WEKA. We also observed that Auto-WEKA has shown meta-overfitting, a form of overfitting at the meta-learning level, rather than at the base-learning level.
△ Less
Submitted 15 September, 2020;
originally announced September 2020.
-
Neural Architecture Search in Graph Neural Networks
Authors:
Matheus Nunes,
Gisele L. Pappa
Abstract:
Performing analytical tasks over graph data has become increasingly interesting due to the ubiquity and large availability of relational information. However, unlike images or sentences, there is no notion of sequence in networks. Nodes (and edges) follow no absolute order, and it is hard for traditional machine learning (ML) algorithms to recognize a pattern and generalize their predictions on th…
▽ More
Performing analytical tasks over graph data has become increasingly interesting due to the ubiquity and large availability of relational information. However, unlike images or sentences, there is no notion of sequence in networks. Nodes (and edges) follow no absolute order, and it is hard for traditional machine learning (ML) algorithms to recognize a pattern and generalize their predictions on this type of data. Graph Neural Networks (GNN) successfully tackled this problem. They became popular after the generalization of the convolution concept to the graph domain. However, they possess a large number of hyperparameters and their design and optimization is currently hand-made, based on heuristics or empirical intuition. Neural Architecture Search (NAS) methods appear as an interesting solution to this problem. In this direction, this paper compares two NAS methods for optimizing GNN: one based on reinforcement learning and a second based on evolutionary algorithms. Results consider 7 datasets over two search spaces and show that both methods obtain similar accuracies to a random search, raising the question of how many of the search space dimensions are actually relevant to the problem.
△ Less
Submitted 31 July, 2020;
originally announced August 2020.
-
A Robust Experimental Evaluation of Automated Multi-Label Classification Methods
Authors:
Alex G. C. de Sá,
Cristiano G. Pimenta,
Gisele L. Pappa,
Alex A. Freitas
Abstract:
Automated Machine Learning (AutoML) has emerged to deal with the selection and configuration of algorithms for a given learning task. With the progression of AutoML, several effective methods were introduced, especially for traditional classification and regression problems. Apart from the AutoML success, several issues remain open. One issue, in particular, is the lack of ability of AutoML method…
▽ More
Automated Machine Learning (AutoML) has emerged to deal with the selection and configuration of algorithms for a given learning task. With the progression of AutoML, several effective methods were introduced, especially for traditional classification and regression problems. Apart from the AutoML success, several issues remain open. One issue, in particular, is the lack of ability of AutoML methods to deal with different types of data. Based on this scenario, this paper approaches AutoML for multi-label classification (MLC) problems. In MLC, each example can be simultaneously associated to several class labels, unlike the standard classification task, where an example is associated to just one class label. In this work, we provide a general comparison of five automated multi-label classification methods -- two evolutionary methods, one Bayesian optimization method, one random search and one greedy search -- on 14 datasets and three designed search spaces. Overall, we observe that the most prominent method is the one based on a canonical grammar-based genetic programming (GGP) search method, namely Auto-MEKA$_{GGP}$. Auto-MEKA$_{GGP}$ presented the best average results in our comparison and was statistically better than all the other methods in different search spaces and evaluated measures, except when compared to the greedy search method.
△ Less
Submitted 31 July, 2020; v1 submitted 16 May, 2020;
originally announced May 2020.
-
Multi-label classification search space in the MEKA software
Authors:
Alex G. C. de Sá,
Cristiano G. Pimenta,
Gisele L. Pappa,
Alex A. Freitas
Abstract:
This supplementary material aims to describe the proposed multi-label classification (MLC) search spaces based on the MEKA and WEKA softwares. First, we overview 26 MLC algorithms and meta-algorithms in MEKA, presenting their main characteristics, such as hyper-parameters, dependencies and constraints. Second, we review 28 single-label classification (SLC) algorithms, preprocessing algorithms and…
▽ More
This supplementary material aims to describe the proposed multi-label classification (MLC) search spaces based on the MEKA and WEKA softwares. First, we overview 26 MLC algorithms and meta-algorithms in MEKA, presenting their main characteristics, such as hyper-parameters, dependencies and constraints. Second, we review 28 single-label classification (SLC) algorithms, preprocessing algorithms and meta-algorithms in the WEKA software. These SLC algorithms were also studied because they are part of the proposed MLC search spaces. Fundamentally, this occurs due to the problem transformation nature of several MLC algorithms used in this work. These algorithms transform an MLC problem into one or several SLC problems in the first place and solve them with SLC model(s) in a next step. Therefore, understanding their main characteristics is crucial to this work. Finally, we present a formal description of the search spaces by proposing a context-free grammar that encompasses the 54 learning algorithms. This grammar basically comprehends the possible combinations, the constraints and dependencies among the learning algorithms.
△ Less
Submitted 31 July, 2020; v1 submitted 27 November, 2018;
originally announced November 2018.
-
Analysing Symbolic Regression Benchmarks under a Meta-Learning Approach
Authors:
Luiz Otavio Vilas Boas Oliveira,
Joao Francisco Barreto da Silva Martins,
Luis Fernando Miranda,
Gisele Lobo Pappa
Abstract:
The definition of a concise and effective testbed for Genetic Programming (GP) is a recurrent matter in the research community. This paper takes a new step in this direction, proposing a different approach to measure the quality of the symbolic regression benchmarks quantitatively. The proposed approach is based on meta-learning and uses a set of dataset meta-features---such as the number of examp…
▽ More
The definition of a concise and effective testbed for Genetic Programming (GP) is a recurrent matter in the research community. This paper takes a new step in this direction, proposing a different approach to measure the quality of the symbolic regression benchmarks quantitatively. The proposed approach is based on meta-learning and uses a set of dataset meta-features---such as the number of examples or output skewness---to describe the datasets. Our idea is to correlate these meta-features with the errors obtained by a GP method. These meta-features define a space of benchmarks that should, ideally, have datasets (points) covering different regions of the space. An initial analysis of 63 datasets showed that current benchmarks are concentrated in a small region of this benchmark space. We also found out that number of instances and output skewness are the most relevant meta-features to GP output error. Both conclusions can help define which datasets should compose an effective testbed for symbolic regression methods.
△ Less
Submitted 25 May, 2018;
originally announced May 2018.
-
Solving the Exponential Growth of Symbolic Regression Trees in Geometric Semantic Genetic Programming
Authors:
Joao Francisco B. S. Martins,
Luiz Otavio V. B. Oliveira,
Luis F. Miranda,
Felipe Casadei,
Gisele L. Pappa
Abstract:
Advances in Geometric Semantic Genetic Programming (GSGP) have shown that this variant of Genetic Programming (GP) reaches better results than its predecessor for supervised machine learning problems, particularly in the task of symbolic regression. However, by construction, the geometric semantic crossover operator generates individuals that grow exponentially with the number of generations, resu…
▽ More
Advances in Geometric Semantic Genetic Programming (GSGP) have shown that this variant of Genetic Programming (GP) reaches better results than its predecessor for supervised machine learning problems, particularly in the task of symbolic regression. However, by construction, the geometric semantic crossover operator generates individuals that grow exponentially with the number of generations, resulting in solutions with limited use. This paper presents a new method for individual simplification named GSGP with Reduced trees (GSGP-Red). GSGP-Red works by expanding the functions generated by the geometric semantic operators. The resulting expanded function is guaranteed to be a linear combination that, in a second step, has its repeated structures and respective coefficients aggregated. Experiments in 12 real-world datasets show that it is not only possible to create smaller and completely equivalent individuals in competitive computational time, but also to reduce the number of nodes composing them by 58 orders of magnitude, on average.
△ Less
Submitted 18 April, 2018;
originally announced April 2018.
-
How Noisy Data Affects Geometric Semantic Genetic Programming
Authors:
Luis F. Miranda,
Luiz Otavio V. B. Oliveira,
Joao Francisco B. S. Martins,
Gisele L. Pappa
Abstract:
Noise is a consequence of acquiring and pre-processing data from the environment, and shows fluctuations from different sources---e.g., from sensors, signal processing technology or even human error. As a machine learning technique, Genetic Programming (GP) is not immune to this problem, which the field has frequently addressed. Recently, Geometric Semantic Genetic Programming (GSGP), a semantic-a…
▽ More
Noise is a consequence of acquiring and pre-processing data from the environment, and shows fluctuations from different sources---e.g., from sensors, signal processing technology or even human error. As a machine learning technique, Genetic Programming (GP) is not immune to this problem, which the field has frequently addressed. Recently, Geometric Semantic Genetic Programming (GSGP), a semantic-aware branch of GP, has shown robustness and high generalization capability. Researchers believe these characteristics may be associated with a lower sensibility to noisy data. However, there is no systematic study on this matter. This paper performs a deep analysis of the GSGP performance over the presence of noise. Using 15 synthetic datasets where noise can be controlled, we added different ratios of noise to the data and compared the results obtained with those of a canonical GP. The results show that, as we increase the percentage of noisy instances, the generalization performance degradation is more pronounced in GSGP than GP. However, in general, GSGP is more robust to noise than GP in the presence of up to 10% of noise, and presents no statistical difference for values higher than that in the test bed.
△ Less
Submitted 4 July, 2017;
originally announced July 2017.
-
Enhancement of Epidemiological Models for Dengue Fever Based on Twitter Data
Authors:
Julio Albinati,
Wagner Meira Jr.,
Gisele L. Pappa,
Mauro Teixeira,
Cecilia Marques-Toledo
Abstract:
Epidemiological early warning systems for dengue fever rely on up-to-date epidemiological data to forecast future incidence. However, epidemiological data typically requires time to be available, due to the application of time-consuming laboratorial tests. This implies that epidemiological models need to issue predictions with larger antecedence, making their task even more difficult. On the other…
▽ More
Epidemiological early warning systems for dengue fever rely on up-to-date epidemiological data to forecast future incidence. However, epidemiological data typically requires time to be available, due to the application of time-consuming laboratorial tests. This implies that epidemiological models need to issue predictions with larger antecedence, making their task even more difficult. On the other hand, online platforms, such as Twitter or Google, allow us to obtain samples of users' interaction in near real-time and can be used as sensors to monitor current incidence. In this work, we propose a framework to exploit online data sources to mitigate the lack of up-to-date epidemiological data by obtaining estimates of current incidence, which are then explored by traditional epidemiological models. We show that the proposed framework obtains more accurate predictions than alternative approaches, with statistically better results for delays greater or equal to 4 weeks.
△ Less
Submitted 22 May, 2017;
originally announced May 2017.
-
Selective Harvesting over Networks
Authors:
Fabricio Murai,
Diogo Rennó,
Bruno Ribeiro,
Gisele L. Pappa,
Don Towsley,
Krista Gile
Abstract:
Active search (AS) on graphs focuses on collecting certain labeled nodes (targets) given global knowledge of the network topology and its edge weights under a query budget. However, in most networks, nodes, topology and edge weights are all initially unknown. We introduce selective harvesting, a variant of AS where the next node to be queried must be chosen among the neighbors of the current queri…
▽ More
Active search (AS) on graphs focuses on collecting certain labeled nodes (targets) given global knowledge of the network topology and its edge weights under a query budget. However, in most networks, nodes, topology and edge weights are all initially unknown. We introduce selective harvesting, a variant of AS where the next node to be queried must be chosen among the neighbors of the current queried node set; the available training data for deciding which node to query is restricted to the subgraph induced by the queried set (and their node attributes) and their neighbors (without any node or edge attributes). Therefore, selective harvesting is a sequential decision problem, where we must decide which node to query at each step. A classifier trained in this scenario suffers from a tunnel vision effect: without recourse to independent sampling, the urge to query promising nodes forces classifiers to gather increasingly biased training data, which we show significantly hurts the performance of AS methods and standard classifiers. We find that it is possible to collect a much larger set of targets by using multiple classifiers, not by combining their predictions as an ensemble, but switching between classifiers used at each step, as a way to ease the tunnel vision effect. We discover that switching classifiers collects more targets by (a) diversifying the training data and (b) broadening the choices of nodes that can be queried next. This highlights an exploration, exploitation, and diversification trade-off in our problem that goes beyond the exploration and exploitation duality found in classic sequential decision problems. From these observations we propose D3TS, a method based on multi-armed bandits for non-stationary stochastic processes that enforces classifier diversity, matching or exceeding the performance of competing methods on seven real network datasets in our evaluation.
△ Less
Submitted 15 March, 2017;
originally announced March 2017.
-
A Warm Welcome Matters! The Link Between Social Feedback and Weight Loss in /r/loseit
Authors:
Tiago O. Cunha,
Ingmar Weber,
Gisele L. Pappa
Abstract:
Social feedback has long been recognized as an important element of successful health-related behavior change. However, most of the existing studies look at the effect that offline social feedback has. This paper fills gaps in the literature by proposing a framework to study the causal effect that receiving social support in the form of comments in an online weight loss community has on (i) the pr…
▽ More
Social feedback has long been recognized as an important element of successful health-related behavior change. However, most of the existing studies look at the effect that offline social feedback has. This paper fills gaps in the literature by proposing a framework to study the causal effect that receiving social support in the form of comments in an online weight loss community has on (i) the probability of the user to return to the forum, and, more importantly, on (ii) the weight loss reported by the user. Using a matching approach for causal inference we observe a difference of 9 lbs lost between users who do or do not receive comments. Surprisingly, this effect is mediated by neither an increase in lifetime in the community nor by an increased activity level of the user. Our results show the importance that a "warm welcome" has when using online support forums to achieve health outcomes.
△ Less
Submitted 18 January, 2017;
originally announced January 2017.
-
An Accurate Gaussian Process-Based Early Warning System for Dengue Fever
Authors:
Julio Albinati,
Wagner Meira Jr,
Gisele Lobo Pappa
Abstract:
Dengue fever is a mosquito-borne disease present in all Brazilian territory. Brazilian government, however, lacks an accurate early warning system to quickly predict future dengue outbreaks. Such system would help health authorities to plan their actions and to reduce the impact of the disease in the country. However, most attempts to model dengue fever use parametric models which enforce a specif…
▽ More
Dengue fever is a mosquito-borne disease present in all Brazilian territory. Brazilian government, however, lacks an accurate early warning system to quickly predict future dengue outbreaks. Such system would help health authorities to plan their actions and to reduce the impact of the disease in the country. However, most attempts to model dengue fever use parametric models which enforce a specific expected behaviour and fail to capture the inherent complexity of dengue dynamics. Therefore, we propose a new Bayesian non-parametric model based on Gaussian processes to design an accurate and flexible model that outperforms previous/standard techniques and can be incorporated into an early warning system, specially at cities from Southeast and Center-West regions. The model also helps understanding dengue dynamics in Brazil through the analysis of the covariance functions generated.
△ Less
Submitted 10 August, 2016;
originally announced August 2016.
-
The Effect of Social Feedback in a Reddit Weight Loss Community
Authors:
Tiago O. Cunha,
Ingmar Weber,
Hamed Haddadi,
Gisele L. Pappa
Abstract:
It is generally accepted as common wisdom that receiving social feedback is helpful to (i) keep an individual engaged with a community and to (ii) facilitate an individual's positive behavior change. However, quantitative data on the effect of social feedback on continued engagement in an online health community is scarce. In this work we apply Mahalanobis Distance Matching (MDM) to demonstrate th…
▽ More
It is generally accepted as common wisdom that receiving social feedback is helpful to (i) keep an individual engaged with a community and to (ii) facilitate an individual's positive behavior change. However, quantitative data on the effect of social feedback on continued engagement in an online health community is scarce. In this work we apply Mahalanobis Distance Matching (MDM) to demonstrate the importance of receiving feedback in the "loseit" weight loss community on Reddit. Concretely we show that (i) even when correcting for differences in word choice, users receiving more positive feedback on their initial post are more likely to return in the future, and that (ii) there are diminishing returns and social feedback on later posts is less important than for the first post. We also give a description of the type of initial posts that are more likely to attract this valuable social feedback. Though we cannot yet argue about ultimate weight loss success or failure, we believe that understanding the social dynamics underlying online health communities is an important step to devise more effective interventions.
△ Less
Submitted 25 February, 2016;
originally announced February 2016.