Johannes P. Dürholt, Thomas S. Asche, Johanna Kleinekorte, Gabriel Mancino-Ball, Benjamin Schiller, Simon Sung, Julian Keupp, Aaron Osburg, Toby Boyne, Ruth Misener, Rosona Eldred, Wagner Steuer Costa, Chrysoula Kappatou, Robert M. Lee, Dominik Linzner, David Walz, Niklas Wulkow, Behrang Shafei
Our open-source Python package BoFire combines Bayesian Optimization (BO) with other design of experiments (DoE) strategies focusing on developing and optimizing new chemistry. Previous BO implementations, for example as they exist in the literature or software, require substantial adaptation for effective real-world deployment in the chemical industry. BoFire provides a rich feature-set with extensive configurability and realizes our vision of fast-tracking research contributions into industrial use via maintainable open-source software. Owing to quality-of-life features like JSON-serializability of problem formulations, BoFire enables seamless integration of BO into RESTful APIs, a common architecture component for both self-driving laboratories and human-in-the-loop setups. This paper discusses the differences between BoFire and other BO implementations and outlines ways that BO research needs to be adapted for real-world use in a chemistry setting.
Since graph neural networks (GNNs) are often vulnerable to attack, we need to know when we can trust them. We develop a computationally effective approach towards providing robust certificates for message-passing neural networks (MPNNs) using a Rectified Linear Unit (ReLU) activation function. Because our work builds on mixed-integer optimization, it encodes a wide variety of subproblems, for example it admits (i) both adding and removing edges, (ii) both global and local budgets, and (iii) both topological perturbations and feature modifications. Our key technology, topology-based bounds tightening, uses graph structure to tighten bounds. We also experiment with aggressive bounds tightening to dynamically change the optimization constraints by tightening variable bounds. To demonstrate the effectiveness of these strategies, we implement an extension to the open-source branch-and-cut solver SCIP. We test on both node and graph classification problems and consider topological attacks that both add and remove edges.
There has been a surge in interest in data-driven experimental design with applications to chemical engineering and drug manufacturing. Bayesian optimization (BO) has proven to be adaptable to such cases, since we can model the reactions of interest as expensive black-box functions. Sometimes, the cost of these black-box functions can be separated into two parts: (a) the cost of the experiment itself, and (b) the cost of changing the input parameters. In this short paper, we extend the SnAKe algorithm to deal with both types of costs simultaneously. We further propose extensions to the case of a maximum allowable input change, as well as to the multi-objective setting.
May 17 2023
math.OC arXiv:2305.09420v2
Optimization over trained machine learning models has applications including: verification, minimizing neural acquisition functions, and integrating a trained surrogate into a larger decision-making problem. This paper formulates and solves optimization problems constrained by trained graph neural networks (GNNs). To circumvent the symmetry issue caused by graph isomorphism, we propose two types of symmetry-breaking constraints: one indexing a node 0 and one indexing the remaining nodes by lexicographically ordering their neighbor sets. To guarantee that adding these constraints will not remove all symmetric solutions, we construct a graph indexing algorithm and prove that the resulting graph indexing satisfies the proposed symmetry-breaking constraints. For the classical GNN architectures considered in this paper, optimizing over a GNN with a fixed graph is equivalent to optimizing over a dense neural network. Thus, we study the case where the input graph is not fixed, implying that each edge is a decision variable, and develop two mixed-integer optimization formulations. To test our symmetry-breaking strategies and optimization formulations, we consider an application in molecular design.
Tree ensembles can be well-suited for black-box optimization tasks such as algorithm tuning and neural architecture search, as they achieve good predictive performance with little or no manual tuning, naturally handle discrete feature spaces, and are relatively insensitive to outliers in the training data. Two well-known challenges in using tree ensembles for black-box optimization are (i) effectively quantifying model uncertainty for exploration and (ii) optimizing over the piece-wise constant acquisition function. To address both points simultaneously, we propose using the kernel interpretation of tree ensembles as a Gaussian Process prior to obtain model variance estimates, and we develop a compatible optimization formulation for the acquisition function. The latter further allows us to seamlessly integrate known constraints to improve sampling efficiency by considering domain-knowledge in engineering settings and modeling search space symmetries, e.g., hierarchical relationships in neural architecture search. Our framework performs as well as state-of-the-art methods for unconstrained black-box optimization over continuous/discrete features and outperforms competing methods for problems combining mixed-variable feature spaces and known input constraints.
We develop a class of mixed-integer formulations for disjunctive constraints intermediate to the big-M and convex hull formulations in terms of relaxation strength. The main idea is to capture the best of both the big-M and convex hull formulations: a computationally light formulation with a tight relaxation. The "P-split" formulations are based on a lifted transformation that splits convex additively separable constraints into P partitions and forms the convex hull of the linearized and partitioned disjunction. The "P-split" formulations are derived for disjunctive constraints with convex constraints within each disjunct, and we generalize the results for the case with nonconvex constraints within the disjuncts. We analyze the continuous relaxation of the P-split formulations and show that, under certain assumptions, the formulations form a hierarchy starting from a big-M equivalent and converging to the convex hull. The goal of the P-split formulations is to form strong approximations of the convex hull through a computationally simpler formulation. We computationally compare the P-split formulations against big-M and convex hull formulations on 344 test instances. The test problems include K-means clustering, semi-supervised clustering, P_ball problems, and optimization over trained ReLU neural networks. The computational results show promising potential of the P-split formulations. For many of the test problems, P-split formulations are solved with a similar number of explored nodes as the convex hull formulation, while reducing the solution time by an order of magnitude and outperforming big-M both in time and number of explored nodes.
The optimization and machine learning toolkit (OMLT) is an open-source software package incorporating neural network and gradient-boosted tree surrogate models, which have been trained using machine learning, into larger optimization problems. We discuss the advances in optimization technology that made OMLT possible and show how OMLT seamlessly integrates with the algebraic modeling language Pyomo. We demonstrate how to use OMLT for solving decision-making problems in both computer science and engineering.
Bayesian Optimization is a very effective tool for optimizing expensive black-box functions. Inspired by applications developing and characterizing reaction chemistry using droplet microfluidic reactors, we consider a novel setting where the expense of evaluating the function can increase significantly when making large input changes between iterations. We further assume we are working asynchronously, meaning we have to select new queries before evaluating previous experiments. This paper investigates the problem and introduces 'Sequential Bayesian Optimization via Adaptive Connecting Samples' (SnAKe), which provides a solution by considering large batches of queries and preemptively building optimization paths that minimize input costs. We investigate some convergence properties and empirically show that the algorithm is able to achieve regret similar to classical Bayesian Optimization algorithms in both synchronous and asynchronous settings, while reducing input costs significantly. We show the method is robust to the choice of its single hyper-parameter and provide a parameter-free alternative.
It is well-documented how artificial intelligence can have (and already is having) a big impact on chemical engineering. But classical machine learning approaches may be weak for many chemical engineering applications. This review discusses how challenging data characteristics arise in chemical engineering applications. We identify four characteristics of data arising in chemical engineering applications that make applying classical artificial intelligence approaches difficult: (1) high variance, low volume data, (2) low variance, high volume data, (3) noisy/corrupt/missing data, and (4) restricted data with physics-based limitations. For each of these four data characteristics, we discuss applications where these data characteristics arise and show how current chemical engineering research is extending the fields of data science and machine learning to incorporate these challenges. Finally, we identify several challenges for future research.
Energy systems optimization problems are complex due to strongly non-linear system behavior and multiple competing objectives, e.g. economic gain vs. environmental impact. Moreover, a large number of input variables and different variable types, e.g. continuous and categorical, are challenges commonly present in real-world applications. In some cases, proposed optimal solutions need to obey explicit input constraints related to physical properties or safety-critical operating conditions. This paper proposes a novel data-driven strategy using tree ensembles for constrained multi-objective optimization of black-box problems with heterogeneous variable spaces for which underlying system dynamics are either too complex to model or unknown. In an extensive case study comprised of synthetic benchmarks and relevant energy applications we demonstrate the competitive performance and sampling efficiency of the proposed algorithm compared to other state-of-the-art tools, making it a useful all-in-one solution for real-world applications with limited evaluation budgets.
May 19 2021
math.OC arXiv:2105.08598v1
This paper introduces ROmodel, an open source Python package extending the modeling capabilities of the algebraic modeling language Pyomo to robust optimization problems. ROmodel helps practitioners transition from deterministic to robust optimization through modeling objects which allow formulating robust models in close analogy to their mathematical formulation. ROmodel contains a library of commonly used uncertainty sets which can be generated using their matrix representations, but it also allows users to define custom uncertainty sets using Pyomo constraints. ROmodel supports adjustable variables via linear decision rules. The resulting models can be solved using ROmodel's solvers, which implement both the robust reformulation and cutting plane approach. ROmodel is a platform to implement and compare custom uncertainty sets and reformulations. We demonstrate ROmodel's capabilities by applying it to six case studies. We implement custom uncertainty sets based on (warped) Gaussian processes to show how ROmodel can integrate data-driven models with optimization.
May 06 2021
math.OC arXiv:2105.01687v1
This paper presents a Python library to model pooling problems, a class of network flow problems with many engineering applications. The library automatically generates a mixed-integer quadratically-constrained quadratic optimization problem from a given network structure. The library additionally uses the network structure to build 1) a convex linear relaxation of the non-convex quadratic program and 2) a mixed-integer linear restriction of the problem. We integrate the pooling network library with galini, an open-source extensible global solver for quadratic optimization. We demonstrate galini's extensible characteristics by using the pooling library to develop two galini plug-ins: 1) a cut generator plug-in that adds valid inequalities in the galini cut loop and 2) a primal heuristic plug-in that uses the mixed-integer linear restriction. We test galini on large scale pooling problems and show that, thanks to the good upper bound provided by the mixed-integer linear restriction and the good lower bounds provided by the convex relaxation, we obtain optimality gaps that are competitive with Gurobi 9.1 on the largest problem instances.
This paper introduces a class of mixed-integer formulations for trained ReLU neural networks. The approach balances model size and tightness by partitioning node inputs into a number of groups and forming the convex hull over the partitions via disjunctive programming. At one extreme, one partition per input recovers the convex hull of a node, i.e., the tightest possible formulation for each node. For fewer partitions, we develop smaller relaxations that approximate the convex hull, and show that they outperform existing formulations. Specifically, we propose strategies for partitioning variables based on theoretical motivations and validate these strategies using extensive computational experiments. Furthermore, the proposed scheme complements known algorithmic approaches, e.g., optimization-based bound tightening captures dependencies within a partition.
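For orientation, the block below writes out the standard big-M encoding of a single ReLU node, a common reference point when comparing the size and tightness of mixed-integer formulations. It is a sketch under stated assumptions and is not taken from the abstract: the symbols w, b, L, U and the binary variable \sigma are introduced here for illustration, and the abstract does not name which existing formulations it compares against.

```latex
% Single ReLU node y = max(0, w^T x + b).
% Assumption: valid bounds L <= w^T x + b <= U with L < 0 < U are known.
% sigma = 1 means the node is active (y = w^T x + b); sigma = 0 means y = 0.
\begin{align}
  y &\ge w^\top x + b, \\
  y &\ge 0, \\
  y &\le w^\top x + b - L\,(1 - \sigma), \\
  y &\le U\,\sigma, \\
  \sigma &\in \{0, 1\}.
\end{align}
```

With \sigma = 1 the first and third constraints force y = w^\top x + b, and with \sigma = 0 the second and fourth force y = 0. The tightness of the continuous relaxation depends on how close L and U are to the true range of the pre-activation.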
This work develops a class of relaxations in between the big-M and convex hull formulations of disjunctions, drawing advantages from both. The proposed "P-split" formulations split convex additively separable constraints into P partitions and form the convex hull of the partitioned disjuncts. Parameter P represents the trade-off of model size vs. relaxation strength. We examine the novel formulations and prove that, under certain assumptions, the relaxations form a hierarchy starting from a big-M equivalent and converging to the convex hull. We computationally compare the proposed formulations to big-M and convex hull formulations on a test set including: K-means clustering, P_ball problems, and ReLU neural networks. The computational results show that the intermediate P-split formulations can form strong outer approximations of the convex hull with fewer variables and constraints than the extended convex hull formulations, giving significant computational advantages over both the big-M and convex hull.
Jun 16 2020
math.OC arXiv:2006.08222v1
Optimization problems with uncertain black-box constraints, modeled by warped Gaussian processes, have recently been considered in the Bayesian optimization setting. This work introduces a new class of constraints in which the same black-box function occurs multiple times evaluated at different domain points. Such constraints are important in applications where, e.g., safety-critical measures are aggregated over multiple time periods. Our approach, which uses robust optimization, reformulates these uncertain constraints into deterministic constraints guaranteed to be satisfied with a specified probability, i.e., deterministic approximations to a chance constraint. This approach extends robust optimization methods from parametric uncertainty to uncertain functions modeled by warped Gaussian processes. We analyze convexity conditions and propose a custom global optimization strategy for non-convex cases. A case study derived from production planning and an industrially relevant example from oil well drilling show that the approach effectively mitigates uncertainty in the learned curves. For the drill scheduling example, we develop a custom strategy for globally optimizing integer decisions.
Gradient boosted trees and other regression tree models perform well in a wide range of real-world, industrial applications. These tree models (i) offer insight into important prediction features, (ii) effectively manage sparse data, and (iii) have excellent prediction capabilities. Despite their advantages, they are generally unpopular for decision-making tasks and black-box optimization, which is due to their difficult-to-optimize structure and the lack of a reliable uncertainty measure. ENTMOOT is our new framework for integrating (already trained) tree models into larger optimization problems. The contributions of ENTMOOT include: (i) explicitly introducing a reliable uncertainty measure that is compatible with tree models, (ii) solving the larger optimization problems that incorporate these uncertainty-aware tree models, and (iii) proving that the solutions are globally optimal, i.e. no better solution exists. In particular, we show how the ENTMOOT approach allows a simple integration of tree models into decision-making and black-box optimization, where it proves to be a strong competitor to commonly-used frameworks.
Motivated by mail delivery scheduling problems arising in Royal Mail, we study a generalization of the fundamental makespan scheduling P||Cmax problem which we call the bounded job start scheduling problem. Given a set of jobs, each specified by an integer processing time p_j, that have to be executed non-preemptively by a set of m parallel identical machines, the objective is to compute a minimum makespan schedule subject to an upper bound g<=m on the number of jobs that may simultaneously begin per unit of time. With perfect input knowledge, we show that the Longest Processing Time First (LPT) algorithm is tightly 2-approximate. After proving that the problem is strongly NP-hard even when g=1, we elaborate on improving the 2-approximation ratio for this case. We distinguish the classes of long and short instances satisfying p_j>=m and p_j<m, respectively, for each job j. We show that LPT is 5/3-approximate for the former and optimal for the latter. Then, we explore scheduling long jobs in parallel with short jobs to obtain tightly satisfied packing and bounded job start constraints. For a broad family of instances excluding degenerate instances with many very long jobs, we derive a 1.985-approximation ratio. For general instances, we require machine augmentation to obtain better than 2-approximate schedules. Under uncertain job processing times, we exploit machine augmentation and lexicographic optimization to propose a two-stage robust optimization approach for bounded job start scheduling under uncertainty, aiming for a low number of used machines. Given a collection of schedules of makespan <= D, this approach allows distinguishing which are the more robust. We substantiate both the heuristics and our recovery approach numerically using Royal Mail data. We show that, for the Royal Mail application, machine augmentation, i.e. short-term van rental, is especially relevant.
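The bounded job start setting described above can be illustrated with a small greedy sketch. This is one plausible reading of LPT under a job-start bound, not the paper's exact algorithm: the function name `lpt_bounded_start` and its return format are hypothetical, and time is assumed to advance in integer steps starting at 0.

```python
import heapq


def lpt_bounded_start(processing_times, m, g):
    """Greedy LPT sketch: jobs are taken longest first, each is placed on the
    machine that becomes idle earliest, and its start is delayed to the next
    integer time step at which fewer than g jobs have started."""
    if not 1 <= g <= m:
        raise ValueError("expected 1 <= g <= m")
    jobs = sorted(processing_times, reverse=True)  # longest processing time first
    free_at = [0] * m                              # min-heap of machine idle times
    heapq.heapify(free_at)
    starts_at = {}                                 # time step -> number of job starts
    schedule = []                                  # (start time, processing time) pairs
    makespan = 0
    for p in jobs:
        t = heapq.heappop(free_at)                 # earliest idle machine
        while starts_at.get(t, 0) >= g:            # this step already has g starts
            t += 1
        starts_at[t] = starts_at.get(t, 0) + 1
        heapq.heappush(free_at, t + p)             # machine is busy until t + p
        schedule.append((t, p))
        makespan = max(makespan, t + p)
    return makespan, schedule


if __name__ == "__main__":
    # Illustrative data only: five jobs, two machines.
    for g in (1, 2):
        print(g, lpt_bounded_start([5, 4, 3, 3, 2], m=2, g=g))
```

Tightening the bound from g=2 to g=1 can only delay job starts in this sketch, so the makespan never decreases as g shrinks on a fixed instance.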
Designing and analyzing algorithms with provable performance guarantees enables efficient optimization problem solving in different application domains, e.g. communication networks, transportation, economics, and manufacturing. Despite the significant contributions of approximation algorithms in engineering, only limited and isolated works contribute from this perspective in process systems engineering. The current paper discusses three representative, NP-hard problems in process systems engineering: (i) pooling, (ii) process scheduling, and (iii) heat exchanger network synthesis. We survey relevant results and raise major open questions. Further, we present applications of approximation algorithms which are relevant to process systems engineering: (i) better mathematical modeling, (ii) problem classification, (iii) designing solution methods, and (iv) dealing with uncertainty. This paper aims to motivate further research at the intersection of approximation algorithms and process systems engineering.
Jun 19 2019
math.OC arXiv:1906.07612v1
The pooling problem has applications, e.g., in petrochemical refining, water networks, and supply chains and is widely studied in global optimization. To date, it has largely been treated deterministically, neglecting the influence of parametric uncertainty. This paper applies two robust optimization approaches, reformulation and cutting planes, to the non-linear, non-convex pooling problem. Most applications of robust optimization have been either convex or mixed-integer linear problems. We explore the suitability of robust optimization in the context of global optimization problems which are concave in the uncertain parameters by considering the pooling problem with uncertain inlet concentrations. We compare the computational efficiency of reformulation and cutting plane approaches for three commonly-used uncertainty set geometries on 14 pooling problem instances and demonstrate how accounting for uncertainty changes the optimal solution.
Oct 23 2018
math.OC arXiv:1810.09289v2
In chemical and manufacturing processes, unit failures due to equipment degradation can lead to process downtime and significant costs. In this context, finding an optimal maintenance strategy to ensure good unit health while avoiding excessive expensive maintenance activities is highly relevant. We propose a practical approach for the integrated optimization of production and maintenance capable of incorporating uncertain sensor data regarding equipment degradation. To this end, we integrate data-driven stochastic degradation models from Condition-based Maintenance into a process level mixed-integer optimization problem using Robust Optimization. We reduce computational expense by utilizing both analytical and data-based approximations and optimize the Robust optimization parameters using Bayesian Optimization. We apply our framework to five instances of the State-Task-Network and demonstrate that it can efficiently compromise between equipment availability and cost of maintenance.
In industrial resource allocation problems, an initial planning stage may solve a nominal problem instance and a subsequent recovery stage may intervene to repair inefficiencies and infeasibilities due to uncertainty, e.g. machine failures and job processing time variations. In this context, we investigate the minimum makespan scheduling problem, a.k.a. $P||C_{\max}$, under uncertainty. We propose a two-stage robust scheduling approach where first-stage decisions are computed with exact lexicographic scheduling and second-stage decisions are derived using approximate rescheduling. We explore recovery strategies accounting for planning decisions and constrained by limited permitted deviations from the original schedule. Our approach is substantiated analytically, with a price of robustness characterization parameterized by the degree of uncertainty, and numerically. This analysis is based on optimal substructure imposed by lexicographic optimality. Thus, lexicographic optimization enables more efficient rescheduling. Further, we revisit state-of-the-art exact lexicographic optimization methods and propose a lexicographic branch-and-bound algorithm whose performance is validated computationally.
Decision trees usefully represent sparse, high dimensional and noisy data. Having learned a function from this data, we may want to thereafter integrate the function into a larger decision-making problem, e.g., for picking the best chemical process catalyst. We study a large-scale, industrially-relevant mixed-integer nonlinear nonconvex optimization problem involving both gradient-boosted trees and penalty functions mitigating risk. This mixed-integer optimization problem with convex penalty terms broadly applies to optimizing pre-trained regression tree models. Decision makers may wish to optimize discrete models to repurpose legacy predictive models, or they may wish to optimize a discrete model that particularly well-represents a data set. We develop several heuristic methods to find feasible solutions, and an exact, branch-and-bound algorithm leveraging structural properties of the gradient-boosted trees and penalty functions. We computationally test our methods on a concrete mixture design instance and a chemical catalysis industrial instance.
Symmetry in mathematical programming may lead to a multiplicity of solutions. In nonconvex optimisation, it can negatively affect the performance of the branch-and-bound algorithm. Symmetry may induce large search trees with multiple equivalent solutions, i.e. with the same optimal value. Dealing with symmetry requires detecting and classifying it first. This work develops methods for detecting groups of symmetry in the formulation of quadratically constrained quadratic optimisation problems via adjacency matrices. Using graph theory, we transform these matrices into Binary Layered Graphs (BLG) and enter them into the software package nauty. Nauty generates important symmetric properties of the original problem.
Nov 08 2017
math.OC arXiv:1711.02564v1
Symmetry in mathematical optimisation is of broad and current interest. In problem classes such as mixed-integer linear programming (MILP), equivalent solutions created by symmetric variables and constraints may combinatorially increase the search space. Identifying problem symmetries is an important step towards expediting tree-based algorithms such as branch-and-cut because computationally classifying equivalence allows state-of-the-art solver software to omit symmetric solutions. But symmetry has not been characterised in several critically important process systems engineering applications such as heat exchanger network synthesis; neither do current MILP solvers detect or use symmetries for these energy efficiency problems. This paper uses group theory to study the MILP transshipment model of heat exchanger network synthesis and identifies several types of symmetry arising in the problem.
Heat exchanger network synthesis exploits excess heat by integrating process hot and cold streams and improves energy efficiency by reducing utility usage. Determining provably good solutions to the minimum number of matches is a bottleneck of designing a heat recovery network using the sequential method. This subproblem is an NP-hard mixed-integer linear program exhibiting combinatorial explosion in the possible hot and cold stream configurations. We explore this challenging optimization problem from a graph theoretic perspective and correlate it with other special optimization problems such as cost flow network and packing problems. In the case of a single temperature interval, we develop a new optimization formulation without problematic big-M parameters. We develop heuristic methods with performance guarantees using three approaches: (i) relaxation rounding, (ii) water filling, and (iii) greedy packing. Numerical results from a collection of 51 instances substantiate the strength of the methods.
Bayesian Optimization (BO) is a data-efficient method for global black-box optimization of an expensive-to-evaluate fitness function. BO typically assumes that its own computational cost is low, whereas experiments are time-consuming or costly. In practice, this allows us to optimize ten or fewer critical parameters in up to 1,000 experiments. But experiments may be less expensive than BO methods assume: in some simulation models, we may be able to conduct multiple thousands of experiments in a few hours, and the computational burden of BO is no longer negligible compared to experimentation time. To address this challenge we introduce a new Dimension Scheduling Algorithm (DSA), which reduces the computational burden of BO for many experiments. The key idea is that DSA optimizes the fitness function only along a small set of dimensions at each iteration. This DSA strategy (1) reduces the necessary computation time, (2) finds good solutions faster than the traditional BO method, and (3) can be parallelized straightforwardly. We evaluate the DSA in the context of optimizing parameters of dynamic models of microalgae metabolism and show faster convergence than traditional BO.