-
Curb Your Attention: Causal Attention Gating for Robust Trajectory Prediction in Autonomous Driving
Authors:
Ehsan Ahmadi,
Ray Mercurius,
Soheil Alizadeh,
Kasra Rezaee,
Amir Rasouli
Abstract:
Trajectory prediction models in autonomous driving are vulnerable to perturbations from non-causal agents whose actions should not affect the ego-agent's behavior. Such perturbations can lead to incorrect predictions of other agents' trajectories, potentially compromising the safety and efficiency of the ego-vehicle's decision-making process. Motivated by this challenge, we propose $\textit{Causal tRajecTory predICtion}$ $\textbf{(CRiTIC)}$, a novel model that utilizes a $\textit{Causal Discovery Network}$ to identify inter-agent causal relations over a window of past time steps. To incorporate discovered causal relationships, we propose a novel $\textit{Causal Attention Gating}$ mechanism to selectively filter information in the proposed Transformer-based architecture. We conduct extensive experiments on two autonomous driving benchmark datasets to evaluate the robustness of our model against non-causal perturbations and its generalization capacity. Our results indicate that the robustness of predictions can be improved by up to $\textbf{54%}$ without a significant detriment to prediction accuracy. Lastly, we demonstrate the superior domain generalizability of the proposed model, which achieves up to $\textbf{29%}$ improvement in cross-domain performance. These results underscore the potential of our model to enhance both robustness and generalization capacity for trajectory prediction in diverse autonomous driving domains. Further details can be found on our project page: https://critic-model.github.io/.
Submitted 23 September, 2024;
originally announced October 2024.
-
Using Uncertainty Quantification to Characterize and Improve Out-of-Domain Learning for PDEs
Authors:
S. Chandra Mouli,
Danielle C. Maddix,
Shima Alizadeh,
Gaurav Gupta,
Andrew Stuart,
Michael W. Mahoney,
Yuyang Wang
Abstract:
Existing work in scientific machine learning (SciML) has shown that data-driven learning of solution operators can provide a fast approximate alternative to classical numerical partial differential equation (PDE) solvers. Of these, Neural Operators (NOs) have emerged as particularly promising. We observe that several uncertainty quantification (UQ) methods for NOs fail for test inputs that are even moderately out-of-domain (OOD), even when the model approximates the solution well for in-domain tasks. To address this limitation, we show that ensembling several NOs can identify high-error regions and provide good uncertainty estimates that are well-correlated with prediction errors. Based on this, we propose a cost-effective alternative, DiverseNO, that mimics the properties of the ensemble by encouraging diverse predictions from its multiple heads in the last feed-forward layer. We then introduce Operator-ProbConserv, a method that uses these well-calibrated UQ estimates within the ProbConserv framework to update the model. Our empirical results show that Operator-ProbConserv enhances OOD model performance for a variety of challenging PDE problems and satisfies physical constraints such as conservation laws.
Submitted 12 June, 2024; v1 submitted 15 March, 2024;
originally announced March 2024.
-
Pessimistic Off-Policy Multi-Objective Optimization
Authors:
Shima Alizadeh,
Aniruddha Bhargava,
Karthick Gopalswamy,
Lalit Jain,
Branislav Kveton,
Ge Liu
Abstract:
Multi-objective optimization is a type of decision-making problem in which multiple conflicting objectives are optimized. We study offline optimization of multi-objective policies from data collected by an existing policy. We propose a pessimistic estimator for the multi-objective policy values that can be easily plugged into existing formulas for hypervolume computation and optimized. The estimator is based on inverse propensity scores (IPS), and improves upon a naive IPS estimator in both theory and experiments. Our analysis is general, and applies beyond our IPS estimators and methods for optimizing them. The pessimistic estimator can be optimized by policy gradients and performs well in all of our experiments.
Submitted 28 October, 2023;
originally announced October 2023.
-
Learning Physical Models that Can Respect Conservation Laws
Authors:
Derek Hansen,
Danielle C. Maddix,
Shima Alizadeh,
Gaurav Gupta,
Michael W. Mahoney
Abstract:
Recent work in scientific machine learning (SciML) has focused on incorporating partial differential equation (PDE) information into the learning process. Much of this work has focused on relatively "easy" PDE operators (e.g., elliptic and parabolic), with less emphasis on relatively "hard" PDE operators (e.g., hyperbolic). Within numerical PDEs, the latter problem class requires control of a type of volume element or conservation constraint, which is known to be challenging. Delivering on the promise of SciML requires seamlessly incorporating both types of problems into the learning process. To address this issue, we propose ProbConserv, a framework for incorporating conservation constraints into a generic SciML architecture. To do so, ProbConserv combines the integral form of a conservation law with a Bayesian update. We provide a detailed analysis of ProbConserv on learning with the Generalized Porous Medium Equation (GPME), a widely-applicable parameterized family of PDEs that illustrates the qualitative properties of both easier and harder PDEs. ProbConserv is effective for easy GPME variants, performing well with state-of-the-art competitors; and for harder GPME variants it outperforms other approaches that do not guarantee volume conservation. ProbConserv seamlessly enforces physical conservation constraints, maintains probabilistic uncertainty quantification (UQ), and deals well with shocks and heteroscedasticities. In each case, it achieves superior predictive performance on downstream tasks.
Submitted 10 October, 2023; v1 submitted 21 February, 2023;
originally announced February 2023.
-
Guiding continuous operator learning through Physics-based boundary constraints
Authors:
Nadim Saad,
Gaurav Gupta,
Shima Alizadeh,
Danielle C. Maddix
Abstract:
Boundary conditions (BCs) are important groups of physics-enforced constraints that are necessary for solutions of Partial Differential Equations (PDEs) to satisfy at specific spatial locations. These constraints carry important physical meaning, and guarantee the existence and the uniqueness of the PDE solution. Current neural-network based approaches that aim to solve PDEs rely only on training data to help the model learn BCs implicitly. There is no guarantee of BC satisfaction by these models during evaluation. In this work, we propose Boundary enforcing Operator Network (BOON) that enables the BC satisfaction of neural operators by making structural changes to the operator kernel. We provide our refinement procedure, and demonstrate the satisfaction of physics-based BCs, e.g. Dirichlet, Neumann, and periodic by the solutions obtained by BOON. Numerical experiments based on multiple PDEs with a wide variety of applications indicate that the proposed approach ensures satisfaction of BCs, and leads to more accurate solutions over the entire domain. The proposed correction method exhibits a (2X-20X) improvement over a given operator model in relative $L^2$ error (0.000084 relative $L^2$ error for Burgers' equation).
Submitted 2 March, 2023; v1 submitted 14 December, 2022;
originally announced December 2022.
-
Controllability of complex networks: input node placement restricting the longest control chain
Authors:
Samie Alizadeh,
Márton Pósfai,
Abdorasoul Ghasemi
Abstract:
The minimum number of inputs needed to control a network is frequently used to quantify its controllability. Control of linear dynamics through a minimum set of inputs, however, often has prohibitively large energy requirements and there is an inherent trade-off between minimizing the number of inputs and control energy. To better understand this trade-off, we study the problem of identifying a minimum set of input nodes such that controllability is ensured while restricting the length of the longest control chain. The longest control chain is the maximum distance from input nodes to any network node, and recent work found that reducing its length significantly reduces control energy. We map the longest control chain-constraint minimum input problem to finding a joint maximum matching and minimum dominating set. We show that this graph combinatorial problem is NP-complete, and we introduce and validate a heuristic approximation. Applying this algorithm to a collection of real and model networks, we investigate how network structure affects the minimum number of inputs, revealing, for example, that for many real networks reducing the longest control chain requires only a few or no additional inputs, only the rearrangement of the input nodes.
Submitted 9 December, 2022;
originally announced December 2022.
-
NeurIPS 2022 Competition: Driving SMARTS
Authors:
Amir Rasouli,
Randy Goebel,
Matthew E. Taylor,
Iuliia Kotseruba,
Soheil Alizadeh,
Tianpei Yang,
Montgomery Alban,
Florian Shkurti,
Yuzheng Zhuang,
Adam Scibior,
Kasra Rezaee,
Animesh Garg,
David Meger,
Jun Luo,
Liam Paull,
Weinan Zhang,
Xinyu Wang,
Xi Chen
Abstract:
Driving SMARTS is a regular competition designed to tackle problems caused by the distribution shift in dynamic interaction contexts that are prevalent in real-world autonomous driving (AD). The proposed competition supports methodologically diverse solutions, such as reinforcement learning (RL) and offline learning methods, trained on a combination of naturalistic AD data and the open-source simulation platform SMARTS. The two-track structure allows focusing on different aspects of the distribution shift. Track 1 is open to any method and will give ML researchers with different backgrounds an opportunity to solve a real-world autonomous driving challenge. Track 2 is designed for strictly offline learning methods. Therefore, direct comparisons can be made between different methods with the aim of identifying new promising research directions. The proposed setup consists of 1) realistic traffic generated using real-world data and micro simulators to ensure fidelity of the scenarios, 2) a framework accommodating diverse methods for solving the problem, and 3) a baseline method. As such, it provides a unique opportunity for the principled investigation into various aspects of autonomous vehicle deployment.
Submitted 14 November, 2022;
originally announced November 2022.
-
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
Authors:
BigScience Workshop,
Teven Le Scao,
Angela Fan,
Christopher Akiki,
Ellie Pavlick,
Suzana Ilić,
Daniel Hesslow,
Roman Castagné,
Alexandra Sasha Luccioni,
François Yvon,
Matthias Gallé,
Jonathan Tow,
Alexander M. Rush,
Stella Biderman,
Albert Webson,
Pawan Sasanka Ammanamanchi,
Thomas Wang,
Benoît Sagot,
Niklas Muennighoff,
Albert Villanova del Moral,
Olatunji Ruwase,
Rachel Bawden,
Stas Bekman,
Angelina McMillan-Major
, et al. (369 additional authors not shown)
Abstract:
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
Submitted 27 June, 2023; v1 submitted 9 November, 2022;
originally announced November 2022.
-
Overlimiting current in non-uniform arrays of microchannels
Authors:
Hyekyung Lee,
Shima Alizadeh,
Tae Jin Kim,
Seung-min Park,
Hyongsok Tom Soh,
Ali Mani,
Sung Jae Kim
Abstract:
Overlimiting current (OLC) through electrolytes interfaced with perm-selective membranes has been extensively researched in recent years for understanding the fundamental mechanisms of transport and developing efficient applications from electrochemistry to sample analysis and separation. Predominant mechanisms responsible for OLC include surface conduction, convection by electro-osmotic flow, and electro-osmotic instability depending on input parameters such as surface charge and geometric constrictions. This work studies how a network of microchannels in a non-uniform array, which mimics a natural pore configuration, can contribute to OLC. To this end, micro/nanofluidic devices are fabricated with arrays of parallel microchannels with non-uniform size distributions. All cases maintain the same surface and bulk conduction to allow probing the sensitivity only by the non-uniformity of the channels. Both experimental and theoretical current-voltage relations demonstrate that OLCs increase with increasing non-uniformity. Furthermore, the visualization of internal recirculating flows indicates that the non-uniform arrays induce flow loops across the network, enhancing advective transport. This evidence confirms a new driving mechanism of OLC, inspired by natural micro/nanoporous materials with random geometric structure. Therefore, this result can advance not only the fundamental understanding of nanoelectrokinetics but also the design rules for engineering applications of electrochemical membranes.
Submitted 21 October, 2019;
originally announced October 2019.
-
Impact of Network Heterogenity on Nonlinear Electrokinetic Transport in Porous Media
Authors:
Shima Alizadeh,
Martin Z. Bazant,
Ali Mani
Abstract:
We present a numerical study of nonlinear electrokinetic transport in porous media, focusing on the role of heterogeneity in a porous microstructure on ion concentration polarization and over-limiting current. For simplicity, the porous medium is modeled as a network of long, thin charged cylindrical pores, each governed by one-dimensional effective transport equations. For weak surface conduction, when sufficiently large potential is applied, we demonstrate that electrokinetic transport in a porous network can be dominated by electroconvection via internally induced flow loops, which is not properly captured by existing homogenized models. We systematically vary the topology and "accessivity" of the pore network and compare with simulations of traditional homogenized parallel-pore (capillary-bundle) models, in order to reveal the effects of regular and hierarchical connectivity. Our computational framework sheds light on the complex physics of nonlinear electrokinetic phenomena in microstructures and may be used to design porous media for applications, such as water desalination and purification by shock electrodialysis.
Submitted 8 January, 2019;
originally announced January 2019.
-
Convolutional Neural Networks for Facial Expression Recognition
Authors:
Shima Alizadeh,
Azar Fazel
Abstract:
We have developed convolutional neural networks (CNN) for a facial expression recognition task. The goal is to classify each facial image into one of the seven facial emotion categories considered in this study. We trained CNN models with different depth using gray-scale images. We developed our models in Torch and exploited Graphics Processing Unit (GPU) computation in order to expedite the training process. In addition to the networks performing based on raw pixel data, we employed a hybrid feature strategy by which we trained a novel CNN model with the combination of raw pixel data and Histogram of Oriented Gradients (HOG) features. To reduce the overfitting of the models, we utilized different techniques including dropout and batch normalization in addition to L2 regularization. We applied cross validation to determine the optimal hyper-parameters and evaluated the performance of the developed models by looking at their training histories. We also present the visualization of different layers of a network to show what features of a face can be learned by CNN models.
Submitted 22 April, 2017;
originally announced April 2017.
-
A multi-scale model for electrokinetic transport in networks of micro-scale and nano-scale pores
Authors:
Shima Alizadeh,
Ali Mani
Abstract:
We present an efficient and robust numerical model for simulation of electrokinetic phenomena in porous networks over a wide range of applications including energy conversion, desalination, and lab-on-a-chip systems. Coupling between fluid flow and ion transport in these networks is governed by the Poisson-Nernst-Planck-Stokes equations. These equations describe a wide range of transport phenomena that can interact in complex and highly nonlinear ways in networks involving multiple pores with variable properties. Capturing these phenomena by direct simulation of the governing equations in multiple dimensions is prohibitively expensive. We present here a reduced order computational model that treats a network of many pores via solutions to 1D equations. Assuming that each pore in the network is long and thin, we derive a 1D model describing the transport in the pore's longitudinal direction. We take into account the non-uniformity of potential and ion concentration profiles across the pore cross-section in the form of area-averaged coefficients in different flux terms representing fluid flow, electric current, and ion fluxes. Distinct advantages of the present framework include: a fully conservative discretization, fully bounded tabulated area-averaged coefficients without any singularity in the limit of infinitely thick electric double layers (EDLs), a flux discretization that exactly preserves equilibrium conditions, and extension to general networks of pores with multiple intersections. By considering a hierarchy of canonical problems with increasing complexity, we demonstrate that the developed framework can capture a wide range of phenomena. Example demonstrations include prediction of osmotic pressure built up in thin pores subject to a concentration gradient, and propagation of deionization shocks and induced recirculations for intersecting pores with varying properties.
Submitted 29 September, 2016;
originally announced October 2016.
-
Weak convergence theorems for symmetric generalized hybrid mappings in uniformly convex Banach spaces
Authors:
Fridoun Moradlou,
Sattar Alizadeh
Abstract:
In this paper, we prove some theorems related to properties of generalized symmetric hybrid mappings in Banach spaces. Using Banach limits, we prove a fixed point theorem for symmetric generalized hybrid mappings in Banach spaces. Moreover, we prove some weak convergence theorems for such mappings by using Ishikawa iteration method in a uniformly convex Banach space.
Submitted 22 November, 2014; v1 submitted 20 October, 2014;
originally announced October 2014.
-
Weak convergence theorems for equilibrium problems and generalized Hybrid mappings
Authors:
Sattar Alizadeh,
Fridoun Moradlou
Abstract:
In this paper, we introduce a new modified Ishikawa iteration for finding a common element of the set of solutions of an equilibrium problem and the set of fixed points of generalized hybrid mappings in a Hilbert space. Our results generalize, extend and enrich some existing results in the literature.
Submitted 19 October, 2014;
originally announced October 2014.
-
Strong convergence theorems by a new hybrid method for equilibrium problems and relatively nonexpansive mappings in Banach spaces
Authors:
Sattar Alizadeh,
Fridoun Moradlou
Abstract:
In this paper, we introduce a new modified Ishikawa iteration for finding a common element of the set of solutions of an equilibrium problem and the set of fixed points of relatively nonexpansive mappings in a Banach space. Our results generalize, extend and enrich some existing results in the literature.
Submitted 15 October, 2014;
originally announced October 2014.
-
Markov Switching Component ARCH Model: Stability and Forecasting
Authors:
N. Alemohammad,
S. Rezakhah,
S. H. Alizadeh
Abstract:
This paper introduces an extension of the Markov switching GARCH model where the volatility in each state is a convex combination of two different GARCH components with time varying weights. This model has the dynamic behavior to capture the variants of shocks. The asymptotic behavior of the second moment is investigated and an appropriate upper bound for it is evaluated. The estimation of the parameters by using the Bayesian method via Gibbs sampling algorithm is studied. Finally we illustrate the efficiency of the model by simulation and empirical analysis. We show that this model provides a much better forecast of the volatility than the Markov switching GARCH model.
Submitted 19 February, 2014; v1 submitted 22 March, 2013;
originally announced March 2013.
-
Hidden Markov Mixture Autoregressive Models: Parameter Estimation
Authors:
S. H. Alizadeh,
S. Rezakhah
Abstract:
This report introduces a parsimonious structure for mixture of autoregressive models, where the weighting coefficients are determined through latent random variables as functions of all past observations. These variables follow a hidden Markov model. We modify EM and Baum-Welch algorithms to estimate the parameters of the model.
Submitted 14 May, 2011;
originally announced May 2011.
-
Hidden Markov Mixture Autoregressive Models: Stability and Moments
Authors:
S. H. Alizadeh,
S. Rezakhah
Abstract:
This paper introduces a new parsimonious structure for mixture of autoregressive models. The weighting coefficients are determined through latent random variables, following a hidden Markov model. We propose a dynamic programming algorithm for the application of forecasting. We also derive the limiting behavior of the unconditional first moment of the process and an appropriate upper bound for the limiting value of the variance. This can be considered as the long-run behavior of the process. Finally, we show convergence and stability of the second moment. Further, we illustrate the efficacy of the proposed model by simulation and forecasting.
Submitted 11 May, 2011; v1 submitted 5 May, 2011;
originally announced May 2011.