-
Uncertainty Quantification for Deep Learning
Authors:
Peter Jan van Leeuwen,
J. Christine Chiu,
C. Kevin Yang
Abstract:
A complete and statistically consistent uncertainty quantification for deep learning is provided, including the sources of uncertainty arising from (1) the new input data, (2) the training and testing data, (3) the weight vectors of the neural network, and (4) the neural network because it is not a perfect predictor. Using Bayes' theorem and conditional probability densities, we demonstrate how each uncertainty source can be systematically quantified. We also introduce a fast and practical way to incorporate and combine all sources of errors for the first time. For illustration, the new method is applied to quantify errors in cloud autoconversion rates, predicted from an artificial neural network that was trained on aircraft cloud probe measurements in the Azores and the stochastic collection equation formulated as a two-moment bin model. For this specific example, the output uncertainty arising from uncertainty in the training and testing data is dominant, followed by uncertainty in the input data, in the trained neural network, and uncertainty in the weights. We discuss the usefulness of the methodology for machine learning practice, and how, through inclusion of uncertainty in the training data, the new methodology is less sensitive to input data that fall outside of the training data set.
Submitted 30 May, 2024;
originally announced May 2024.
-
Noise calibration for the stochastic rotating shallow water model
Authors:
Dan Crisan,
Oana Lang,
Alexander Lobbe,
Peter Jan van Leeuwen,
Roland Potthast
Abstract:
Stochastic partial differential equations have been used in a variety of contexts to model the evolution of uncertain dynamical systems. In recent years, their application to geophysical fluid dynamics has increased massively. For judicious use in modelling fluid evolution, one needs to calibrate the amplitude of the noise to data. In this paper we address this requirement for the stochastic rotating shallow water (SRSW) model. This work is a continuation of [LvLCP23], where a data assimilation methodology was introduced for the SRSW model. The noise used in [LvLCP23] was introduced as an arbitrary random phase shift in Fourier space. This is not necessarily consistent with the uncertainty induced by a model reduction procedure. In this paper, we introduce a new method of noise calibration of the SRSW model which is compatible with the model reduction technique. The method is generic and can be applied to arbitrary stochastic parametrizations. It is also agnostic as to the source of data (real or synthetic). It is based on a principal component analysis technique to generate the eigenvectors and the eigenvalues of the covariance matrix of the stochastic parametrization. For the SRSW model covered in this paper, we calibrate the noise by using the elevation variable of the model, as this is an observable easily obtainable in practical applications, and use synthetic data as input for the calibration procedure.
Submitted 5 May, 2023;
originally announced May 2023.
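The principal component step mentioned in this abstract can be illustrated with a small sketch. This is not the paper's calibration procedure: it is a generic example that builds a sample covariance matrix from synthetic "noise increments" and extracts its leading eigenpair by power iteration. The function names, the three-dimensional toy data, and the direction `u` are all assumptions made for illustration.

```python
import math
import random


def sample_covariance(samples):
    """Unbiased sample covariance matrix of a list of equal-length vectors."""
    n, d = len(samples), len(samples[0])
    mean = [sum(s[j] for s in samples) / n for j in range(d)]
    cov = [[0.0] * d for _ in range(d)]
    for s in samples:
        dev = [s[j] - mean[j] for j in range(d)]
        for i in range(d):
            for j in range(d):
                cov[i][j] += dev[i] * dev[j] / (n - 1)
    return cov


def leading_eigenpair(mat, iters=200):
    """Power iteration for the largest eigenvalue of a symmetric PSD matrix."""
    d = len(mat)
    v = [1.0 / math.sqrt(d)] * d  # assumed not orthogonal to the leading mode
    lam = 0.0
    for _ in range(iters):
        w = [sum(mat[i][j] * v[j] for j in range(d)) for i in range(d)]
        lam = math.sqrt(sum(x * x for x in w))
        v = [x / lam for x in w]
    return lam, v


# Hypothetical synthetic data: one dominant mode along u (amplitude std 2)
# plus weak isotropic noise (std 0.1).
rng = random.Random(0)
u = [0.6, 0.8, 0.0]
samples = []
for _ in range(500):
    amplitude = rng.gauss(0.0, 2.0)
    samples.append([amplitude * ui + rng.gauss(0.0, 0.1) for ui in u])

eigenvalue, eigenvector = leading_eigenpair(sample_covariance(samples))
alignment = abs(sum(a * b for a, b in zip(eigenvector, u)))
```

In this toy setting the leading eigenvalue approximates the variance of the dominant mode (about 4) and the eigenvector recovers the direction `u` up to sign; a real application would work with much larger state vectors and a proper eigensolver.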
-
Particle Filtering and Gaussian Mixtures -- On a Localized Mixture Coefficients Particle Filter (LMCPF) for global NWP
Authors:
Anne Rojahn,
Nora Schenk,
Peter Jan van Leeuwen,
Roland Potthast
Abstract:
In a global numerical weather prediction (NWP) modeling framework we study the implementation of Gaussian uncertainty of individual particles into the assimilation step of a localized adaptive particle filter (LAPF). We obtain a local representation of the prior distribution as a mixture of basis functions. In the assimilation step, the filter calculates the individual weight coefficients and new particle locations. It can be viewed as a combination of the LAPF and a localized version of a Gaussian mixture filter, i.e., a Localized Mixture Coefficients Particle Filter (LMCPF).
Here, we investigate the feasibility of the LMCPF within a global operational framework and evaluate the relationship between prior and posterior distributions and observations. Our simulations are carried out in a standard pre-operational experimental set-up with the full global observing system, 52 km global resolution and $10^6$ model variables. Statistics of particle movement in the assimilation step are calculated. The mixture approach is able to deal with the discrepancy between prior distributions and observation location in a real-world framework and to pull the particles towards the observations in a much better way than the pure LAPF. This shows that using Gaussian uncertainty can be an important tool to improve the analysis and forecast quality in a particle filter framework.
Submitted 15 June, 2022;
originally announced June 2022.
-
Bayesian Inference for Fluid Dynamics: A Case Study for the Stochastic Rotating Shallow Water Model
Authors:
Peter Jan van Leeuwen,
Dan Crisan,
Oana Lang,
Roland Potthast
Abstract:
In this work, we use a tempering-based adaptive particle filter to perform inference for a partially observed stochastic rotating shallow water (SRSW) model which has been derived using the Stochastic Advection by Lie Transport (SALT) approach. The methodology we present here validates the applicability of tempering and sample regeneration via a Metropolis-Hastings algorithm to high-dimensional models used in stochastic fluid dynamics. The methodology is first tested on the Lorenz '63 model with both full and partial observations. Then we discuss the efficiency of the particle filter for the SALT-SRSW model.
Submitted 30 December, 2021;
originally announced December 2021.
-
Using Mutual Information to measure Time-lags from non-linear processes in Astronomy
Authors:
Nachiketa Chakraborty,
Peter Jan van Leeuwen
Abstract:
Measuring time lags between time series or lightcurves at different wavelengths from a variable or transient source in astronomy is an essential probe of the physical mechanisms causing multiwavelength variability. Time lags are typically quantified using discrete correlation functions (DCF), which are appropriate for linear relationships. However, in variable sources like X-ray binaries, active galactic nuclei (AGN) and other accreting systems, the radiative processes and the resulting multiwavelength lightcurves often have non-linear relationships. For such systems it is more appropriate to use non-linear information-theoretic measures of causation like mutual information, routinely used in other disciplines. Using toy models, we demonstrate pitfalls of the standard DCF and show improvements when using the mutual information correlation function (MICF). For non-linear correlations, the latter accurately and sharply identifies the lag components, as opposed to the DCF, which can be erroneous. Following that, we apply the MICF to the multiwavelength lightcurves of AGN NGC 4593. We find that X-ray fluxes lead UVW2 fluxes by ~0.2 days, closer to model predictions from reprocessing by the accretion disk than the DCF estimate. However, the uncertainties with the current lightcurves are too large to rule out negative lags. Additionally, we find another delay component at ~-1 day, i.e. UVW2 leading X-rays, consistent with inward propagating fluctuations in the accretion disk scenario. This is not detected by the DCF. Keeping in mind the non-linear relation between X-ray and UVW2, this is worthy of further theoretical investigation. From both toy models and real observations, it is clear that the mutual information based estimator is highly sensitive to complex non-linear correlations. With sufficiently high temporal resolution, we will precisely detect each of the lag features corresponding to these correlations.
Submitted 16 June, 2021;
originally announced June 2021.
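The idea of scanning mutual information over lags can be sketched with a toy example. The code below is not the authors' MICF estimator: it assumes a simple equal-width histogram estimate of mutual information and an invented signal in which `y` depends on the square of `x` three steps earlier, a relation for which the linear correlation is close to zero. All names (`mutual_information`, `micf`, `true_lag`) are hypothetical.

```python
import math
import random
from collections import Counter


def mutual_information(xs, ys, bins=8):
    """Plug-in histogram estimate of I(X;Y) in nats from paired samples."""
    def digitize(vals):
        lo, hi = min(vals), max(vals)
        width = (hi - lo) / bins or 1.0  # guard against constant input
        return [min(int((v - lo) / width), bins - 1) for v in vals]

    bx, by = digitize(xs), digitize(ys)
    n = len(xs)
    joint, px, py = Counter(zip(bx, by)), Counter(bx), Counter(by)
    return sum((c / n) * math.log(c * n / (px[i] * py[j]))
               for (i, j), c in joint.items())


def micf(x, y, max_lag, bins=8):
    """Mutual information between x(t) and y(t + lag) for each integer lag."""
    scores = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            xs, ys = x[:len(x) - lag], y[lag:]
        else:
            xs, ys = x[-lag:], y[:len(y) + lag]
        scores[lag] = mutual_information(xs, ys, bins)
    return scores


# Toy data (an assumption for illustration): y(t) = x(t - 3)^2 + noise,
# so x leads y by three time steps through a purely non-linear relation.
rng = random.Random(0)
true_lag, n = 3, 2000
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = [rng.gauss(0.0, 0.1) for _ in range(n)]
for t in range(true_lag, n):
    y[t] += x[t - true_lag] ** 2

scores = micf(x, y, max_lag=6)
best_lag = max(scores, key=scores.get)
```

With this sign convention a positive `best_lag` means `x` leads `y`. Histogram estimators are biased for short or irregularly sampled series, so real lightcurves would need a more careful estimator and uncertainty analysis.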
-
On time-parallel preconditioning for the state formulation of incremental weak constraint 4D-Var
Authors:
Ieva Daužickaitė,
Amos S. Lawless,
Jennifer A. Scott,
Peter Jan van Leeuwen
Abstract:
Using a high degree of parallelism is essential to perform data assimilation efficiently. The state formulation of the incremental weak constraint four-dimensional variational data assimilation method allows parallel calculations in the time dimension. In this approach, the solution is approximated by minimising a series of quadratic cost functions using the conjugate gradient method. To use this method in practice, effective preconditioning strategies that maintain the potential for parallel calculations are needed. We examine approximations to the control variable transform (CVT) technique when the latter is beneficial. The new strategy employs a randomised singular value decomposition and retains the potential for parallelism in the time domain. Numerical results for the Lorenz 96 model show that this approach accelerates the minimisation in the first few iterations, with better results when CVT performs well.
Submitted 23 July, 2021; v1 submitted 20 May, 2021;
originally announced May 2021.
-
Randomised preconditioning for the forcing formulation of weak constraint 4D-Var
Authors:
Ieva Daužickaitė,
Amos S. Lawless,
Jennifer A. Scott,
Peter Jan van Leeuwen
Abstract:
There is growing awareness that errors in the model equations cannot be ignored in data assimilation methods such as four-dimensional variational assimilation (4D-Var). If allowed for, more information can be extracted from observations, longer time windows are possible, and the minimisation process is easier, at least in principle. Weak constraint 4D-Var estimates the model error and minimises a series of linear least-squares cost functions, which can be achieved using the conjugate gradient (CG) method; minimising each cost function is called an inner loop. CG needs preconditioning to improve its performance. In previous work, limited memory preconditioners (LMPs) have been constructed using approximations of the eigenvalues and eigenvectors of the Hessian in the previous inner loop. If the Hessian changes significantly in consecutive inner loops, the LMP may be of limited usefulness. To circumvent this, we propose using randomised methods for low rank eigenvalue decomposition and use these approximations to cheaply construct LMPs using information from the current inner loop. Three randomised methods are compared. Numerical experiments in idealized systems show that the resulting LMPs perform better than the existing LMPs. Using these methods may allow more efficient and robust implementations of incremental weak constraint 4D-Var.
Submitted 11 May, 2021; v1 submitted 18 January, 2021;
originally announced January 2021.
-
A Framework for Causal Discovery in non-intervenable systems
Authors:
Peter Jan van Leeuwen,
Michael DeCaria,
Nachiketa Chakaborty,
Manuel Pulido
Abstract:
Many frameworks exist to infer cause and effect relations in complex nonlinear systems but a complete theory is lacking. A new framework is presented that is fully nonlinear, provides a complete information theoretic disentanglement of causal processes, allows for nonlinear interactions between causes, identifies the causal strength of missing or unknown processes, and can analyze systems that cannot be represented on Directed Acyclic Graphs. The basic building blocks are information theoretic measures such as (conditional) mutual information and a new concept called certainty that monotonically increases with the information available about the target process. The framework is presented in detail and compared with other existing frameworks, and the treatment of confounders is discussed. While there are systems with structures that the framework cannot disentangle, it is argued that any causal framework that is based on integrated quantities will miss potentially important information in the underlying probability density functions. The framework is tested on several highly simplified stochastic processes to demonstrate how blocking and gateways are handled, and on the chaotic Lorenz 1963 system. We show that the framework provides information on the local dynamics, but also reveals information on the larger scale structure of the underlying attractor. Furthermore, by applying it to real observations related to the El Niño-Southern Oscillation system we demonstrate its power and advantage over other methodologies.
Submitted 27 September, 2021; v1 submitted 5 October, 2020;
originally announced October 2020.
-
Ensemble Riemannian Data Assimilation over the Wasserstein Space
Authors:
Sagar K. Tamang,
Ardeshir Ebtehaj,
Peter J. Van Leeuwen,
Dongmian Zou,
Gilad Lerman
Abstract:
In this paper, we present an ensemble data assimilation paradigm over a Riemannian manifold equipped with the Wasserstein metric. Unlike the Eulerian penalization of error in the Euclidean space, the Wasserstein metric can capture translation and difference between the shapes of square-integrable probability distributions of the background state and observations, making it possible to formally penalize geophysical biases in state-space with non-Gaussian distributions. The new approach is applied to dissipative and chaotic evolutionary dynamics, and its potential advantages and limitations are highlighted compared to the classic variational and filtering data assimilation approaches under systematic and random errors.
Submitted 24 March, 2021; v1 submitted 7 September, 2020;
originally announced September 2020.
-
Model uncertainty estimation using the expectation maximization algorithm and a particle flow filter
Authors:
María Magdalena Lucini,
Peter Jan van Leeuwen,
Manuel Pulido
Abstract:
Model error covariances play a central role in the performance of data assimilation methods applied to nonlinear state-space models. However, these covariances are largely unknown in most applications. A misspecification of the model error covariance has a strong impact on the computation of the posterior probability density function, leading to unreliable estimations and even to a total failure of the assimilation procedure. In this work, we propose the combination of the Expectation-Maximization (EM) algorithm with an efficient particle filter to estimate the model error covariance, using a batch of observations. Based on the EM algorithm principles, the proposed method encompasses two stages: the expectation stage, in which a particle filter is used with the present estimate of the model error covariance as given to find the probability density function that maximizes the likelihood, followed by a maximization stage in which the expectation under the probability density function found in the expectation step is maximized as a function of the elements of the model error covariance. The novel algorithm presented here combines the EM with a fixed point algorithm and does not require a particle smoother to approximate the posterior densities. We demonstrate that the new method accurately and efficiently solves the linear model problem. Furthermore, for the chaotic nonlinear Lorenz-96 model the method is stable even for an observation error covariance 10 times larger than the estimated model error covariance matrix, and it is successful in high-dimensional situations where the dimension of the estimated matrix is 1600.
Submitted 4 November, 2019;
originally announced November 2019.
-
Massively Parallel Implicit Equal-Weights Particle Filter for Ocean Drift Trajectory Forecasting
Authors:
Håvard Heitlo Holm,
Martin Lilleeng Sætra,
Peter Jan van Leeuwen
Abstract:
Forecasting ocean drift trajectories is important for many applications, including search and rescue operations, oil spill cleanup and iceberg risk mitigation. In an operational setting, forecasts of drift trajectories are produced based on computationally demanding forecasts of three-dimensional ocean currents. Herein, we investigate a complementary approach for shorter time scales by using a recent state-of-the-art implicit equal-weights particle filter applied to a simplified ocean model. To achieve this, we present a new algorithmic design for a data-assimilation system in which all components - including the model, model errors, and particle filter - take advantage of massively parallel compute architectures, such as graphical processing units. Faster computations can enable in-situ and ad-hoc model runs for emergency management, and larger ensembles for better uncertainty quantification. Using a challenging test case with near-realistic chaotic instabilities, we run data-assimilation experiments based on synthetic observations from drifting and moored buoys, and analyse the trajectory forecasts for the drifters. Our results show that even sparse drifter observations are sufficient to significantly improve short-term drift forecasts up to twelve hours. With equidistant moored buoys observing only 0.1% of the state space, the ensemble gives an accurate description of the true state after data assimilation followed by a high-quality probabilistic forecast.
Submitted 2 October, 2019;
originally announced October 2019.
-
Spectral estimates for saddle point matrices arising in weak constraint four-dimensional variational data assimilation
Authors:
Ieva Daužickaitė,
Amos S. Lawless,
Jennifer A. Scott,
Peter Jan van Leeuwen
Abstract:
We consider the large-sparse symmetric linear systems of equations that arise in the solution of weak constraint four-dimensional variational data assimilation, a method of high interest for numerical weather prediction. These systems can be written as saddle point systems with a 3x3 block structure but block eliminations can be performed to reduce them to saddle point systems with a 2x2 block structure, or further to symmetric positive definite systems. In this paper, we analyse how sensitive the spectra of these matrices are to the number of observations of the underlying dynamical system. We also obtain bounds on the eigenvalues of the matrices. Numerical experiments are used to confirm the theoretical analysis and bounds.
Submitted 14 May, 2020; v1 submitted 21 August, 2019;
originally announced August 2019.
-
Rainfall nowcasting by combining radars, microwave links and rain gauges
Authors:
Blandine Bianchi,
Peter Jan van Leeuwen,
Robin J. Hogan,
Alexis Berne
Abstract:
The objective of this work is to provide high-resolution rain rate maps at short lead-time forecasts (nowcasts) necessary to anticipate flooding and properly manage sewage systems in urban areas by combining radars, rain gauges, and operational microwave links, and taking into account their respective uncertainties. A variational approach (3D-Var) is used to find the best estimate for the rain rate, and its error covariance, from the different rain sensors. Short-term rain rate forecasts are then produced by assuming Lagrangian persistence. A velocity field is obtained from the operational radar-derived rain fields, and the rain rate field is advected using the Total Variation Diminishing (TVD) scheme. The error covariance associated with the estimated rain rate is also propagated, and we use these two in the 3D-Var at the next observation time step. This approach can be seen as a Variational Kalman Filter (VKF), in which the covariance of the prior is not constant but dependent on time. The proposed approach has been tested using data from 14 rain gauges, 14 microwave links and the operational radar rain product from MeteoSwiss in the area of Zurich (Switzerland). In these applications, the assumption of Lagrangian persistence appears to be valid up to 20 min (a bit longer for stratiform events). During convective events, the algorithm is less powerful and shorter lead times should be considered (i.e., 15 min). Although such lead times are short, they are still useful for various hydrological and outdoor applications.
Submitted 28 October, 2018;
originally announced October 2018.
-
Particle filters for high-dimensional geoscience applications: a review
Authors:
Peter Jan van Leeuwen,
Hans R. Künsch,
Lars Nerger,
Roland Potthast,
Sebastian Reich
Abstract:
Particle filters hold the promise of fully nonlinear data assimilation. They have been applied in numerous science areas, but their application to the geosciences has been limited due to their inefficiency in high-dimensional systems in standard settings. However, huge progress has been made, and this limitation is disappearing fast due to recent developments in proposal densities, the use of ideas from (optimal) transportation, the use of localisation and intelligent adaptive resampling strategies. Furthermore, powerful hybrids between particle filters and ensemble Kalman filters and variational methods have been developed. We present a state-of-the-art discussion of present efforts to develop particle filters for highly nonlinear geoscience state-estimation problems with an emphasis on atmospheric and oceanic applications, including many new ideas, derivations, and unifications, highlighting hidden connections, and generating a valuable tool and guide for the community. Initial experiments show that particle filters can be competitive with present-day methods for numerical weather prediction, suggesting that they will become mainstream soon.
Submitted 13 April, 2019; v1 submitted 27 July, 2018;
originally announced July 2018.
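The inefficiency of standard particle filters in high dimensions that this abstract refers to can be seen in a small experiment. The sketch below is an illustration under simple assumptions, not material from the review: particles are drawn from a standard normal prior, every state component is observed as zero with unit error variance, and the effective sample size (ESS) of the resulting importance weights is compared for a 1-dimensional and a 100-dimensional state. Function names and parameter values are hypothetical.

```python
import math
import random


def normalised_weights(log_w):
    """Convert unnormalised log-weights to weights summing to one (log-sum-exp)."""
    m = max(log_w)
    w = [math.exp(lw - m) for lw in log_w]
    total = sum(w)
    return [wi / total for wi in w]


def effective_sample_size(weights):
    """ESS = 1 / sum(w_i^2); equals N for uniform weights and 1 at full collapse."""
    return 1.0 / sum(w * w for w in weights)


def ess_for_dimension(dim, n_particles=100, seed=0):
    """ESS of bootstrap weights when all `dim` components are observed."""
    rng = random.Random(seed)
    log_w = []
    for _ in range(n_particles):
        particle = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        # Gaussian log-likelihood (up to a constant) of observing 0 for each
        # component with unit observation error variance.
        log_w.append(-0.5 * sum(x * x for x in particle))
    return effective_sample_size(normalised_weights(log_w))


ess_low = ess_for_dimension(1)     # most particles keep a useful weight
ess_high = ess_for_dimension(100)  # weights collapse onto a few particles
```

In this setting the ESS stays close to the ensemble size for one dimension and drops to a handful of particles for one hundred, which is the weight degeneracy that proposal densities, localisation and the other developments listed in the abstract aim to counter.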
-
Multiplicative non-Gaussian model error estimation in data assimilation
Authors:
Sahani Pathiraja,
Peter Jan van Leeuwen
Abstract:
Model uncertainty quantification is an essential component of effective data assimilation. Model errors associated with sub-grid scale processes are often represented through stochastic parameterizations of the unresolved process. Many existing stochastic parameterization schemes are only applicable when knowledge of the true sub-grid scale process or full observations of the coarse scale process are available, which is typically not the case in real applications. We present a methodology for estimating the statistics of sub-grid scale processes for the more realistic case in which only partial observations of the coarse scale process are available. Model error realizations are estimated over a training period by minimizing their conditional sum of squared deviations given some informative covariates (e.g. state of the system), constrained by available observations and assuming that the observation errors are smaller than the model errors. From these realizations a conditional probability distribution of additive model errors given these covariates is obtained, allowing for complex non-Gaussian error structures. Random draws from this density are then used in actual ensemble data assimilation experiments. We demonstrate the efficacy of the approach through numerical experiments with the multi-scale Lorenz 96 system using both small and large time scale separations between slow (coarse scale) and fast (fine scale) variables. The resulting error estimates and forecasts obtained with this new method are superior to those from two existing methods.
Submitted 10 April, 2021; v1 submitted 24 July, 2018;
originally announced July 2018.