-
Ensemble Kalman Inversion for Geothermal Reservoir Modelling
Authors:
Alex de Beer,
Elvar K Bjarkason,
Michael Gravatt,
Ruanui Nicholson,
John P O'Sullivan,
Michael J O'Sullivan,
Oliver J Maclaren
Abstract:
Numerical models of geothermal reservoirs typically depend on hundreds or thousands of unknown parameters, which must be estimated using sparse, noisy data. However, these models capture complex physical processes, which frequently results in long run-times and simulation failures, making the process of estimating the unknown parameters a challenging task. Conventional techniques for parameter estimation and uncertainty quantification, such as Markov chain Monte Carlo (MCMC), can require tens of thousands of simulations to provide accurate results and are therefore challenging to apply in this context. In this paper, we study the ensemble Kalman inversion (EKI) algorithm as an alternative technique for approximate parameter estimation and uncertainty quantification for geothermal reservoir models. EKI possesses several characteristics that make it well-suited to a geothermal setting; it is derivative-free, parallelisable, robust to simulation failures, and requires far fewer simulations than conventional uncertainty quantification techniques such as MCMC. We illustrate the use of EKI in a reservoir modelling context using a combination of synthetic and real-world case studies. Through these case studies, we also demonstrate how EKI can be paired with flexible parametrisation techniques capable of accurately representing prior knowledge of the characteristics of a reservoir and adhering to geological constraints, and how the algorithm can be made robust to simulation failures. Our results demonstrate that EKI provides a reliable and efficient means of obtaining accurate parameter estimates for large-scale, two-phase geothermal reservoir models, with appropriate characterisation of uncertainty.
Submitted 16 October, 2024; v1 submitted 11 October, 2024;
originally announced October 2024.
-
Data Space Inversion for Efficient Predictions and Uncertainty Quantification for Geothermal Models
Authors:
Alex de Beer,
Andrew Power,
Daniel Wong,
Ken Dekkers,
Michael Gravatt,
Elvar K. Bjarkason,
John P. O'Sullivan,
Michael J. O'Sullivan,
Oliver J. Maclaren,
Ruanui Nicholson
Abstract:
The ability to make accurate predictions with quantified uncertainty provides a crucial foundation for the successful management of a geothermal reservoir. Conventional approaches for making predictions using geothermal reservoir models involve estimating unknown model parameters using field data, then propagating the uncertainty in these estimates through to the predictive quantities of interest. However, the unknown parameters are not always of direct interest; instead, the predictions are of primary importance. Data space inversion (DSI) is an alternative methodology that allows for the efficient estimation of predictive quantities of interest, with quantified uncertainty, that avoids the need to estimate model parameters entirely. In this paper, we evaluate the applicability of DSI to geothermal reservoir modelling. We first review the processes of model calibration, prediction and uncertainty quantification from a Bayesian perspective, and introduce data space inversion as a simple, efficient technique for approximating the posterior predictive distribution. We then apply the DSI framework to two model problems in geothermal reservoir modelling. We evaluate the accuracy and efficiency of DSI relative to other common methods for uncertainty quantification, study how the number of reservoir model simulations affects the resulting approximation to the posterior predictive distribution, and demonstrate how the framework can be enhanced through the use of suitable reparametrisations. Our results support the idea that data space inversion is a simple, robust and efficient technique for making predictions with quantified uncertainty using geothermal reservoir models, providing a useful alternative to more conventional approaches.
Submitted 22 August, 2024; v1 submitted 22 July, 2024;
originally announced July 2024.
-
The impact of Covid-19 vaccination in Aotearoa New Zealand: a modelling study
Authors:
Samik Datta,
Giorgia Vattiato,
Oliver J Maclaren,
Ning Hua,
Andrew Sporle,
Michael J Plank
Abstract:
Aotearoa New Zealand implemented a Covid-19 elimination strategy in 2020 and 2021, which enabled a large majority of the population to be vaccinated before being exposed to the virus. This strategy delivered one of the lowest pandemic mortality rates in the world. However, quantitative estimates of the population-level health benefits of vaccination are lacking. Here, we use a validated mathematical model to investigate counterfactual scenarios with differing levels of vaccine coverage in different age and ethnicity groups. The model builds on earlier research by adding age- and time-dependent case ascertainment, the effect of antiviral medications, improved hospitalisation rate estimates, and the impact of relaxing control measures. The model was used for scenario analysis and policy advice for the New Zealand Government in 2022 and 2023. We compare the number of Covid-19 hospitalisations, deaths, and years of life lost in each counterfactual scenario to a baseline scenario that is fitted to epidemiological data between January 2022 and June 2023. Our results estimate that vaccines saved 6650 (95% credible interval [4424, 10180]) lives, and prevented 74500 [51000, 115400] years of life lost and 45100 [34400, 55600] hospitalisations during this 18-month period. Making the same comparison before the benefit of antiviral medications is accounted for, the estimated number of lives saved by vaccines increases to 7604 [5080, 11942]. Due to inequities in the vaccine rollout, vaccination rates among Māori were lower than in people of European ethnicity. Our results show that, if vaccination rates had been equitable, an estimated 11-26% of the 292 Māori Covid-19 deaths that were recorded in this time period could have been prevented. We conclude that Covid-19 vaccination greatly reduced health burden in New Zealand and that equity needs to be a key focus of future vaccination programmes.
Submitted 25 January, 2024; v1 submitted 17 January, 2024;
originally announced January 2024.
-
Implementing measurement error models with mechanistic mathematical models in a likelihood-based framework for estimation, identifiability analysis, and prediction in the life sciences
Authors:
Ryan J. Murphy,
Oliver J. Maclaren,
Matthew J. Simpson
Abstract:
Throughout the life sciences we routinely seek to interpret measurements and observations using parameterised mechanistic mathematical models. A fundamental and often overlooked choice in this approach involves relating the solution of a mathematical model with noisy and incomplete measurement data. This is often achieved by assuming that the data are noisy measurements of the solution of a deterministic mathematical model, and that measurement errors are additive and normally distributed. While this assumption of additive Gaussian noise is extremely common and simple to implement and interpret, it is often unjustified and can lead to poor parameter estimates and non-physical predictions. One way to overcome this challenge is to implement a different measurement error model. In this review, we demonstrate how to implement a range of measurement error models in a likelihood-based framework for estimation, identifiability analysis, and prediction, called Profile-Wise Analysis. This frequentist approach to uncertainty quantification for mechanistic models leverages the profile likelihood for targeting parameters and understanding their influence on predictions. Case studies, motivated by simple caricature models routinely used in systems biology and mathematical biology literature, illustrate how the same ideas apply to different types of mathematical models. Open-source Julia code to reproduce results is available on GitHub.
Submitted 8 November, 2023; v1 submitted 4 July, 2023;
originally announced July 2023.
-
Generalised likelihood profiles for models with intractable likelihoods
Authors:
David J. Warne,
Oliver J. Maclaren,
Elliot J. Carr,
Matthew J. Simpson,
Christopher Drovandi
Abstract:
Likelihood profiling is an efficient and powerful frequentist approach for parameter estimation, uncertainty quantification and practical identifiability analysis. Unfortunately, these methods cannot be easily applied for stochastic models without a tractable likelihood function. Such models are typical in many fields of science, rendering these classical approaches impractical in these settings. To address this limitation, we develop a new approach to generalising the methods of likelihood profiling for situations when the likelihood cannot be evaluated but stochastic simulations of the assumed data generating process are possible. Our approach is based upon recasting developments from generalised Bayesian inference into a frequentist setting. We derive a method for constructing generalised likelihood profiles and calibrating these profiles to achieve desired frequentist coverage for a given coverage level. We demonstrate the performance of our method on realistic examples from the literature and highlight the capability of our approach for the purpose of practical identifiability analysis for models with intractable likelihoods.
Submitted 19 May, 2023; v1 submitted 18 May, 2023;
originally announced May 2023.
-
Likelihood-based estimation and prediction for a measles outbreak in Samoa
Authors:
David Wu,
Helen Petousis-Harris,
Janine Paynter,
Vinod Suresh,
Oliver J. Maclaren
Abstract:
Prediction of the progression of an infectious disease outbreak is important for planning and coordinating a response. Differential equations are often used to model an epidemic outbreak's behaviour but are challenging to parameterise. Furthermore, these models can suffer from misspecification, which biases predictions and parameter estimates. Stochastic models can help with misspecification but are even more expensive to simulate and perform inference with. Here, we develop an explicitly likelihood-based variation of the generalised profiling method as a tool for prediction and inference under model misspecification. Our approach allows us to carry out identifiability analysis and uncertainty quantification using profile likelihood-based methods without the need for marginalisation. We provide justification for this approach by introducing a new interpretation of the model approximation component as a stochastic constraint. This preserves the rationale for using profiling rather than integration to remove nuisance parameters while also providing a link back to stochastic models. We applied an initial version of this method during an outbreak of measles in Samoa in 2019-2020 and found that it achieved relatively fast, accurate predictions. Here we present the most recent version of our method and its application to this measles outbreak, along with additional validation.
Submitted 15 March, 2022; v1 submitted 30 March, 2021;
originally announced March 2021.
-
Profile likelihood analysis for a stochastic model of diffusion in heterogeneous media
Authors:
Matthew J Simpson,
Alexander P Browning,
Christopher Drovandi,
Elliot J Carr,
Oliver J Maclaren,
Ruth E Baker
Abstract:
We compute profile likelihoods for a stochastic model of diffusive transport motivated by experimental observations of heat conduction in layered skin tissues. This process is modelled as a random walk in a layered one-dimensional material, where each layer has a distinct particle hopping rate. Particles are released at some location, and the duration of time taken for each particle to reach an absorbing boundary is recorded. To explore whether these data can be used to identify the hopping rates in each layer, we compute various profile likelihoods using two methods: first, an exact likelihood is evaluated using a relatively expensive Markov chain approach; and second, we form an approximate likelihood by assuming the distribution of exit times is given by a Gamma distribution whose first two moments match the expected moments from the continuum limit description of the stochastic model. Using the exact and approximate likelihoods we construct various profile likelihoods for a range of problems. In cases where parameter values are not identifiable, we make progress by re-interpreting those data with a reduced model with a smaller number of layers.
Submitted 9 March, 2021; v1 submitted 6 November, 2020;
originally announced November 2020.
-
What can be estimated? Identifiability, estimability, causal inference and ill-posed inverse problems
Authors:
Oliver J. Maclaren,
Ruanui Nicholson
Abstract:
We consider basic conceptual questions concerning the relationship between statistical estimation and causal inference. Firstly, we show how to translate causal inference problems into an abstract statistical formalism without requiring any structure beyond an arbitrarily-indexed family of probability models. The formalism is simple but can incorporate a variety of causal modelling frameworks, including 'structural causal models', but also models expressed in terms of, e.g., differential equations. We focus primarily on the structural/graphical causal modelling literature, however. Secondly, we consider the extent to which causal and statistical concerns can be cleanly separated, examining the fundamental question: 'What can be estimated from data?'. We call this the problem of estimability. We approach this by analysing a standard formal definition of 'can be estimated' commonly adopted in the causal inference literature -- identifiability -- in our abstract statistical formalism. We use elementary category theory to show that identifiability implies the existence of a Fisher-consistent estimator, but also show that this estimator may be discontinuous, and thus unstable, in general. This difficulty arises because the causal inference problem is, in general, an ill-posed inverse problem. Inverse problems have three conditions which must be satisfied to be considered well-posed: existence, uniqueness, and stability of solutions. Here identifiability corresponds to the question of uniqueness; in contrast, we take estimability to mean satisfaction of all three conditions, i.e. well-posedness. Lack of stability implies that naive translation of a causally identifiable quantity into an achievable statistical estimation target may prove impossible. Our article is primarily expository and aimed at unifying ideas from multiple fields, though we provide new constructions and proofs.
Submitted 20 July, 2020; v1 submitted 4 April, 2019;
originally announced April 2019.
-
Incorporating Posterior-Informed Approximation Errors into a Hierarchical Framework to Facilitate Out-of-the-Box MCMC Sampling for Geothermal Inverse Problems and Uncertainty Quantification
Authors:
Oliver J. Maclaren,
Ruanui Nicholson,
Elvar K. Bjarkason,
John P. O'Sullivan,
Michael J. O'Sullivan
Abstract:
We consider geothermal inverse problems and uncertainty quantification from a Bayesian perspective. Our main goal is to make standard, 'out-of-the-box' Markov chain Monte Carlo (MCMC) sampling more feasible for complex simulation models by using suitable approximations. To do this, we first show how to pose both the inverse and prediction problems in a hierarchical Bayesian framework. We then show how to incorporate so-called posterior-informed model approximation error into this hierarchical framework, using a modified form of the Bayesian approximation error (BAE) approach. This enables the use of a 'coarse', approximate model in place of a finer, more expensive model, while accounting for the additional uncertainty and potential bias that this can introduce. Our method requires only simple probability modelling, a relatively small number of fine model simulations, and only modifies the target posterior -- any standard MCMC sampling algorithm can be used to sample the new posterior. These corrections can also be used in methods that are not based on MCMC sampling. We show that our approach can achieve significant computational speed-ups on two geothermal test problems. We also demonstrate the dangers of naively using coarse, approximate models in place of finer models, without accounting for the induced approximation errors. The naive approach tends to give overly confident and biased posteriors while incorporating BAE into our hierarchical framework corrects for this while maintaining computational efficiency and ease-of-use.
Submitted 19 December, 2019; v1 submitted 9 October, 2018;
originally announced October 2018.
-
Is profile likelihood a true likelihood? An argument in favor
Authors:
Oliver J. Maclaren
Abstract:
Profile likelihood is the key tool for dealing with nuisance parameters in likelihood theory. It is often asserted, however, that profile likelihood is not a 'true' likelihood. One implication is that likelihood theory lacks the generality of e.g. Bayesian inference, wherein marginalization is the universal tool for dealing with nuisance parameters. Here we argue that profile likelihood has as much claim to being a true likelihood as a marginal probability has to being a true probability distribution. The crucial point we argue is that a likelihood function is naturally interpreted as a maxitive possibility measure: given this, the associated theory of integration with respect to maxitive measures delivers profile likelihood as the direct analogue of marginal probability in additive measure theory. Thus, given a background likelihood function, we argue that profiling over the likelihood function is as natural (or as unnatural, as the case may be) as marginalizing over a background probability measure. The connections to Bayesian inference can also be further clarified with the introduction of a suitable logarithmic distance function, in which case the present theory can be naturally described as 'Tropical Bayes' in the sense of tropical algebra.
Submitted 5 July, 2018; v1 submitted 12 January, 2018;
originally announced January 2018.
-
Randomized Truncated SVD Levenberg-Marquardt Approach to Geothermal Natural State and History Matching
Authors:
Elvar K. Bjarkason,
Oliver J. Maclaren,
John P. O'Sullivan,
Michael J. O'Sullivan
Abstract:
The Levenberg-Marquardt (LM) method is commonly used for inverting models used to describe geothermal, groundwater, or oil and gas reservoirs. In previous studies LM parameter updates have been made tractable for highly parameterized inverse problems with large data sets by applying matrix factorization methods or iterative linear solvers to approximately solve the update equations.
Some studies have shown that basing model updates on the truncated singular value decomposition (TSVD) of a dimensionless sensitivity matrix achieved using Lanczos iteration can speed up the inversion of reservoir models. Lanczos iterations only require the sensitivity matrix times a vector and its transpose times a vector, which are found efficiently using adjoint and direct simulations without the expense of forming a large sensitivity matrix.
Nevertheless, Lanczos iteration has the drawback of being a serial process, requiring a separate adjoint solve and direct solve every Lanczos iteration. Randomized methods, developed for low-rank matrix approximation of large matrices, are more efficient alternatives to the standard Lanczos method. Here we develop LM variants which use randomized methods to find a TSVD of a dimensionless sensitivity matrix when updating parameters. The randomized approach offers improved efficiency by enabling simultaneous solution of all adjoint and direct problems for a parameter update.
Submitted 3 October, 2017;
originally announced October 2017.
-
Models, measurement and inference in epithelial tissue dynamics
Authors:
O. J. Maclaren,
A. G. Fletcher,
H. M. Byrne,
P. K. Maini
Abstract:
The majority of solid tumours arise in epithelia and therefore much research effort has gone into investigating the growth, renewal and regulation of these tissues. Here we review different mathematical and computational approaches that have been used to model epithelia. We compare different models and describe future challenges that need to be overcome in order to fully exploit new data which present, for the first time, the real possibility for detailed model validation and comparison.
Submitted 16 June, 2015;
originally announced June 2015.