-
Spatial Latent Gaussian Modelling with Change of Support
Authors:
Erick A. Chacón-Montalván,
Peter M. Atkinson,
Christopher Nemeth,
Benjamin M. Taylor,
Paula Moraga
Abstract:
Spatial data are often derived from multiple sources (e.g. satellites, in-situ sensors, survey samples) with different supports, but associated with the same properties of a spatial phenomenon of interest. It is common for predictors to also be measured on different spatial supports than the response variables. Although there is no standard way to work with spatial data with different supports, a prevalent approach among practitioners has been to use downscaling or interpolation to project all the variables of analysis onto a common support and then to fit standard spatial models. The main disadvantage of this approach is that simple interpolation can introduce biases and, more importantly, the uncertainty associated with the change of support is not taken into account in parameter estimation. In this article, we propose a Bayesian spatial latent Gaussian model that can handle data with different rectilinear supports in both the response variable and predictors. Our approach handles changes of support more naturally, according to the properties of the spatial stochastic process being used, and takes into account the uncertainty from the change of support in parameter estimation and prediction. We represent spatial stochastic processes as linear combinations of basis functions where Gaussian Markov random fields define the weights. Our hierarchical modelling approach can be described by the following steps: (i) define a latent model where response variables and predictors are considered as latent stochastic processes with continuous support, (ii) link the continuous-index set stochastic processes with their projections onto the support of the observed data, (iii) link the projected processes with the observed data. We show the applicability of our approach through simulation studies and by modelling land suitability for improved grassland in Rhondda Cynon Taf, a county borough in Wales.
Submitted 13 March, 2024;
originally announced March 2024.
-
Malaria Risk Mapping Using Routine Health System Incidence Data in Zambia
Authors:
Benjamin M. Taylor,
Ricardo Andrade-Pacheco,
Hugh Sturrock,
Busiku Hamainza,
Kafula Silumbe,
John Miller,
Thomas P. Eisele,
Francois Rerolle,
Hannah Slater,
Adam Bennett
Abstract:
Improvements to Zambia's malaria surveillance system allow better monitoring of incidence and targeting of responses at refined spatial scales. As transmission decreases, understanding heterogeneity in risk at fine spatial scales becomes increasingly important. However, there are challenges in using health system data for high-resolution risk mapping: health facilities have undefined and overlapping catchment areas, and report on an inconsistent basis. We propose a novel inferential framework for risk mapping of malaria incidence data based on formal down-scaling of confirmed case data reported through the health system in Zambia. We combine data from large community intervention trials in 2011-2016 and model health facility catchments based upon treatment-seeking behaviours; our model for monthly incidence is an aggregated log-Gaussian Cox process, which allows us to predict incidence at fine scale. We predicted monthly malaria incidence at 5 km$^2$ resolution nationally: whereas 4.8 million malaria cases were reported through the health system in 2016, we estimated that the number of cases occurring at the community level was closer to 10 million. As Zambia continues to scale up community-based reporting of malaria incidence, these outputs provide realistic estimates of community-level malaria burden as well as high-resolution risk maps for targeting interventions at the sub-catchment level.
Submitted 28 June, 2021;
originally announced June 2021.
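The aggregated log-Gaussian Cox process named in the abstract above can be illustrated with a small simulation. The following is a minimal sketch, not the authors' model: it assumes a one-dimensional grid of unit-area cells, an exponential covariance function, and fixed, non-overlapping regions of equal size, and every function name and parameter value is hypothetical. Conditional on the latent Gaussian field, cell counts are Poisson, and the observed regional counts are sums of cell counts, so their conditional means are sums of the cell intensities.

```python
import math
import random


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def poisson(rng, mean):
    """Draw a Poisson variate with Knuth's multiplication method (small means only)."""
    limit = math.exp(-mean)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def simulate_aggregated_lgcp(n_cells=20, region_size=5, mu=1.0,
                             sigma=0.5, phi=3.0, seed=1):
    """Simulate cell-level LGCP counts and aggregate them to regions.

    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Exponential covariance between cell centres (assumed form).
    cov = [[sigma ** 2 * math.exp(-abs(i - j) / phi) for j in range(n_cells)]
           for i in range(n_cells)]
    low = cholesky(cov)
    z = [rng.gauss(0.0, 1.0) for _ in range(n_cells)]
    # Latent Gaussian field: field = L z has covariance L L^T = cov.
    field = [sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(n_cells)]
    # Log link: intensity per unit-area cell.
    intensity = [math.exp(mu + y) for y in field]
    cell_counts = [poisson(rng, lam) for lam in intensity]
    starts = range(0, n_cells, region_size)
    # Only the regional totals would be observed in an aggregated data set.
    region_counts = [sum(cell_counts[s:s + region_size]) for s in starts]
    region_means = [sum(intensity[s:s + region_size]) for s in starts]
    return intensity, cell_counts, region_counts, region_means


intensity, cell_counts, region_counts, region_means = simulate_aggregated_lgcp()
print("regional counts:", region_counts)
```

Fine-scale prediction then amounts to inferring the cell-level intensity from the regional counts alone, which is the inverse of this simulation. The sketch does not attempt the overlapping, uncertain catchments described in the abstract.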
-
A Multi-Way Correlation Coefficient
Authors:
Benjamin M. Taylor
Abstract:
Pearson's correlation is an important summary measure of the amount of dependence between two variables. It is natural to want to generalise the concept of correlation as a single number that measures the inter-relatedness of three or more variables, e.g. how 'correlated' are a collection of variables in which none are specifically to be treated as an 'outcome'? In this short article, we introduce such a measure, and show that it reduces to the modulus of Pearson's $r$ in the two-dimensional case.
Submitted 5 March, 2020;
originally announced March 2020.
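The abstract above does not state how the multi-way measure is defined, so the following sketch uses an assumed construction that has the stated two-variable property: the root mean square of the pairwise Pearson correlations. Because the eigenvalues of a $d \times d$ correlation matrix have mean 1 and satisfy $\sum_i (\lambda_i - 1)^2 = \sum_{i \neq j} r_{ij}^2$, this quantity equals the sample standard deviation of the eigenvalues divided by $\sqrt{d}$. The function names are hypothetical.

```python
import math


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def multiway_correlation(columns):
    """Root mean square of all pairwise correlations (assumed definition).

    `columns` is a list of d >= 2 equal-length sequences, one per variable.
    """
    d = len(columns)
    if d < 2:
        raise ValueError("need at least two variables")
    total = 0.0
    for i in range(d):
        for j in range(i + 1, d):
            total += pearson(columns[i], columns[j]) ** 2
    n_pairs = d * (d - 1) / 2  # number of unordered pairs of variables
    return math.sqrt(total / n_pairs)


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]
z = [5.0, 3.0, 4.0, 1.0, 2.0]
print(multiway_correlation([x, y]), abs(pearson(x, y)))  # equal for d = 2
print(multiway_correlation([x, y, z]))
```

With two variables there is a single pair, so the measure is $\sqrt{r^2} = |r|$, matching the reduction described in the abstract. The value always lies in $[0, 1]$, with 0 for mutually uncorrelated variables and 1 when every pair is perfectly correlated.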
-
A Model-Based General Alternative to the Standardised Precipitation Index
Authors:
Erick A. Chacón-Montalván,
Luke Parry,
Gemma Davies,
Benjamin M. Taylor
Abstract:
In this paper, we introduce two new model-based versions of the widely-used standardised precipitation index (SPI) for detecting and quantifying the magnitude of extreme hydro-climatic events. Our analytical approach is based on generalized additive models for location, scale and shape (GAMLSS), which helps us to overcome some limitations of the SPI. We compare our model-based standardised indices (MBSIs) with the SPI using precipitation data collected between January 2004 and December 2013 (522 weeks) in Caapiranga, a road-less municipality of Amazonas State. The results show that the MBSI-1 is an index with similar properties to the SPI, but with improved methodology. In comparison to the SPI, our MBSI-1 index allows for the use of different zero-augmented distributions, works with more flexible time-scales, can be applied to shorter records of data and also takes into account temporal dependencies in known seasonal behaviours. Our approach is implemented in an R package, mbsi, available from GitHub.
Submitted 18 June, 2019;
originally announced June 2019.
-
Spatial Item Factor Analysis With Application to Mapping Food Insecurity
Authors:
Erick Chacon,
Luke Parry,
Emanuele Giorgi,
Patricia Torres,
Jesem Orellana,
Benjamin M. Taylor
Abstract:
Item factor analysis is widely used for studying the relationship between a latent construct and a set of observed variables. One of the main assumptions of this method is that the latent construct or factor is independent between subjects, which might not be adequate in certain contexts. In the study of food insecurity, for example, this is likely not true due to a close relationship with socio-economic characteristics, which are spatially structured. In order to capture these effects, we propose an extension of item factor analysis to the spatial domain that is able to predict the latent factors at unobserved spatial locations. We develop a Bayesian sampling scheme for providing inference and illustrate the explanatory strength of our model by application to a study of the latent construct 'food insecurity' in a remote urban centre in the Brazilian Amazon. We use our method to map the dimensions of food insecurity in this area and identify the most severely affected areas. Our methods are implemented in an R package, spifa, available from GitHub.
Submitted 11 September, 2018;
originally announced September 2018.
-
Continuous Inference for Aggregated Point Process Data
Authors:
Benjamin M. Taylor,
Ricardo Andrade-Pacheco,
Hugh J. W. Sturrock
Abstract:
This article introduces new methods for inference with count data registered on a set of aggregation units. Such data are omnipresent in epidemiology due to confidentiality issues: it is much more common to know the county in which an individual resides, say, than to know their exact location in space. Inference for aggregated data has traditionally made use of models for discrete spatial variation, for example conditional autoregressive (CAR) models. We argue that such discrete models can be improved from both a scientific and inferential perspective by using spatiotemporally continuous models to directly model the aggregated counts. We introduce methods for delivering (limiting) continuous inference with spatiotemporal aggregated count data in which the aggregation units might change over time and are subject to uncertainty. We illustrate our methods using two examples: from epidemiology, spatial prediction of malaria incidence in Namibia; and from politics, forecasting voting under the proposed changes to parliamentary boundaries in the United Kingdom.
Submitted 19 April, 2017;
originally announced April 2017.
-
Spatial Modelling of Emergency Service Response Times
Authors:
Benjamin M. Taylor
Abstract:
This article concerns the statistical modelling of emergency service response times. We apply advanced methods from spatial survival analysis to deliver inference for data collected by the London Fire Brigade on response times to reported dwelling fires. Existing approaches to the analysis of these data have been mainly descriptive; we describe and demonstrate the advantages of a more sophisticated approach. Our final parametric proportional hazards model includes harmonic regression terms to describe how response time varies with time-of-day and shared spatially correlated frailties on an auxiliary grid for computational efficiency.
We investigate the short-term impact of fire station closures in 2014. Whilst the London Fire Brigade are working hard to keep response times down, our findings suggest there is a limit to what can be achieved logistically: the present article identifies areas around the now closed Belsize, Downham, Kingsland, Knightsbridge, Silvertown, Southwark, Westminster and Woolwich fire stations in which there should perhaps be some concern as to the provision of fire services.
Submitted 26 March, 2015;
originally announced March 2015.
-
Auxiliary Variable Markov Chain Monte Carlo for Spatial Survival and Geostatistical Models
Authors:
Benjamin M. Taylor
Abstract:
This article was motivated by the desire to improve Markov chain Monte Carlo methods for spatial survival models in which the locations of individuals in space are known. For a dataset comprising information on $n$ individuals, standard methods of MCMC-based inference involve computing the inverse of an $n$ by $n$ matrix at each iteration. However, with a judicious choice of auxiliary variables on a regular grid with $m$ prediction points, it will be shown how to fit an essentially equivalent model but with a substantially reduced computational cost. For a fixed output grid, the computational cost of the new method is reduced from $O(n^3)$ to $O(n)$; the cost of increasing the output grid size being $O(m \log m)$. Furthermore, the new method simultaneously solves the problem of spatial prediction of functions of the latent field, which for standard methods usually presents a further computational challenge. We apply the new method to a spatial survival dataset previously analysed in Henderson et al. (2002) and show how the new method can be applied to spatial and spatiotemporal geostatistical datasets with the same computational benefits.
Submitted 7 January, 2015;
originally announced January 2015.
-
Spatial and Spatio-Temporal Log-Gaussian Cox Processes: Extending the Geostatistical Paradigm
Authors:
Peter J. Diggle,
Paula Moraga,
Barry Rowlingson,
Benjamin M. Taylor
Abstract:
In this paper we first describe the class of log-Gaussian Cox processes (LGCPs) as models for spatial and spatio-temporal point process data. We discuss inference, with a particular focus on the computational challenges of likelihood-based inference. We then demonstrate the usefulness of the LGCP by describing four applications: estimating the intensity surface of a spatial point process; investigating spatial segregation in a multi-type process; constructing spatially continuous maps of disease risk from spatially discrete data; and real-time health surveillance. We argue that problems of this kind fit naturally into the realm of geostatistics, which traditionally is defined as the study of spatially continuous processes using spatially discrete observations at a finite number of locations. We suggest that a more useful definition of geostatistics is by the class of scientific problems that it addresses, rather than by particular models or data formats.
Submitted 23 December, 2013;
originally announced December 2013.
-
INLA or MCMC? A Tutorial and Comparative Evaluation for Spatial Prediction in log-Gaussian Cox Processes
Authors:
Benjamin M. Taylor,
Peter J. Diggle
Abstract:
We investigate two options for performing Bayesian inference on spatial log-Gaussian Cox processes assuming a spatially continuous latent field: Markov chain Monte Carlo (MCMC) and the integrated nested Laplace approximation (INLA). We first describe the device of approximating a spatially continuous Gaussian field by a Gaussian Markov random field on a discrete lattice, and present a simulation study showing that, with careful choice of parameter values, small neighbourhood sizes can give excellent approximations. We then introduce the spatial log-Gaussian Cox process and describe MCMC and INLA methods for spatial prediction within this model class. We report the results of a simulation study in which we compare MALA and the technique of approximating the continuous latent field by a discrete one, followed by approximate Bayesian inference via INLA over a selection of 18 simulated scenarios. The results question the notion that the latter technique is both significantly faster and more robust than MCMC in this setting; 100,000 iterations of the MALA algorithm running in 20 minutes on a desktop PC delivered greater predictive accuracy than the default INLA strategy, which ran in 4 minutes and gave comparable performance to the full Laplace approximation, which ran in 39 minutes.
Submitted 19 March, 2012; v1 submitted 8 February, 2012;
originally announced February 2012.
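MALA in the abstract above refers to the Metropolis-adjusted Langevin algorithm, which proposes moves along the gradient of the log target density and corrects them with a Metropolis-Hastings accept/reject step. The sketch below is a toy illustration only: it assumes a one-dimensional standard normal target rather than the latent field of a log-Gaussian Cox process, and the step size `h`, the seed, and all function names are hypothetical choices.

```python
import math
import random


def log_target(x):
    """Log density of the toy target, N(0, 1), up to an additive constant."""
    return -0.5 * x * x


def grad_log_target(x):
    """Gradient of the log target density."""
    return -x


def log_proposal(to, frm, h):
    """Log density (up to a constant) of the Langevin proposal frm -> to."""
    mean = frm + 0.5 * h * h * grad_log_target(frm)
    return -0.5 * ((to - mean) / h) ** 2


def mala(n_iter, h, x0=0.0, seed=0):
    """Run MALA and return the chain together with its acceptance rate."""
    rng = random.Random(seed)
    x = x0
    samples = []
    accepted = 0
    for _ in range(n_iter):
        # Langevin proposal: drift along the gradient plus Gaussian noise.
        y = x + 0.5 * h * h * grad_log_target(x) + h * rng.gauss(0.0, 1.0)
        # Metropolis-Hastings log ratio, including the asymmetric proposal.
        log_alpha = (log_target(y) - log_target(x)
                     + log_proposal(x, y, h) - log_proposal(y, x, h))
        if rng.random() < math.exp(min(0.0, log_alpha)):
            x = y
            accepted += 1
        samples.append(x)
    return samples, accepted / n_iter


samples, rate = mala(n_iter=5000, h=1.0)
mean = sum(samples) / len(samples)
print("acceptance rate:", rate, "sample mean:", mean)
```

In a spatial setting the state would be the vector of latent field values on the lattice, and the gradient would come from the Poisson likelihood and the Gaussian prior; the accept/reject logic stays the same.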
-
lgcp: An R Package for Inference with Spatio-Temporal Log-Gaussian Cox Processes
Authors:
Benjamin M. Taylor,
Tilman M. Davies,
Barry S. Rowlingson,
Peter J. Diggle
Abstract:
This paper introduces an R package for spatio-temporal prediction and forecasting for log-Gaussian Cox processes. The main computational tool for these models is Markov chain Monte Carlo and the new package, lgcp, therefore also provides an extensible suite of functions for implementing MCMC algorithms for processes of this type. The modelling framework and details of inferential procedures are first presented before a tour of lgcp functionality is given via a walk-through data-analysis. Topics covered include reading in and converting data, estimation of the key components and parameters of the model, specifying output and simulation quantities, computation of Monte Carlo expectations, post-processing and simulation of data sets.
Submitted 27 October, 2011;
originally announced October 2011.
-
On Estimating the Ability of NBA Players
Authors:
Paul Fearnhead,
Benjamin M. Taylor
Abstract:
This paper introduces a new model and methodology for estimating the ability of NBA players. The main idea is to directly measure how good a player is by comparing how their team performs when they are on the court as opposed to when they are off it. This is achieved in such a way as to control for the changing abilities of the other players on court at different times during a match. The new method uses multiple seasons' data in a structured way to estimate player ability in an isolated season, measuring separately defensive and offensive merit as well as combining these to give an overall rating. The use of game statistics in predicting player ability will be considered. Results using data from the 2008/9 season suggest that LeBron James, who won the NBA MVP award, was the best overall player. The best defensive player was Lamar Odom and the best rookie was Russell Westbrook, neither of whom won an NBA award that season. The results further indicate that whilst the frequently-reported game statistics provide some information on offensive ability, they do not perform well in the prediction of defensive ability.
Submitted 4 August, 2010;
originally announced August 2010.
-
An Adaptive Sequential Monte Carlo Sampler
Authors:
Paul Fearnhead,
Benjamin M. Taylor
Abstract:
Sequential Monte Carlo (SMC) methods are not only a popular tool in the analysis of state space models, but offer an alternative to MCMC in situations where Bayesian inference must proceed via simulation. This paper introduces a new SMC method that uses adaptive MCMC kernels for particle dynamics. The proposed algorithm features an online stochastic optimization procedure to select the best MCMC kernel and simultaneously learn optimal tuning parameters. Theoretical results are presented that justify the approach and give guidance on how it should be implemented. Empirical results, based on analysing data from mixture models, show that the new adaptive SMC algorithm (ASMC) can both choose the best MCMC kernel, and learn an appropriate scaling for it. ASMC with a choice between kernels outperformed the adaptive MCMC algorithm of Haario et al. (1998) in 5 out of the 6 cases considered.
Submitted 10 May, 2010; v1 submitted 7 May, 2010;
originally announced May 2010.