Skip to main content
Log in

Vector Autoregressive Models with Spatially Structured Coefficients for Time Series on a Spatial Grid

  • Published:
Journal of Agricultural, Biological and Environmental Statistics Aims and scope Submit manuscript

Abstract

Motivated by the need to analyze readily available data collected in space and time, especially in environmental sciences, we propose a parsimonious spatiotemporal model for time series data on a spatial grid. In essence, our model is a vector autoregressive model that utilizes the spatial structure to achieve parsimony of autoregressive matrices at two levels. The first level ensures the sparsity of the autoregressive matrices using a lagged-neighborhood scheme. The second level performs a spatial clustering of the nonzero autoregressive coefficients such that within some subregions, nearby locations share the same autoregressive coefficients while across different subregions the coefficients may have distinct values. The model parameters are estimated using the penalized maximum likelihood with an adaptive fused Lasso penalty. The estimation procedure can be tailored to accommodate the need and prior knowledge of a modeler. Performance of the proposed estimation algorithm is examined in a simulation study. Our method gives reliable estimation results that are interpretable and especially useful to identify geographical subregions, within each of which, the time series have similar dynamical behavior with homogeneous autoregressive coefficients. We apply our model to a wind speed time series dataset generated from a climate model over Saudi Arabia to illustrate its power in explaining the dynamics by the spatially structured coefficients. Moreover, the estimated model can be useful for building stochastic weather generators as an approximation of the computationally expensive climate model.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6

Similar content being viewed by others

References

  • Ailliot P, Monbet V, Prevosto M (2006) An autoregressive model with time-varying coefficients for wind fields. Environmetrics 17(2):107–117

    Article  MathSciNet  Google Scholar 

  • Arnold TB, Tibshirani RJ (2020) genlasso: path algorithm for generalized Lasso problems. R Package Vers 1:5

    Google Scholar 

  • Bańbura M, Giannone D, Reichlin L (2010) Large Bayesian vector auto regressions. J Appl Econ 25(1):71–92

    Article  MathSciNet  Google Scholar 

  • Basu S, Michailidis G (2015) Regularized estimation in sparse high-dimensional time series models. Ann Stat 43(4):1535–1567

    Article  MathSciNet  Google Scholar 

  • Bergmeir C, Benítez JM (2012) On the use of cross-validation for time series predictor evaluation. Inf Sci 191:192–213

    Article  Google Scholar 

  • Bessac J, Ailliot P, Monbet V (2015) Gaussian linear state-space model for wind fields in the North-East Atlantic. Environmetrics 26(1):29–38

    Article  MathSciNet  Google Scholar 

  • Cressie N, Wikle CK (2011) Statistics for spatio-temporal data. Wiley, Hoboken

    MATH  Google Scholar 

  • de Luna X, Genton MG (2005) Predictive spatio-temporal models for spatially sparse environmental data. Statistica Sinica 15:547–568

    MathSciNet  MATH  Google Scholar 

  • Fan J, Zhang W (1999) Statistical estimation in varying coefficient models. Ann Stat 27(5):1491–1518

    Article  MathSciNet  Google Scholar 

  • Gelfand AE, Kim H-J, Sirmans CF, Banerjee S (2003) Spatial modeling with spatially varying coefficient processes. J Am Stat Assoc 98(462):387–396

    Article  MathSciNet  Google Scholar 

  • Genton M, Johnson C, Potter K, Stenchikov G, Sun Y (2014) Surface boxplots. Stat 3(1):1–11

    Article  MathSciNet  Google Scholar 

  • Hsu N-J, Hung H-L, Chang Y-M (2008) Subset selection for vector autoregressive processes using Lasso. Comput Stat Data Anal 52(7):3645–3657

    Article  MathSciNet  Google Scholar 

  • Huang H-C, Hsu N-J, Theobald DM, Breidt FJ (2010) Spatial Lasso with applications to GIS model selection. J Comput Gr Stat 19(4):963–983

    Article  MathSciNet  Google Scholar 

  • Kastner G, Huber F (2020) Sparse Bayesian vector autoregressions in huge dimensions. J Forecast 39:1142–1165

    Article  MathSciNet  Google Scholar 

  • Katzfuss M, Cressie N (2012) Bayesian hierarchical spatio-temporal smoothing for very large datasets. Environmetrics 23(1):94–107

    Article  MathSciNet  Google Scholar 

  • Kay JE, Deser C, Phillips A, Mai A, Hannay C, Strand G, Arblaster JM, Bates SC, Danabasoglu G, Edwards J, Holland M, Kushner P, Lamarque J-F, Lawrence D, Lindsay K, Middleton A, Munoz E, Neale R, Oleson K, Polvani L, Vertenstein M (2015) The community earth system model (CESM) large ensemble project: a community resource for studying climate change in the presence of internal climate variability. Bull Am Meteorol Soc 96(8):1333–1349

    Article  Google Scholar 

  • Ke ZT, Fan J, Wu Y (2015) Homogeneity pursuit. J Am Stat Assoc 110(509):175–194

    Article  MathSciNet  Google Scholar 

  • Korobilis D, Pettenuzzo D (2019) Adaptive hierarchical priors for high-dimensional vector autoregressions. J Econ 212(1):241–271

    Article  MathSciNet  Google Scholar 

  • Li F, Sang H (2019) Spatial homogeneity pursuit of regression coefficients for large datasets. J Am Stat Assoc 114(527):1050–1062

    Article  MathSciNet  Google Scholar 

  • Lütkepohl H (2005) New introduction to multiple time series analysis. Springer, Berlin

    Book  Google Scholar 

  • Monbet V, Ailliot P (2017) Sparse vector Markov switching autoregressive models. Application to multivariate time series of temperature. Comput Stat Data Anal 108:40–51

    Article  MathSciNet  Google Scholar 

  • Mu J, Wang G, Wang L (2018) Estimation and inference in spatially varying coefficient models. Environmetrics 29(1):e2485

    Article  MathSciNet  Google Scholar 

  • Ngueyep R, Serban N (2015) Large-vector autoregression for multilayer spatially correlated time series. Technometrics 57(2):207–216

    Article  MathSciNet  Google Scholar 

  • R Development Core Team (2020) R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna

    Google Scholar 

  • Rao SS (2008) Statistical analysis of a spatio-temporal model with location-dependent parameters and a test for spatial stationarity. J Time Ser Anal 29(4):673–694

    Article  MathSciNet  Google Scholar 

  • Schweinberger M, Babkin S, Ensor KB (2017) High-dimensional multivariate time series with additional structure. J Comput Gr Stat 26(3):610–622

    Article  MathSciNet  Google Scholar 

  • Shen X, Huang H-C (2010) Grouping pursuit through a regularization solution surface. J Am Stat Assoc 105(490):727–739

    Article  MathSciNet  Google Scholar 

  • Sun Y, Genton MG (2011) Functional boxplots. J Comput Gr Stat 20(2):316–334

    Article  MathSciNet  Google Scholar 

  • Sun Y, Wang HJ, Fuentes M (2016) Fused adaptive Lasso for spatial and temporal quantile function estimation. Technometrics 58(1):127–137

    Article  MathSciNet  Google Scholar 

  • Tagle F, Castruccio S, Crippa P, Genton MG (2019) A non-Gaussian spatio-temporal model for daily wind speeds based on a multivariate skew-\(t\) distribution. J Time Ser Anal 40:312–326

    Article  Google Scholar 

  • Tibshirani R, Saunders M, Rosset S, Zhu J, Knight K (2005) Sparsity and smoothness via the fused lasso. J Roy Stat Soc B 67(1):91–108

    Article  MathSciNet  Google Scholar 

  • Tibshirani RJ, Taylor J (2011) The solution path of the generalized lasso. Ann Stat 39(3):1335–1371

    Article  MathSciNet  Google Scholar 

  • Viallon V, Lambert-Lacroix S, Höfling H, Picard F (2013) Adaptive generalized fused-Lasso: asymptotic properties and applications. In: HAL preprint, hal-00813281

  • Wikle CK, Berliner LM, Cressie N (1998) Hierarchical Bayesian space-time models. Environ Ecol Stat 5(2):117–154

    Article  Google Scholar 

  • Wikle CK, Milliff RF, Nychka D, Berliner LM (2001) Spatiotemporal hierarchical bayesian modeling: tropical ocean surface winds. J Am Stat Assoc 96(454):382–397

    Article  MathSciNet  Google Scholar 

  • Yan Y, Genton MG (2019) Non-Gaussian autoregressive processes with Tukey \(g\)-and-\(h\) transformations. Environmetrics 30:e2503

    Article  Google Scholar 

  • Zhao Y, Bondell H (2020) Solution paths for the generalized lasso with applications to spatially varying coefficients regression. Comput Stat Data Anal 142:106821

    Article  MathSciNet  Google Scholar 

  • Zou H (2006) The adaptive lasso and its oracle properties. J Am Stat Assoc 101(476):1418–1429

    Article  MathSciNet  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Yuan Yan.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This publication is based upon work supported by the King Abdullah University of Science and Technology (KAUST) Office of Sponsored Research (OSR) under Award No: OSR-2018-CRG7-3742 and Academia Sinica Investigator Award AS-IA-109-M05.

A Appendix

A Appendix

Derivation of (5):

$$\begin{aligned} l(\varvec{\alpha })=-\frac{1}{2}\sum _{t=2}^T \varvec{Z}_{t-1}'{\mathbf {A}}'\varvec{\Psi }^{-1}{\mathbf {A}}\varvec{Z}_{t-1}+\sum _{t=2}^T \varvec{Z}'_t\varvec{\Psi }^{-1}{\mathbf {A}}\varvec{Z}_{t-1}+{\mathrm {constant}}. \end{aligned}$$

For the two terms involved in the above equation,

$$\begin{aligned} \sum _{t=2}^T \varvec{Z}'_t\varvec{\Psi }^{-1}{\mathbf {A}}\varvec{Z}_{t-1} =&~ {\mathrm {vec}}\left( \varvec{\Psi }^{-1}\sum _{t=2}^{T}\varvec{Z}_{t}\varvec{Z}_{t-1}'\right) '{\mathrm {vec}}({\mathbf {A}})={\mathrm {vec}}\left( \varvec{\Psi }^{-1}\sum _{t=2}^{T}\varvec{Z}_{t}\varvec{Z}_{t-1}'\right) '{\mathbf {P}}\varvec{\alpha }, \text {and}\\ \sum _{t=2}^T \varvec{Z}_{t-1}'{\mathbf {A}}'\varvec{\Psi }^{-1}{\mathbf {A}}\varvec{Z}_{t-1} =&~ {\mathrm {vec}}({\mathbf {A}})'\left\{ \left( \sum _{t=1}^{T-1}\varvec{Z}_{t}\varvec{Z}_{t}'\right) \otimes \varvec{\Psi }^{-1}\right\} {\mathrm {vec}}({\mathbf {A}})=\varvec{\alpha }'{\mathbf {P}}'\left\{ \left( \sum _{t=1}^{T-1}\varvec{Z}_{t}\varvec{Z}_{t}'\right) \otimes \varvec{\Psi }^{-1}\right\} {\mathbf {P}}\varvec{\alpha }. \end{aligned}$$

Then, it is easy to see that the log-likelihood is a quadratic form of \(\varvec{\alpha }\), i.e.,

$$\begin{aligned} l(\varvec{\alpha })=-\frac{1}{2}\big \Vert \varvec{y}-\varvec{X}\varvec{\alpha }\big \Vert _2^2+{\mathrm {constant}}, \end{aligned}$$

where \(\varvec{X}\) and \(\varvec{y}\) satisfy (3) and (4). \(\square \)

Asymptotic Properties:

Some notations are needed to state the asymptotic properties. We consider an undirected graph \({\mathcal {G}}=({\mathcal {V}},{\mathcal {E}})\), where \({\mathcal {V}}=\{1,\ldots ,m\}\) is a set of vertices corresponding to the \(m=K n_I + n_B\) parameters in \(\varvec{\alpha }\) to be estimated, and \({\mathcal {E}}\subset {\mathcal {V}}\times {\mathcal {V}}\) is a set of undirected edges that depends on the structure of the spatial grid. Here \({\mathcal {V}}\) can be partitioned into \(n_B+K\) disjoint components, including \(n_B\) isolated nodes (corresponding to the non-regularized coefficients in \(\varvec{\alpha }_\text {B}\)) and K identical disjoint components each with \(n_I\) vertices and the same inner grid structure (corresponding to the penalized coefficients in \(\varvec{\alpha }_k\), for \(k=1,\ldots ,K\)). With these notations, the penalty term in (6) can be rewritten as

$$\begin{aligned} \lambda \sum _{k=1}^K\sum _{{i}\sim {j}}w_{k,{i},{j}}|\alpha _k(\mathbf {s}_{i})-\alpha _k(\mathbf {s}_{j})|=\lambda \sum _{(i,j)\in {\mathcal {E}}}w_{{i},{j}}|\alpha _i-\alpha _j|, \end{aligned}$$

where \(w_{{i},{j}}=|{\tilde{\alpha }}_i-{\tilde{\alpha }}_j|^{-\gamma }\).

We let \(\varvec{\alpha }^0\) be the true parameter vector and define \(\varvec{\xi }=\varvec{y}-\varvec{X}\varvec{\alpha }^0\). We introduce \({\mathcal {A}}=\left\{ (i,j) \in {\mathcal {E}}: \alpha ^0_i=\alpha ^0_j, i,j=1,\ldots ,n_I \right\} \) and consider the subgraph \({\mathcal {G}}_{{\mathcal {A}}}=({\mathcal {V}},{\mathcal {A}})\) of \({\mathcal {G}}\). We denote by \(m_0\) the number of its connected components, that is, the number of distinct values in \(\varvec{\alpha }^0\) supported by \({\mathcal {G}}\). We further denote by \({\mathcal {V}}_1,\ldots , {\mathcal {V}}_{m_0}\) the sets of nodes of each connected components of \({\mathcal {G}}_{{\mathcal {A}}}\) and set \(l_i=\min {{\mathcal {V}}_i}\) for \(i=1,\ldots ,m_0\). We define \(\varvec{\alpha }_{{\mathcal {A}}}^0=(\alpha ^0_{l_1},\ldots ,\alpha ^0_{l_{m_0}})'\) and \(\varvec{X}_{\mathcal {A}}\) a matrix whose i-th column, for \(i=1,\ldots ,m_0\), is \(\varvec{X}_{{\mathcal {A}}_i}=\sum _{j\in {\mathcal {V}}_i}\varvec{X}_j\), where \(\varvec{X}_j\) is the j-th column of \(\varvec{X}\). We assume that \(\varvec{X}'\varvec{X}/(T-1)\rightarrow \mathbf {C}\) as \(T\rightarrow \infty \) for some positive-definite \(m\times m\) matrix \(\mathbf {C}\), which depends on \(\varvec{\alpha }^0\) and \(\varvec{\Psi }\). We denote by \(\mathbf {C}_{{\mathcal {A}}}\) the limiting \(m_0\times m_0\) matrix of \(\varvec{X}_{{\mathcal {A}}}'\varvec{X}_{{\mathcal {A}}}/(T-1)\) as \(T\rightarrow \infty \).

We derive the asymptotic properties of the adaptive Lasso estimator \(\hat{\varvec{\alpha }}\) from solving (7). We let \({\mathcal {A}}_n=\left\{ (i,j) \in {\mathcal {E}}: {\hat{\alpha }}_i={\hat{\alpha }}_j,i,j=1,\ldots ,n_I \right\} \) and \(\hat{\varvec{\alpha }}_{{\mathcal {A}}}=({\hat{\alpha }}_{l_1},\ldots ,{\hat{\alpha }}_{l_{m_0}})'\).

Theorem 1

Suppose that \(\lambda /\sqrt{T}\rightarrow 0,\, \lambda T^{(\gamma -1)/2}\rightarrow \infty \), and \(\mathbf {C}=\lim _{T\rightarrow \infty }\) \(\varvec{X}'\varvec{X}/(T-1)\) is positive-definite. Then as \(T\rightarrow \infty \), the following are satisfied by the adaptive Lasso estimator \(\hat{\varvec{\alpha }}\):

  1. 1.

      Consistency in grouping: \(\Pr ({\mathcal {A}}_n={\mathcal {A}})\rightarrow 1\).

  2. 2.

      Asymptotic normality: \(\sqrt{T}\big (\hat{\varvec{\alpha }}_{{\mathcal {A}}}-\varvec{\alpha }^0_{{\mathcal {A}}}\big )\xrightarrow {d}{\mathcal {N}}_{m_0}(\varvec{0},\mathbf {C}_{{\mathcal {A}}}^{-1})\).

Proof

The following proof is obtained by adapting the Proof of Theorem 2 in Zou (2006) and Theorem 3 in Viallon et al. (2013).

First, it is known that \(\hat{\varvec{\Psi }}\) in (8) is a consistent estimator of \(\varvec{\Psi }\), as T goes to infinity; see Chapter 5 of Lütkepohl (2005). We define \(V_T(\mathbf {u})=F(\varvec{\alpha }^0)-F(\varvec{\alpha }^0+\mathbf {u}/\sqrt{T})\), with F defined in (6). It is obvious that \(V_T(\mathbf {u})\) is minimized at \(\sqrt{T}(\hat{\varvec{\alpha }}-\varvec{\alpha }^0)\) and

$$\begin{aligned} V_T(\mathbf {u})=&\,\mathbf {u}'\left( \frac{1}{2T}\varvec{X}'\varvec{X}\right) \mathbf {u}-\frac{\varvec{\xi }'\varvec{X}}{\sqrt{T}}\mathbf {u}+\frac{\lambda }{\sqrt{T}}\sum _{(i,j)\in {\mathcal {A}}} w_{{i},{j}}\ \sqrt{T}\left( \left| \alpha _i^0-\alpha _j^0+\frac{u_i-u_j}{\sqrt{T}}\right| -|\alpha _i^0-\alpha _j^0|\right) . \end{aligned}$$

We have

$$\begin{aligned} \lambda \ w_{{i},{j}}\left| \alpha _i^0-\alpha _j^0+\frac{u_i-u_j}{\sqrt{T}}\right| -|\alpha _i^0-\alpha _j^0|\xrightarrow {p}{\left\{ \begin{array}{ll}0, &{}\text {if }\alpha _i^0\ne \alpha _j^0 \text { or } (\alpha _i^0=\alpha _j^0 \text { and } u_i=u_j), \\ \infty , &{}\text {otherwise}.\end{array}\right. } \end{aligned}$$

We denote \(\varvec{\mathbf {u}}_{{\mathcal {A}}}=(\mathbf {u}_{l_1},\ldots ,\mathbf {u}_{l_{m_0}})'\) and then, with an application of the Martingale difference central limit theory to \(\varvec{\xi }'\varvec{X}\), we obtain

$$\begin{aligned} V_T(\mathbf {u})\xrightarrow {d}V(\mathbf {u})={\left\{ \begin{array}{ll}\frac{1}{2}\mathbf {u}_{\mathcal {A}}'\mathbf {C}_{{\mathcal {A}}} \mathbf {u}_{\mathcal {A}}-\mathbf {u}_{\mathcal {A}}'\mathbf {W}_{\mathcal {A}}, &{}\text {if } u_i=u_j \text { for } (i,j)\in {\mathcal {A}},\\ \infty , &{}\text {otherwise},\end{array}\right. } \end{aligned}$$

for \(\mathbf {u}\in {\mathbb {R}}^m\), where \({\mathbf{W }}_{\mathcal {A}}\sim {\mathcal {N}}_m(\varvec{0},{\mathbf{C }}_{{\mathcal {A}}})\); V is convex and has a unique minimum satisfying \(u_i=u_j\) for all \((i,j)\in {\mathcal {A}}\) and \(\mathbf {u}_{\mathcal {A}}=\mathbf {C}_{{\mathcal {A}}}^{-1}\mathbf {W}_{\mathcal {A}}\). The asymptotic normality part can be derived as in Zou (2006) by using the epi-convergence results.

For consistency in grouping, we need to show that for all \((i,j)\not \in {\mathcal {A}}, \Pr ((i,j)\in {\mathcal {A}}_n^c)\rightarrow 1\) and for all \((i,j) \in {\mathcal {A}}, \Pr ((i,j)\in {\mathcal {A}}_n^c)\rightarrow 0\). The first part is implied by the asymptotic normality result. To prove the second part, we apply the subgradient equations for the optimality condition, for \(i=1,\ldots ,m\):

$$\begin{aligned} \varvec{X}_i'(\varvec{y}-\varvec{X}\hat{\varvec{\alpha }})=\lambda \sum _{i:(i,j)\in {{\mathcal {E}}}}w_{{i},{j}}t_{ij}, \end{aligned}$$

where \(t_{ij}={{\,\mathrm{sign}\,}}({\hat{\alpha }}_i-{\hat{\alpha }}_j)\) if \({\hat{\alpha }}_i\ne {\hat{\alpha }}_j\) and \(t_{ij}\) is some real number in \([-1,1]\) if \({\hat{\alpha }}_i= {\hat{\alpha }}_j.\) We prove by contradiction. Suppose that for \({\mathcal {V}}_k\) that contains at least two vertices, there exist \(i,j \in {\mathcal {V}}_k\) such that \({\hat{\alpha }}_i\ne {\hat{\alpha }}_j\). We define \(\displaystyle a^{\min }=\min _{i \in {\mathcal {V}}_k} {\hat{\alpha }}_i\) and \({\mathcal {V}}^{\min }=\left\{ i: i \in {\mathcal {V}}_k \text { and } {\hat{\alpha }}_i=a^{\min }\right\} \). Summing up the optimality conditions over the indices in \({\mathcal {V}}^{\min }\), we get:

$$\begin{aligned} \sum _{i \in {\mathcal {V}}^{\min }}\frac{\varvec{X}_i'(\varvec{y}-\varvec{X}\hat{\varvec{\alpha }})}{\sqrt{T}}=\frac{\lambda }{\sqrt{T}}\sum _{i \in {\mathcal {V}}^{\min }} \sum _{\begin{array}{c} i: (i,j)\in {{\mathcal {E}}}\\ \alpha _i^0 \ne \alpha _j^0 \end{array}}\frac{t_{ij}}{|{\tilde{\alpha }}_i-{\tilde{\alpha }}_j|^\gamma }+ \lambda T^{(\gamma -1)/2} \sum _{i \in {\mathcal {V}}^{\min }} \sum _{\begin{array}{c} i: (i,j)\in {{\mathcal {E}}}\\ \alpha _i^0 = \alpha _j^0\\ {\hat{\alpha }}_j>a^{\min } \end{array}}\frac{t_{ij}}{|\sqrt{T}({\tilde{\alpha }}_i-{\tilde{\alpha }}_j)|^\gamma }, \end{aligned}$$

where in the right-hand side, the first sum converges to 0 in probability, while the second sum tends to \(-\infty \). However, the left-hand side is \(O_{p}(1)\), since it can be decomposed as the sum of two asymptotically normal variables as in Zou (2006) with an application of the martingale central limit theorem. Therefore, \(\Pr ((i,j)\in {\mathcal {A}}_n^c)\rightarrow 0\). \(\square \)

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Yan, Y., Huang, HC. & Genton, M.G. Vector Autoregressive Models with Spatially Structured Coefficients for Time Series on a Spatial Grid. JABES 26, 387–408 (2021). https://doi.org/10.1007/s13253-021-00444-4

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s13253-021-00444-4

Keywords

Navigation