Identifying latent groups in spatial panel data using a Markov random field constrained product partition model. (English) Zbl 07764885

Summary: Understanding heterogeneity across spatial locations is an important problem that has been widely studied in applications such as economics and environmental science. We focus on regression models for spatial panel data, where repeated measurements are collected over time at various spatial locations. We propose a novel class of nonparametric priors that combines a Markov random field (MRF) with the product partition model (PPM), and show that the resulting prior, called MRF-PPM, can identify latent group structures among the spatial locations while making efficient use of spatial dependence information. We derive a closed-form conditional distribution for the proposed prior and introduce a new way of computing the marginal likelihood that enables efficient Bayesian inference. Furthermore, we study the theoretical properties of the proposed MRF-PPM prior and show a clustering consistency result for the posterior distribution. We demonstrate the excellent empirical performance of our method using extensive simulation studies and applications to US precipitation data and California median household income data.
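To make the idea of combining an MRF with a PPM concrete, the following is a minimal sketch of an unnormalized log prior over partitions of spatial locations. It is not taken from the paper: the function name `log_mrf_ppm_prior`, the Dirichlet-process-style cohesion c(S) = alpha * (|S| - 1)!, the Potts-style pairwise reward `lam` for neighbouring locations that share a cluster, and the parameters `alpha` and `lam` are all assumptions chosen for illustration. The exact form used by the authors may differ.

```python
import math
from collections import Counter


def log_mrf_ppm_prior(labels, edges, alpha=1.0, lam=1.0):
    """Unnormalized log prior of a partition (illustrative assumption only).

    labels : list of cluster labels, one per spatial location
    edges  : list of (i, j) index pairs for neighbouring locations
    alpha  : hypothetical concentration parameter of the cohesion function
    lam    : hypothetical strength of the spatial (MRF) smoothing term
    """
    # PPM part: product of cohesions over clusters, here the
    # Dirichlet-process-style choice c(S) = alpha * (|S| - 1)!.
    # math.lgamma(s) equals log((s - 1)!) for a positive integer s.
    sizes = Counter(labels).values()
    log_ppm = sum(math.log(alpha) + math.lgamma(s) for s in sizes)

    # MRF part: Potts-style reward for each pair of neighbouring
    # locations assigned to the same cluster.
    log_mrf = lam * sum(1 for i, j in edges if labels[i] == labels[j])

    return log_ppm + log_mrf


# Four locations on a chain: 0 - 1 - 2 - 3.
chain = [(0, 1), (1, 2), (2, 3)]
contiguous = log_mrf_ppm_prior([0, 0, 1, 1], chain)
scattered = log_mrf_ppm_prior([0, 1, 0, 1], chain)
print(contiguous > scattered)
```

Both partitions in the usage example have two clusters of size two, so the PPM term is identical for them; under these assumptions only the spatial term separates them, favouring the partition whose clusters are spatially contiguous. Setting `lam=0` removes the spatial term and leaves the plain PPM prior.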

MSC:

62-XX Statistics

References:

[1] Basu, S. and Chib, S. (2003). Marginal likelihood and Bayes factors for Dirichlet process mixture models. Journal of the American Statistical Association 98, 224-235. · Zbl 1047.62023
[2] Belotti, F., Hughes, G. and Mortari, A. P. (2017). Spatial panel-data models using Stata. The Stata Journal 17, 139-180.
[3] Blake, A., Kohli, P. and Rother, C. (2011). Markov Random Fields for Vision and Image Processing. MIT Press, Cambridge, Massachusetts. · Zbl 1236.68001
[4] Bonhomme, S. and Manresa, E. (2015). Grouped patterns of heterogeneity in panel data. Econometrica 83, 1147-1184. · Zbl 1410.62100
[5] Browning, M. and Carro, J. (2007). Heterogeneity and microeconometrics modeling. Econometric Society Monographs 43, 47. · Zbl 1151.91697
[6] Chib, S. and Kuffner, T. A. (2016). Bayes factor consistency. arXiv preprint arXiv:1607.00292.
[7] Dahl, D. B. (2006). Model-based clustering for expression data via a Dirichlet process mixture model. Bayesian Inference for Gene Expression and Proteomics 4, 201-218.
[8] De Finetti, B. (1929). Funzione caratteristica di un fenomeno aleatorio. In Atti del Congresso Internazionale dei Matematici: Bologna del 3 al 10 de settembre di 1928, 179-190. Zanichelli, Bologna. · JFM 58.0544.01
[9] Durrett, R. (2019). Probability: Theory and Examples. Cambridge University Press, Cambridge. · Zbl 1440.60001
[10] Elhorst, J. P. (2014). Spatial Econometrics: From Cross-Sectional Data to Spatial Panels. Springer, New York. · Zbl 1408.62010
[11] Geman, S. and Geman, D. (1984). Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence PAMI-6, 721-741. · Zbl 0573.62030
[12] Geng, L. and Hu, G. (2021). Bayesian spatial homogeneity pursuit for survival data with an application to the SEER respiratory cancer data. Biometrics 78, 536-547. · Zbl 1520.62209
[13] Green, P. J. and Richardson, S. (2001). Modelling heterogeneity with and without the Dirichlet process. Scandinavian Journal of Statistics 28, 355-375. · Zbl 0973.62031
[14] Hao, Y., Chen, H., Wei, Y.-M. and Li, Y.-M. (2016). The influence of climate change on CO2 (carbon dioxide) emissions: An empirical estimation based on Chinese provincial panel data. Journal of Cleaner Production 131, 667-677.
[15] Hartigan, J. A. (1990). Partition models. Communications in Statistics-Theory and Methods 19, 2745-2756.
[16] Hsiao, C. (2014). Analysis of Panel Data. Cambridge University Press, Cambridge. · Zbl 1320.62003
[17] Hsiao, C. and Tahmiscioglu, A. K. (1997). A panel analysis of liquidity constraints and firm investment. Journal of the American Statistical Association 92, 455-465. · Zbl 0890.62090
[18] Hu, G., Geng, J., Xue, Y. and Sang, H. (2023). Bayesian spatial homogeneity pursuit of functional data: An application to the US income distribution. Bayesian Analysis 18, 579-605. · Zbl 1531.62150
[19] Hu, G., Xue, Y. and Ma, Z. (2021). Bayesian clustered coefficients regression with auxiliary covariates assistant random effects. Statistical Modelling 23, 273-293.
[20] Lenk, P. (2009). Simulation pseudo-bias correction to the harmonic mean estimator of integrated likelihoods. Journal of Computational and Graphical Statistics 18, 941-960.
[21] Lewis, P. O., Xie, W., Chen, M.-H., Fan, Y. and Kuo, L. (2014). Posterior predictive Bayesian phylogenetic model selection. Systematic Biology 63, 309-321.
[22] Lin, C.-C. and Ng, S. (2012). Estimation of panel data models with parameter heterogeneity when group membership is unknown. Journal of Econometric Methods 1, 42-55. · Zbl 1279.62224
[23] Ma, Z., Xue, Y. and Hu, G. (2020). Heterogeneous regression models for clusters of spatial dependent data. Spatial Economic Analysis 15, 459-475.
[24] Miao, K., Su, L. and Wang, W. (2020). Panel threshold regressions with latent group structures. Journal of Econometrics 214, 451-481. · Zbl 1456.62300
[25] Miller, J. W. and Harrison, M. T. (2018). Mixture models with a prior on the number of components. Journal of the American Statistical Association 113, 340-356. · Zbl 1398.62066
[26] Neal, R. M. (2000). Markov chain sampling methods for Dirichlet process mixture models. Journal of Computational and Graphical Statistics 9, 249-265.
[27] Newton, M. A. and Raftery, A. E. (1994). Approximate Bayesian inference with the weighted likelihood bootstrap. Journal of the Royal Statistical Society. Series B (Methodological) 56, 3-26. · Zbl 0788.62026
[28] Orbanz, P. and Buhmann, J. M. (2008). Nonparametric Bayesian image segmentation. International Journal of Computer Vision 77, 25-45. · Zbl 1477.68518
[29] Page, G. L. and Quintana, F. A. (2015). Predictions based on the clustering of heterogeneous functions via shape and subject-specific covariates. Bayesian Analysis 10, 379-410. · Zbl 1336.62251
[30] Page, G. L. and Quintana, F. A. (2016). Spatial product partition models. Bayesian Analysis 11, 265-298. · Zbl 1359.62401
[31] Park, J.-H. and Dunson, D. B. (2010). Bayesian generalized product partition model. Statistica Sinica 20, 1203-1226. · Zbl 1507.62242
[32] Parsons, B. and Daly, S. (1983). The relationship between surface topography, gravity anomalies, and temperature structure of convection. Journal of Geophysical Research: Solid Earth 88, 1129-1144.
[33] Pesaran, M. H. (2015). Time Series and Panel Data Econometrics. Oxford University Press, Oxford. · Zbl 1336.91002
[34] Pitman, J. (2002). Combinatorial Stochastic Processes. Technical Report 621, Dept. Statistics, UC Berkeley, Berkeley.
[35] Quintana, F. A. and Iglesias, P. L. (2003). Bayesian clustering and product partition models. Journal of the Royal Statistical Society. Series B (Statistical Methodology) 65, 557-574. · Zbl 1065.62115
[36] Rand, W. M. (1971). Objective criteria for the evaluation of clustering methods. Journal of the American Statistical Association 66, 846-850.
[37] Su, L. and Chen, Q. (2013). Testing homogeneity in panel data models with interactive fixed effects. Econometric Theory 29, 1079-1135. · Zbl 1290.62088
[38] Su, L., Shi, Z. and Phillips, P. C. (2016). Identifying latent structures in panel data. Econometrica 84, 2215-2264. · Zbl 1410.62110
[39] Su, L., Wang, X. and Jin, S. (2019). Sieve estimation of time-varying panel data models with latent structures. Journal of Business & Economic Statistics 37, 334-349.
[40] Teixeira, L. V., Assunção, R. M. and Loschi, R. H. (2019). Bayesian space-time partitioning by sampling and pruning spanning trees. Journal of Machine Learning Research 20, 1-35. · Zbl 1441.62175
[41] Wagner, C. H. (1982). Simpson’s paradox in real life. The American Statistician 36, 46-48.
[42] Yourdon, E. and Constantine, L. L. (1978). Structured Design: Fundamentals of a Discipline of Computer Program and Systems Design. Yourdon Press. · Zbl 0466.68003
[43] Zhang, B. (2020). Forecasting with Bayesian grouped random effects in panel data. arXiv preprint arXiv:2007.02435.
[44] Zhao, P., Yang, H.-C., Dey, D. K. and Hu, G. (2020). Bayesian spatial homogeneity pursuit regression for count value data. arXiv preprint arXiv:2002.06678.
Weining Shen, Department of Statistics, University of California, Irvine, Irvine, CA 92697, USA. E-mail: weinings@uci.edu (Received July 2021; accepted January 2022)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases these data have been complemented or enhanced by data from zbMATH Open. The list attempts to reflect the references in the original paper as accurately as possible, without claiming completeness or a perfect matching.