Anchored Bayesian Gaussian mixture models. (English) Zbl 1452.62453

Summary: Finite mixtures are a flexible modeling tool for irregularly shaped densities and samples from heterogeneous populations. When modeling with mixtures using an exchangeable prior on the component features, the component labels are arbitrary and are indistinguishable in posterior analysis. This makes it impossible to attribute any meaningful interpretation to the marginal posterior distributions of the component features. We propose a model in which a small number of observations are assumed to arise from some of the labeled component densities. The resulting model is not exchangeable, allowing inference on the component features without post-processing. Our method assigns meaning to the component labels at the modeling stage and can be justified as a data-dependent informative prior on the labelings. We show that our method produces interpretable results, often (but not always) similar to those resulting from relabeling algorithms, with the added benefit that the marginal inferences originate directly from a well specified probability model rather than a post hoc manipulation. We provide asymptotic results leading to practical guidelines for model selection that are motivated by maximizing prior information about the class labels and demonstrate our method on real and simulated data.

MSC:

62H30 Classification and discrimination; cluster analysis (statistical aspects)
62P35 Applications of statistics to physics
85A35 Statistical astronomy
