Thermodynamic integration and steppingstone sampling methods for estimating Bayes factors: a tutorial. (English) Zbl 1431.62521

Summary: One of the more principled methods of performing model selection is via Bayes factors. However, calculating Bayes factors requires marginal likelihoods, which are integrals over the entire parameter space, making estimation of Bayes factors for models with more than a few parameters a significant computational challenge. Here, we provide a tutorial review of two Monte Carlo techniques rarely used in psychology that efficiently compute marginal likelihoods: thermodynamic integration [N. Friel and A. N. Pettitt, J. R. Stat. Soc., Ser. B, Stat. Methodol. 70, No. 3, 589–607 (2008; Zbl 05563360); N. Lartillot and H. Philippe, “Computing Bayes factors using thermodynamic integration”, Syst. Biol. 55, No. 2, 195–207 (2006; doi:10.1080/10635150500433722)] and steppingstone sampling [W. Xie et al., “Improving marginal likelihood estimation for Bayesian phylogenetic model selection”, Syst. Biol. 60, No. 2, 150–160 (2011; doi:10.1093/sysbio/syq085)]. The methods are general and can be easily implemented in existing MCMC code; we provide both the details for implementation and associated R code for the interested reader. While Bayesian toolkits implementing standard statistical analyses [e.g., JASP Team, 2017; R. D. Morey and J. N. Rouder, “BayesFactor: computation of Bayes factors for common designs” (2015; https://cran.r-project.org/package=BayesFactor)] often compute Bayes factors for the researcher, those using Bayesian approaches to evaluate cognitive models are usually left to compute Bayes factors for themselves. Here, we provide examples of the methods by computing marginal likelihoods for a moderately complex model of choice response time, the Linear Ballistic Accumulator model [S. D. Brown and A. Heathcote, “The simplest complete model of choice response time: linear ballistic accumulation”, Cogn. Psychol. 57, No. 3, 153–178 (2008; doi:10.1016/j.cogpsych.2007.12.002)], and compare them to the findings of [N. J. Evans and S. D. Brown, “Bayes factors for the Linear Ballistic Accumulator Model of decision-making”, Behav. Res. Methods 50, No. 2, 589–603 (2017; doi:10.3758/s13428-017-0887-5)], who used a brute force technique. We then present a derivation of TI and SS within a hierarchical framework, provide results of a model recovery case study using hierarchical models, and show an application to empirical data. A companion R package is available at the Open Science Framework: https://osf.io/jpnb4.
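
For orientation, the two estimators named in the summary can be illustrated on a toy problem. The following is a minimal sketch in Python, not the authors' R code and not the Linear Ballistic Accumulator application. It assumes a conjugate normal model (y_i ~ N(mu, 1) with prior mu ~ N(0, 1)), chosen so that each power posterior, proportional to likelihood^t times prior, can be sampled exactly rather than by MCMC, and so that the exact log marginal likelihood is available for comparison. The function names, the example data, and the power-law temperature schedule with exponent 5 are all illustrative assumptions.

```python
import math
import random


def log_lik(mu, y):
    """Log-likelihood of y under y_i ~ N(mu, 1)."""
    return sum(-0.5 * math.log(2 * math.pi) - 0.5 * (yi - mu) ** 2 for yi in y)


def sample_power_posterior(t, y, n_draws, rng):
    """Exact draws from p(mu | y, t), proportional to L(mu)^t * N(mu; 0, 1).

    Conjugacy gives a normal with precision t*n + 1 (assumed toy model).
    """
    n, s = len(y), sum(y)
    prec = t * n + 1.0
    return [rng.gauss(t * s / prec, prec ** -0.5) for _ in range(n_draws)]


def log_mean_exp(xs):
    """Numerically stable log of the mean of exp(xs)."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs) / len(xs))


def estimate_log_marginal(y, n_temps=30, n_draws=2000, power=5.0, seed=1):
    """Return (TI, SS) estimates of the log marginal likelihood."""
    rng = random.Random(seed)
    # Temperatures from 0 (prior) to 1 (posterior), denser near 0 (assumed schedule).
    temps = [(k / n_temps) ** power for k in range(n_temps + 1)]
    ll = [[log_lik(mu, y) for mu in sample_power_posterior(t, y, n_draws, rng)]
          for t in temps]
    means = [sum(row) / len(row) for row in ll]
    # Thermodynamic integration: trapezoidal rule on E_t[log L] over t in [0, 1].
    ti = sum(0.5 * (temps[k + 1] - temps[k]) * (means[k + 1] + means[k])
             for k in range(n_temps))
    # Steppingstone sampling: sum of log E_{t_k}[L^(t_{k+1} - t_k)].
    ss = sum(log_mean_exp([(temps[k + 1] - temps[k]) * v for v in ll[k]])
             for k in range(n_temps))
    return ti, ss


def exact_log_marginal(y):
    """Closed form for this toy model: y ~ N(0, I + 11^T)."""
    n, s, sq = len(y), sum(y), sum(v * v for v in y)
    return (-0.5 * n * math.log(2 * math.pi) - 0.5 * math.log(n + 1)
            - 0.5 * (sq - s * s / (n + 1)))


# Made-up example data, for illustration only.
y = [0.3, 1.1, -0.4, 0.8, 0.5, 1.6, -0.2, 0.9, 0.7, 0.1]
ti_est, ss_est = estimate_log_marginal(y)
print(ti_est, ss_est, exact_log_marginal(y))
```

Because a Bayes factor is a ratio of two marginal likelihoods, running such an estimator once per model and differencing the log estimates gives a log Bayes factor. In a realistic cognitive model the exact sampler above would be replaced by MCMC chains run at each temperature.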

MSC:

62P15 Applications of statistics to psychology
62F15 Bayesian inference
65C05 Monte Carlo methods

Citations:

Zbl 05563360

References:

[1] Annis, J.; Palmeri, T. J., Bayesian statistical approaches to evaluating cognitive models, Wiley Interdisciplinary Reviews: Cognitive Science, 9, April, Article e1458 pp. (2017)
[2] Brooks, S.; Gelman, A.; Jones, G.; Meng, X.-L., Handbook of Markov chain Monte Carlo (2011), Chapman & Hall/CRC Press: Boca Raton · Zbl 1218.65001
[3] Brown, S. D.; Heathcote, A., The simplest complete model of choice response time: Linear ballistic accumulation, Cognitive Psychology, 57, 3, 153-178 (2008)
[4] Busemeyer, J. R.; Diederich, A., Cognitive modeling (2010), Sage Publications: Thousand Oaks, CA
[5] Carlin, B. P.; Chib, S., Bayesian model choice via Markov Chain Monte Carlo methods, Journal of the Royal Statistical Society. Series B. Statistical Methodology, 57, 3, 473-484 (1995) · Zbl 0827.62027
[6] Chib, S., Marginal likelihood from the Gibbs output, Journal of the American Statistical Association, 90, 432, 1313-1321 (1995) · Zbl 0868.62027
[7] Chib, S.; Jeliazkov, I., Marginal likelihood from the Metropolis-Hastings output, Journal of the American Statistical Association, 96, 453, 270-281 (2001) · Zbl 1015.62020
[8] Evans, N. J.; Brown, S. D., Bayes factors for the Linear Ballistic Accumulator Model of decision-making, Behavior Research Methods (2017), Advance Online Publication
[9] Evans, N. J.; Howard, Z. L.; Heathcote, A.; Brown, S. D., Model Flexibility Analysis does not measure the persuasiveness of a fit, Psychological Review, 124, 3, 339-345 (2017)
[10] Friel, N.; Hurn, M.; Wyse, J., Improving power posterior estimation of statistical evidence, Statistics and Computing, 24, 5, 709-723 (2014) · Zbl 1322.62098
[11] Friel, N.; Pettitt, A. N., Marginal likelihood estimation via power posteriors, Journal of the Royal Statistical Society. Series B. Statistical Methodology, 70, 3, 589-607 (2008) · Zbl 05563360
[12] Friel, N.; Wyse, J., Estimating the evidence - A review, Statistica Neerlandica, 66, 3, 288-308 (2012)
[13] Gershman, S. J., Empirical priors for reinforcement learning models, Journal of Mathematical Psychology, 71, 1-6 (2016) · Zbl 1359.62500
[14] Gronau, Q. F.; Sarafoglou, A.; Matzke, D.; Ly, A.; Boehm, U.; Marsman, M., A tutorial on bridge sampling, Journal of Mathematical Psychology, 81, 80-97 (2017) · Zbl 1402.62042
[15] Gronau, Q. F.; Singmann, H.; Wagenmakers, E.-J., Bridgesampling: Bridge sampling for marginal likelihoods and Bayes factors (2017), Retrieved from https://github.com/quentingronau/bridgesampling
[16] Grünwald, P. D.; Myung, I. J.; Pitt, M. A., Advances in minimum description length: Theory and applications (2005), MIT Press: London, England
[17] Hastings, W. K., Monte Carlo sampling methods using Markov chains and their applications, Biometrika, 57, 97-109 (1970) · Zbl 0219.65008
[18] Höhna, S.; Landis, M. L.; Huelsenbeck, J. P., Parallel power posterior analyses for fast computation of marginal likelihoods in phylogenetics, bioRxiv (2017), http://doi.org/10.1101/104422
[19] Hug, S.; Schwarzfischer, M.; Hasenauer, J.; Marr, C.; Theis, F. J., An adaptive scheduling scheme for calculating Bayes factors with thermodynamic integration using Simpson’s rule, Statistics and Computing, 26, 3, 663-677 (2016) · Zbl 1505.62196
[20] JASP Team, JASP (2017), Retrieved from https://jasp-stats.org/
[21] Jeffreys, H., Theory of probability (1961), Oxford University Press · Zbl 0116.34904
[22] Kass, R. E.; Raftery, A. E., Bayes factors, Journal of the American Statistical Association, 90, 430, 773-795 (1995) · Zbl 0846.62028
[23] Lartillot, N.; Philippe, H., Computing Bayes factors using thermodynamic integration, Systematic Biology, 55, 2, 195-207 (2006)
[24] Lee, M. D., Determining the dimensionality of multidimensional scaling representations for cognitive modeling, Journal of Mathematical Psychology, 45, 1, 149-166 (2001) · Zbl 1003.91033
[25] Lee, M. D., A Bayesian analysis of retention functions, Journal of Mathematical Psychology, 48, 5, 310-321 (2004) · Zbl 1118.91360
[26] Lee, M. D.; Vanpaemel, W., Determining informative priors for cognitive models, Psychonomic Bulletin & Review (2017)
[27] Lewandowsky, S.; Farrell, S., Computational modeling in cognition: Principles and practice (2011), Sage Publications: Thousand Oaks, CA
[28] Liu, C. C.; Aitkin, M., Bayes factors: Prior sensitivity and model generalizability, Journal of Mathematical Psychology, 52, 6, 362-375 (2008) · Zbl 1152.91771
[29] Liu, P.; Elshall, A. S.; Ye, M.; Beerli, P.; Zeng, X.; Lu, D., Evaluating marginal likelihood with thermodynamic integration method and comparison with several other numerical methods, Water Resources Research, 52, 734-758 (2016)
[30] Lodewyckx, T.; Kim, W.; Lee, M. D.; Tuerlinckx, F.; Kuppens, P.; Wagenmakers, E.-J., A tutorial on Bayes factor estimation with the product space method, Journal of Mathematical Psychology, 55, 5, 331-347 (2011) · Zbl 1225.62037
[31] Ly, A.; Marsman, M.; Verhagen, J.; Grasman, R. P. P. P.; Wagenmakers, E.-J., A tutorial on Fisher information, Journal of Mathematical Psychology, 80, 40-55 (2017) · Zbl 1402.62318
[32] Meng, X.-L.; Wong, H. W., Simulating ratios of normalizing constants via a simple identity: A theoretical exploration, Statistica Sinica, 6, 831-860 (1996) · Zbl 0857.62017
[33] Morey, R. D.; Rouder, J. N., BayesFactor: Computation of Bayes factors for common designs (2015), Retrieved from https://cran.r-project.org/package=BayesFactor
[34] Myung, I. J., The importance of complexity in model selection, Journal of Mathematical Psychology, 44, 1, 190-204 (2000) · Zbl 0946.62094
[35] Myung, I. J.; Navarro, D. J.; Pitt, M. A., Model selection by normalized maximum likelihood, Journal of Mathematical Psychology, 50, 2, 167-179 (2006) · Zbl 1100.94008
[36] Myung, I. J.; Pitt, M. A., Applying Occam’s razor in modeling cognition: A Bayesian approach, Psychonomic Bulletin & Review, 4, 1, 79-95 (1997)
[37] Neal, R. M., Annealed importance sampling, Statistics and Computing, 11, 2, 125-139 (2001)
[38] Newton, M.; Raftery, A., Approximate Bayesian inference with the weighted likelihood bootstrap, Journal of the Royal Statistical Society. Series B. Statistical Methodology, 56, 1, 3-48 (1994) · Zbl 0788.62026
[39] Oates, C. J.; Papamarkou, T.; Girolami, M., The controlled thermodynamic integral for Bayesian model evidence evaluation, Journal of the American Statistical Association, 111, 514, 634-645 (2016)
[40] Ogata, Y., A Monte Carlo method for high dimensional integration, Numerische Mathematik, 55, 2, 137-157 (1989) · Zbl 0669.65011
[41] R Core Team, R: A language and environment for statistical computing (2017), Vienna, Austria. Retrieved from https://www.r-project.org/
[42] Rae, B.; Heathcote, A.; Donkin, C.; Averell, L.; Brown, S., The Hare and the Tortoise: Emphasizing speed can change the evidence used to make decisions, Journal of Experimental Psychology. Learning, Memory, and Cognition, 40, 5, 1226-1243 (2014)
[43] Ratcliff, R.; Smith, P. L., A comparison of sequential sampling models for two-choice reaction time, Psychological Review, 111, 2, 333-367 (2004)
[44] Rouder, J. N.; Morey, R. D., Default Bayes factors for model selection in regression, Multivariate Behavioral Research, 47, 6, 877-903 (2012)
[45] Rouder, J. N.; Morey, R. D.; Speckman, P. L.; Province, J. M., Default Bayes factors for ANOVA designs, Journal of Mathematical Psychology, 56, 5, 356-374 (2012) · Zbl 1282.62167
[46] Shiffrin, R. M.; Lee, M. D.; Kim, W.; Wagenmakers, E.-J., A survey of model evaluation approaches with a tutorial on hierarchical Bayesian methods, Cognitive Science, 32, 8, 1248-1284 (2008)
[47] Skilling, J., Nested sampling for Bayesian computations, Bayesian Analysis, 1, 4, 833-860 (2006) · Zbl 1332.62374
[48] Turner, B. M.; Sederberg, P. B.; Brown, S. D.; Steyvers, M., A method for efficiently sampling from distributions with correlated dimensions, Psychological Methods, 18, 3, 368-384 (2013)
[49] Vanpaemel, W., Prior sensitivity in theory testing: An apologia for the Bayes factor, Journal of Mathematical Psychology, 54, 6, 491-498 (2010) · Zbl 1203.91265
[50] Vanpaemel, W., Constructing informative model priors using hierarchical methods, Journal of Mathematical Psychology, 55, 1, 106-117 (2011) · Zbl 1208.62195
[51] Vanpaemel, W.; Lee, M. D., Using priors to formalize theory: Optimal attention and the generalized context model, Psychonomic Bulletin & Review, 19, 6, 1047-1056 (2012)
[52] Vanpaemel, W.; Storms, G., Abstraction and model evaluation in category learning, Behavior Research Methods, 42, 2, 421-437 (2010)
[53] Wagenmakers, E.-J.; Lodewyckx, T.; Kuriyal, H.; Grasman, R., Bayesian hypothesis testing for psychologists: A tutorial on the Savage-Dickey method, Cognitive Psychology, 60, 3, 158-189 (2010)
[54] Wasserman, L., Bayesian model selection and model averaging, Journal of Mathematical Psychology, 44, 1, 92-107 (2000) · Zbl 0946.62032
[55] Xie, W.; Lewis, P. O.; Fan, Y.; Kuo, L.; Chen, M. H., Improving marginal likelihood estimation for Bayesian phylogenetic model selection, Systematic Biology, 60, 2, 150-160 (2011)