Abstract
In latent variable models, parameter estimation can be implemented using either the joint or the marginal likelihood, based on independence or conditional independence assumptions. The same dilemma occurs within the Bayesian framework with respect to the estimation of the Bayesian marginal (or integrated) likelihood, which is the main tool for model comparison and averaging. In most cases, the Bayesian marginal likelihood is a high-dimensional integral that cannot be computed analytically, and a plethora of methods based on Monte Carlo integration (MCI) are used for its estimation. In this work, it is shown that the joint MCI approach makes subtle use of the properties of the adopted model, leading to increased error and bias in finite settings. The sources and the components of the error associated with estimators under the two approaches are identified here and provided in exact form. Additionally, the effect of the sample covariation on the Monte Carlo estimators is examined. In particular, even under independence assumptions the sample covariance will be close to (but not exactly) zero, which, surprisingly, has a severe effect on the estimated values and their variability. To address this problem, an index of the sample's divergence from independence is introduced as a multivariate extension of covariance. The implications addressed here are important in the majority of practical problems appearing in Bayesian inference of multi-parameter models with analogous structures.
References
Aguilar, O., West, M.: Bayesian dynamic factor models and portfolio allocation. J. Bus. Econ. Stat. 18, 338–357 (2000)
Baker, F.: An investigation of the item parameter recovery characteristics of a Gibbs sampling procedure. Appl. Psychol. Meas. 22, 153–169 (1998)
Bartholomew, D., Knott, M., Moustaki, I.: Latent Variable Models and Factor Analysis: A Unified Approach. Wiley Series on Probability and Statistics, 3rd edn. Wiley, London (2011)
Bock, R., Aitkin, M.: Marginal maximum likelihood estimation of item parameters: application of an EM algorithm. Psychometrika 46, 443–459 (1981)
Bock, R.D., Lieberman, M.: Fitting a response model for n dichotomously scored items. Psychometrika 35, 179–197 (1970)
Bratley, P., Fox, B.L., Schrage, L.: A Guide to Simulation, 2nd edn. Springer, Berlin (1987)
Carlin, B.P., Louis, T.A.: Bayes and Empirical Bayes Methods for Data Analysis, 2nd edn. Chapman & Hall/CRC, London (2000)
Chib, S., Jeliazkov, I.: Marginal likelihood from the Metropolis–Hastings output. J. Am. Stat. Assoc. 96, 270–281 (2001)
Congdon, P.: Applied Bayesian Hierarchical Methods. Chapman and Hall/CRC, London (2010)
DiCiccio, T.J., Kass, R.E., Raftery, A., Wasserman, L.: Computing Bayes factors by combining simulation and asymptotic approximations. J. Am. Stat. Assoc. 92(439), 903–915 (1997)
Flegal, J., Jones, G.: Batch means and spectral variance estimators in Markov chain Monte Carlo. Ann. Stat. 38, 1034–1070 (2010)
Fouskakis, D., Ntzoufras, I., Draper, D.: Bayesian variable selection using cost-adjusted BIC, with application to cost-effective measurement of quality of health care. Ann. Appl. Stat. 3, 663–690 (2009)
Friel, N., Pettitt, A.N.: Marginal likelihood estimation via power posteriors. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 70(3), 589–607 (2008)
Gelfand, A.E., Dey, D.K.: Bayesian model choice: asymptotics and exact calculations. J. R. Stat. Soc. Ser. B (Methodol.) 56(3), 501–514 (1994)
Gelman, A., Meng, X.-L.: Simulating normalizing constants: from importance sampling to bridge sampling to path sampling. Stat. Sci. 13(2), 163–185 (1998)
Geweke, J., Zhou, G.: Measuring the pricing error of the arbitrage pricing theory. Rev. Financ. Stud. 9, 557–587 (1996)
Gifford, J.A., Swaminathan, H.: Bias and the effect of priors in Bayesian estimation of parameters of item response models. Appl. Psychol. Meas. 14, 33–43 (1990)
Goodman, L.A.: The variance of the product of K random variables. J. Am. Stat. Assoc. 57, 54–60 (1962)
Huber, P., Ronchetti, E., Victoria-Feser, M.-P.: Estimation of generalized linear latent variable models. J. R. Stat. Soc. Ser. B 66, 893–908 (2004)
Jones, G., Haran, M., Caffo, B., Neath, R.: Fixed-width output analysis for Markov chain Monte Carlo. J. Am. Stat. Assoc. 101, 1537–1547 (2006)
Kang, T., Cohen, A.S.: IRT model selection methods for dichotomous items. Appl. Psychol. Meas. 31(4), 331–358 (2007)
Kass, R., Raftery, A.: Bayes factors. J. Am. Stat. Assoc. 90, 773–795 (1995)
Kim, S.-H., Cohen, A.S., Baker, F.B., Subkoviak, M.J., Leonard, T.: An investigation of hierarchical Bayes procedures in item response theory. Psychometrika 59(3), 405–421 (1994)
Koehler, E., Brown, E., Haneuse, S.J.-P.A.: On the assessment of Monte Carlo error in simulation-based statistical analyses. Am. Stat. 63(2), 155–162 (2009)
Lewis, S., Raftery, A.: Estimating Bayes factors via posterior simulation with the Laplace Metropolis estimator. J. Am. Stat. Assoc. 92, 648–655 (1997)
Lopes, H.F., West, M.: Bayesian model assessment in factor analysis. Stat. Sin. 14, 41–67 (2004)
Lord, F.M.: Applications of Item Response Theory to Practical Testing Problems. Erlbaum Associates, Hillsdale (1980)
Lord, F.M., Novick, M.R.: Statistical Theories of Mental Test Scores. Addison-Wesley, Oxford (1968)
Meketon, M.S., Schmeiser, B.W.: Overlapping batch means: something for nothing? In: Proceedings of the 1984 Winter Simulation Conference, pp. 227–230. Institute of Electrical and Electronics Engineers Inc., Piscataway (1984)
Meng, X.-L., Schilling, S.: Warp bridge sampling. J. Comput. Graph. Stat. 11(3), 552–586 (2002)
Meng, X.-L., Wong, W.-H.: Simulating ratios of normalizing constants via a simple identity: a theoretical exploration. Stat. Sin. 6, 831–860 (1996)
Mislevy, R.: Bayes modal estimation in item response models. Psychometrika 51, 177–195 (1986)
Moustaki, I., Knott, M.: Generalized latent trait models. Psychometrika 65, 391–411 (2000)
Ntzoufras, I., Dellaportas, P., Forster, J.: Bayesian variable and link determination for generalised linear models. J. Stat. Plan. Inference 111(1–2), 165–180 (2003)
Patz, R., Junker, B.: A straightforward approach to Markov chain Monte Carlo methods for item response models. J. Educ. Behav. Stat. 24, 146–178 (1999)
Rabe-Hesketh, S., Skrondal, A., Pickles, A.: Maximum likelihood estimation of limited and discrete dependent variable models with nested random effects. J. Econom. 128, 301–323 (2005)
Schilling, S., Bock, R.: High-dimensional maximum marginal likelihood item factor analysis by adaptive quadrature. Psychometrika 70, 533–555 (2005)
Schmeiser, B.W.: Batch size effects in the analysis of simulation output. Oper. Res. 30, 556–568 (1982)
Appendix
The identities of the MCMC estimators used in Sect. 2.1 are:
- Reciprocal importance (RM) sampling estimator (Gelfand and Dey 1994):
$$\begin{aligned} f(\mathbf Y )= \left[ \int \frac{g(\varvec{\vartheta })}{f(\mathbf Y |\,\varvec{\vartheta })\,\pi (\varvec{\vartheta })} \,\pi (\varvec{\vartheta }|\,\mathbf Y )\,d{\varvec{\vartheta }}\right] ^{-1}, \end{aligned}$$(23)
- Generalized harmonic bridge (BH) sampling estimator (Meng and Wong 1996):
$$\begin{aligned} f(\mathbf Y )=\frac{ \displaystyle \int \left[ \,g(\varvec{\vartheta }) \right] ^{-1}g(\varvec{\vartheta })\,d{\varvec{\vartheta }}}{\displaystyle \int \left[ f(\mathbf Y |\,\varvec{\vartheta })\,\pi (\varvec{\vartheta }) \right] ^{-1}\pi (\varvec{\vartheta }|\,\mathbf Y )\,d{\varvec{\vartheta }}}\,, \end{aligned}$$(24)
- Geometric bridge (BG) sampling estimator (Meng and Wong 1996):
$$\begin{aligned} f(\mathbf Y )=\frac{ \displaystyle \int \left[ \frac{ f(\mathbf Y |\,\varvec{\vartheta })\,\pi (\varvec{\vartheta }) }{g(\varvec{\vartheta }) } \right] ^{1/2}g(\varvec{\vartheta })\,d{\varvec{\vartheta }}}{\displaystyle \int \left[ \frac{ f(\mathbf Y |\,\varvec{\vartheta })\,\pi (\varvec{\vartheta }) }{g(\varvec{\vartheta }) } \right] ^{-1/2}\pi (\varvec{\vartheta }|\,\mathbf Y )\,d{\varvec{\vartheta }}}\,. \end{aligned}$$(25)
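As a concrete illustration of estimator (23), the sketch below applies reciprocal importance sampling to a conjugate normal model in which the marginal likelihood is available in closed form. The model, sample sizes, and the choice of \(g\) (a normal with lighter tails than the posterior, as the estimator requires for finite variance) are assumptions made purely for illustration, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy conjugate model (illustrative assumption): y_i ~ N(theta, 1) with
# prior theta ~ N(0, 1), so the posterior is N(m, s2) and the marginal
# likelihood f(Y) is known exactly.
n = 20
y = rng.normal(0.5, 1.0, n)
s2 = 1.0 / (n + 1.0)   # posterior variance
m = s2 * y.sum()       # posterior mean

def log_norm(x, mu, var):
    return -0.5 * np.log(2 * np.pi * var) - 0.5 * (x - mu) ** 2 / var

def log_lik(theta):
    return log_norm(y, theta, 1.0).sum()

# Exact log f(Y) via the identity f(Y) = f(Y|t) pi(t) / pi(t|Y) at t = 0.
log_f_true = log_lik(0.0) + log_norm(0.0, 0.0, 1.0) - log_norm(0.0, m, s2)

# Reciprocal importance sampling, Eq. (23): average g / (likelihood * prior)
# over posterior draws, then invert.
S = 5000
theta = rng.normal(m, np.sqrt(s2), S)   # exact posterior draws
log_g = log_norm(theta, m, s2 / 2.0)    # g with lighter tails than posterior
log_q = np.array([log_lik(t) for t in theta]) + log_norm(theta, 0.0, 1.0)
log_f_ri = -np.log(np.mean(np.exp(log_g - log_q)))

print(log_f_true, log_f_ri)
```

Because \(g\) here closely tracks the posterior, the weights are bounded and the estimate is tight; with a heavier-tailed \(g\) the estimator's variance can be infinite, which is one motivation for the bridge variants (24)–(25).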
Proof of Lemma 3.2
According to Goodman (1962), the variance of the product of \(N\) independent variables \(X_i\), with means \(E_i\) and variances \(V_i\), is given by
$$\begin{aligned} V\left( \prod _{i=1}^{N} X_i \right) = \prod _{i=1}^{N}\left( E_i^2 + V_i\right) - \prod _{i=1}^{N} E_i^2. \end{aligned}$$
Hence we can write
Note that \(\prod \limits _{i \in \mathcal{N}_0}E_i^2 \) equals one if \(\mathcal{N}_0 = \emptyset \) and zero otherwise. Therefore we can write \(\prod \limits _{i \in \mathcal{N}_0}E_i^2 = \prod \limits _{i \in \mathcal{N}_0}E_i^2 \times \prod \limits _{i \in \mathcal{N}_0}V_i^2\), resulting in
which gives
The proof is completed by substituting the general expression for the integrand's variance into (11) and (12), respectively. \(\square \)
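Goodman's product-variance identity used in the proof above can be verified numerically. The sketch below checks the independent-variable case, \(V(\prod X_i)=\prod (E_i^2+V_i)-\prod E_i^2\), by exact enumeration over small, arbitrarily chosen discrete distributions.

```python
from itertools import product

# Each distribution is (values, probabilities); chosen arbitrarily for
# the check, assuming mutual independence.
dists = [
    ([1.0, 2.0], [0.3, 0.7]),
    ([0.5, 4.0], [0.6, 0.4]),
    ([2.0, 3.0], [0.5, 0.5]),
]

def moments(vals, probs):
    """Return (mean E_i, variance V_i) of one discrete distribution."""
    e = sum(v * p for v, p in zip(vals, probs))
    var = sum(v * v * p for v, p in zip(vals, probs)) - e * e
    return e, var

# Direct computation of Var(X1*X2*X3) by enumerating the joint support.
e_prod = v_prod = 0.0
for combo in product(*[list(zip(v, p)) for v, p in dists]):
    prob = 1.0
    val = 1.0
    for v, p in combo:
        prob *= p
        val *= v
    e_prod += prob * val
    v_prod += prob * val * val
v_direct = v_prod - e_prod ** 2

# Goodman's (1962) closed form: prod(E_i^2 + V_i) - prod(E_i^2).
prod_e2 = prod_e2v = 1.0
for e, var in (moments(v, p) for v, p in dists):
    prod_e2 *= e * e
    prod_e2v *= e * e + var
v_goodman = prod_e2v - prod_e2

print(v_direct, v_goodman)
```

The two quantities agree to machine precision, confirming the formula that Lemma 3.2 builds on.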
Proof of Lemma 3.3
Due to conditional independence we have that
Moreover, from (14) we have that
By substituting (28) and (29) in (27), we obtain the variance of the joint estimator of Lemma 3.3.
Similarly, for the marginal estimator we have
Due to conditional independence we have that
Moreover, from Lemma 3.1 we have that
Substituting (31) and (32) in (30) gives the expression of the variance of the marginal estimator of Lemma 3.3. \(\square \)
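The joint-versus-marginal variance comparison of Lemma 3.3 can be mimicked on a toy model. The sketch below (an illustrative assumption, not the paper's model) contrasts the joint estimator (mean of products over shared latent draws) with the marginal estimator (product of per-item means) for a marginal likelihood that factorises under conditional independence.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy latent variable model (illustrative assumption): y_i | z_i ~ N(z_i, 1)
# with independent latents z_i ~ N(0, 1), so the true marginal likelihood
# factorises: f(Y) = prod_i N(y_i; 0, 2).
N, S = 5, 20_000
y = rng.normal(0.0, np.sqrt(2.0), N)

def norm_pdf(x, mu, var):
    return np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)

z = rng.normal(0.0, 1.0, (S, N))        # one latent draw per item, per sample
lik = norm_pdf(y, z, 1.0)               # f(y_i | z_i^(s)), shape (S, N)

f_joint = np.mean(np.prod(lik, axis=1)) # joint estimator: mean of products
f_marg = np.prod(np.mean(lik, axis=0))  # marginal estimator: product of means
f_true = np.prod(norm_pdf(y, 0.0, 2.0))

print(f_joint, f_marg, f_true)
```

Both estimators are consistent, but the joint estimator averages a product of \(N\) noisy terms, so its relative variance compounds multiplicatively across items rather than additively, in line with the Lemma.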
Proof of Lemma 3.4
The proof of Lemma 3.4 can be obtained by induction. The statement of the Lemma holds for \(N=3\) with \(\varvec{Y}_3=(Y_1, Y_2, Y_3)\) since
which is true by the definition of TCI (see Eq. 19) for vectors \(\varvec{Y}\) of length equal to three.
Let us now assume that (21) is true for any vector \(\varvec{Y}_N\) of length \(N > 3\). Then, for \(\varvec{Y}_{N+1}=( \varvec{Y}_{N}, Y_{N+1} ) = ( Y_{1}, \dots , Y_{N}, Y_{N+1} )\) the equation
is also true since
\(\square \)
Proof of Lemma 3.5
The result follows since \(E\left\{ TCI(\varvec{Y}) \Big [\prod _{i=1}^N Y_i-\prod _{i=1}^N E (Y_i)\Big ]\right\} =TCI(\varvec{Y}) E\Big [\prod _{i=1}^N Y_i-\prod _{i=1}^N E (Y_i)\Big ] = 0\). \(\square \)
Cite this article
Vitoratou, S., Ntzoufras, I. & Moustaki, I. Explaining the behavior of joint and marginal Monte Carlo estimators in latent variable models with independence assumptions. Stat Comput 26, 333–348 (2016). https://doi.org/10.1007/s11222-014-9495-8