
Generalized linear latent variable models with flexible distribution of latent variables. (English) Zbl 1253.62051

Summary: We consider a semi-nonparametric specification for the density of latent variables in Generalized Linear Latent Variable Models (GLLVM). This specification is flexible enough to allow for an asymmetric, multi-modal, heavy- or light-tailed smooth density. The degree of flexibility required by many applications of GLLVM can be achieved through this semi-nonparametric specification with a finite number of parameters estimated by maximum likelihood. Even with this additional flexibility, we obtain an explicit expression for the likelihood when the manifest variables are conditionally normal. We show by simulations that the estimated density of latent variables captures the true one with a good degree of accuracy and is easy to visualize. By analysing two real data sets, we show that a flexible distribution of latent variables is a useful tool for exploring the adequacy of the GLLVM in practice.
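
To illustrate how a finite number of parameters can produce an asymmetric or multi-modal latent density, here is a minimal sketch of a univariate semi-nonparametric density. It assumes the squared-polynomial-times-standard-normal form that is common in the semi-nonparametric literature; the summary does not state the exact parametrization used in the paper, so the function names (`normal_moments`, `snp_density`) and the coefficient convention are hypothetical and chosen only for illustration.

```python
import numpy as np


def normal_moments(max_order):
    """Raw moments E[Z^k] of a standard normal, for k = 0..max_order."""
    m = np.zeros(max_order + 1)
    m[0] = 1.0
    for k in range(2, max_order + 1, 2):
        # Even moments follow E[Z^k] = (k - 1) * E[Z^(k-2)]; odd moments are 0.
        m[k] = (k - 1) * m[k - 2]
    return m


def snp_density(z, coef):
    """Assumed SNP form: f(z) = P(z)^2 * phi(z) / c.

    coef holds the polynomial coefficients of P in ascending powers
    (an illustrative convention, not taken from the paper).
    """
    coef = np.asarray(coef, dtype=float)
    # Coefficients of P(z)^2, ascending powers.
    squared = np.convolve(coef, coef)
    # Normalizing constant c = E[P(Z)^2] under the standard normal,
    # available in closed form from the normal moments.
    c = squared @ normal_moments(len(squared) - 1)
    z = np.asarray(z, dtype=float)
    phi = np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)
    poly = np.polynomial.polynomial.polyval(z, coef)
    return poly**2 * phi / c


# Example: a degree-2 polynomial gives a skewed, non-normal shape.
grid = np.linspace(-12.0, 12.0, 24001)
density = snp_density(grid, [1.0, 0.5, -0.3])
total_mass = np.sum(density) * (grid[1] - grid[0])  # simple Riemann sum
```

With `coef = [1.0]` the sketch reduces to the standard normal density, and with `coef = [0.0, 1.0]` it gives a symmetric bimodal density that vanishes at zero. The normalizing constant is computed in closed form, which is the kind of property that keeps a likelihood explicit under such a specification.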

MSC:

62J12 Generalized linear models (logistic models)
62G07 Density estimation
65C05 Monte Carlo methods
