Half-trek criterion for generic identifiability of linear structural equation models. (English) Zbl 1257.62059

Summary: A linear structural equation model relates random variables of interest and corresponding Gaussian noise terms via a linear equation system. Each such model can be represented by a mixed graph in which directed edges encode the linear equations and bidirected edges indicate possible correlations among noise terms. We study parameter identifiability in these models, that is, we ask for conditions that ensure that the edge coefficients and correlations appearing in a linear structural equation model can be uniquely recovered from the covariance matrix of the associated distribution. We treat the case of generic identifiability, where unique recovery is possible for almost every choice of parameters. We give a new graphical condition that is sufficient for generic identifiability and can be verified in time that is polynomial in the size of the graph. It improves criteria from prior work and does not require the directed part of the graph to be acyclic. We also develop a related necessary condition and examine the “gap” between sufficient and necessary conditions through simulations on graphs with \(25\) or \(50\) nodes, as well as exhaustive algebraic computations for graphs with up to five nodes.
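As a small illustration of the recovery question posed in the summary (not of the half-trek criterion itself), the following sketch assumes the usual parametrization \(\Sigma = (I-\Lambda)^{-T}\,\Omega\,(I-\Lambda)^{-1}\), where \(\Lambda\) holds the directed-edge coefficients and \(\Omega\) the noise covariances. The three-node graph \(1 \to 2 \to 3\) with bidirected edge \(2 \leftrightarrow 3\), the parameter values, and the helper names `matmul` and `covariance` are assumptions chosen for this example and do not come from the record.

```python
from fractions import Fraction as F

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def covariance(lam, omega):
    """Sigma = (I - Lam)^{-T} Omega (I - Lam)^{-1} for an acyclic graph."""
    n = len(lam)
    identity = [[F(int(i == j)) for j in range(n)] for i in range(n)]
    # Acyclic => Lam is nilpotent, so (I - Lam)^{-1} = I + Lam + ... + Lam^(n-1).
    inv, power = identity, identity
    for _ in range(n - 1):
        power = matmul(power, lam)
        inv = [[inv[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    inv_t = [list(col) for col in zip(*inv)]
    return matmul(matmul(inv_t, omega), inv)

# Hypothetical parameters: edge 1 -> 2 has coefficient a, edge 2 -> 3 has b,
# and the noise terms of nodes 2 and 3 are correlated (bidirected edge 2 <-> 3).
a, b = F(1, 2), F(-3, 4)
lam = [[F(0), a, F(0)],
       [F(0), F(0), b],
       [F(0), F(0), F(0)]]
omega = [[F(1), F(0), F(0)],
         [F(0), F(2), F(1, 2)],
         [F(0), F(1, 2), F(1)]]

sigma = covariance(lam, omega)

# Recover the edge coefficients from the covariance matrix alone.
a_hat = sigma[0][1] / sigma[0][0]   # sigma_12 / sigma_11
b_hat = sigma[0][2] / sigma[0][1]   # sigma_13 / sigma_12, requires a != 0
print(a_hat == a, b_hat == b)
```

In this example the coefficient on \(2 \to 3\) is recovered as \(\sigma_{13}/\sigma_{12}\), which is well defined whenever the coefficient on \(1 \to 2\) is nonzero, that is, for almost every parameter choice. The naive ratio \(\sigma_{23}/\sigma_{22}\) does not return the coefficient here, because the correlated noise terms contribute to \(\sigma_{23}\).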

MSC:

62H05 Characterization and structure theory for multivariate probability distributions; copulas
62J05 Linear regression; mixed models
05C90 Applications of graph theory
62H20 Measures of association (correlation, canonical correlation, etc.)
65C60 Computational problems in statistics (MSC2010)

Software:

R; SINGULAR; Matlab; TETRAD

References:

[1] Bollen, K. A. (1989). Structural Equations with Latent Variables. Wiley, New York. · Zbl 0731.62159
[2] Brito, C. (2004). Graphical methods for identification in structural equation models. Ph.D. thesis, UCLA Computer Science Dept.
[3] Brito, C. and Pearl, J. (2002a). A new identification condition for recursive models with correlated errors. Struct. Equ. Model. 9 459-474. · doi:10.1207/S15328007SEM0904_1
[4] Brito, C. and Pearl, J. (2002b). A graphical criterion for the identification of causal effects in linear models. In Proceedings of the Eighteenth National Conference on Artificial Intelligence (AAAI) 533-538. AAAI Press, Palo Alto, CA.
[5] Brito, C. and Pearl, J. (2006). Graphical condition for identification in recursive SEM. In Proceedings of the Twenty-Second Conference on Uncertainty in Artificial Intelligence (R. Dechter and T. S. Richardson, eds.) 47-54. AUAI Press, Arlington, VA.
[6] Chan, H. and Kuroki, M. (2010). Using descendants as instrumental variables for the identification of direct causal effects in linear SEMs. In Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics (Y. W. Teh and M. Titterington, eds.). J. Mach. Learn. Res. (JMLR), Workshop and Conference Proceedings 9 73-80. Available at .
[7] Cormen, T. H., Leiserson, C. E., Rivest, R. L. and Stein, C. (2001). Introduction to Algorithms, 2nd ed. MIT Press, Cambridge, MA. · Zbl 1047.68161
[8] Cox, D., Little, J. and O’Shea, D. (2007). Ideals, Varieties, and Algorithms, 3rd ed. Springer, New York.
[9] Decker, W., Greuel, G.-M., Pfister, G. and Schönemann, H. (2011). Singular 3-1-3 - A computer algebra system for polynomial computations. Available at . · Zbl 0902.14040
[10] Didelez, V., Meng, S. and Sheehan, N. A. (2010). Assumptions of IV methods for observational epidemiology. Statist. Sci. 25 22-40. · Zbl 1328.62587 · doi:10.1214/09-STS316
[11] Drton, M., Foygel, R. and Sullivant, S. (2011). Global identifiability of linear structural equation models. Ann. Statist. 39 865-886. · Zbl 1215.62052 · doi:10.1214/10-AOS859
[12] Evans, W. N. and Ringel, J. S. (1999). Can higher cigarette taxes improve birth outcomes? Journal of Public Economics 72 135-154.
[13] Ford, L. R. Jr. and Fulkerson, D. R. (1962). Flows in Networks. Princeton Univ. Press, Princeton, NJ. · Zbl 0106.34802
[14] Foygel, R., Draisma, J. and Drton, M. (2012). Supplement to “Half-trek criterion for generic identifiability of linear structural equation models.” · Zbl 1257.62059
[15] Garcia-Puente, L. D., Spielvogel, S. and Sullivant, S. (2010). Identifying causal effects with computer algebra. In Proceedings of the Twenty-sixth Conference on Uncertainty in Artificial Intelligence (UAI) (P. Grünwald and P. Spirtes, eds.). AUAI Press.
[16] MathWorks Inc. (2010). MATLAB version 7.10.0 (R2010a). Natick, MA.
[17] Okamoto, M. (1973). Distinctness of the eigenvalues of a quadratic form in a multivariate sample. Ann. Statist. 1 763-765. · Zbl 0261.62043 · doi:10.1214/aos/1176342472
[18] Pearl, J. (2000). Causality: Models, Reasoning, and Inference. Cambridge Univ. Press, Cambridge. · Zbl 0959.68116
[19] R Development Core Team. (2011). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria.
[20] Richardson, T. and Spirtes, P. (2002). Ancestral graph Markov models. Ann. Statist. 30 962-1030. · Zbl 1033.60008 · doi:10.1214/aos/1031689015
[21] Schrijver, A. (2004). Combinatorial Optimization. Polyhedra and Efficiency. Algorithms and Combinatorics 24 A. Springer, Berlin. · Zbl 1072.90030
[22] Spirtes, P., Glymour, C. and Scheines, R. (2000). Causation, Prediction, and Search, 2nd ed. MIT Press, Cambridge, MA. · Zbl 0806.62001
[23] Sullivant, S., Talaska, K. and Draisma, J. (2010). Trek separation for Gaussian graphical models. Ann. Statist. 38 1665-1685. · Zbl 1189.62091 · doi:10.1214/09-AOS760
[24] Tian, J. (2005). Identifying direct causal effects in linear models. In Proceedings of the Twentieth National Conference on Artificial Intelligence (AAAI) 346-353. AAAI Press, Palo Alto, CA.
[25] Tian, J. (2009). Parameter identification in a class of linear structural equation models. In Proceedings of the Twenty-first International Joint Conference on Artificial Intelligence (IJCAI) 1970-1975. AAAI Press, Palo Alto, CA.
[26] Wermuth, N. (2011). Probability distributions with summary graph structure. Bernoulli 17 845-879. · Zbl 1245.62062 · doi:10.3150/10-BEJ309
[27] Wright, S. (1921). Correlation and causation. J. Agricultural Research 20 557-585.
[28] Wright, S. (1934). The method of path coefficients. Ann. Math. Statist. 5 161-215. · Zbl 0010.31305 · doi:10.1214/aoms/1177732676