
A focused information criterion for graphical models. (English) Zbl 1331.62057

Summary: A new model-selection method for Gaussian Bayesian networks and Markov networks, with extensions towards ancestral graphs, is constructed to have good mean squared error properties. The method is based on the focused information criterion and allows the fitting of individually tailored models. The focus, that is, the purpose for which the model will be used, directs the selection. It is shown that using the focused information criterion leads to a graph with small mean squared error; this ensures accurate estimation with the graphical model, so that estimation rather than explanation is the main objective here. Two situations that commonly occur in practice are treated: data-driven estimation of a graphical model, and improvement of an already pre-specified feasible model. The search algorithms are illustrated on data examples and compared with existing methods in a simulation study.
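To make the selection principle concrete: in the focused information criterion (FIC) framework of Claeskens and Hjort, each candidate model is scored by an estimate of the mean squared error of the estimator of a user-chosen focus parameter, and the model with the smallest estimated MSE is selected. The sketch below is a deliberately simplified illustration of that idea for a linear-regression focus mu(x0) = x0'beta, using a crude squared-bias-plus-variance proxy (submodel estimate versus full-model estimate, plus the submodel's estimated variance); it is not the paper's exact graphical-model FIC, and the function name `fic_score` and all data are illustrative.

```python
import numpy as np
from itertools import combinations

def fic_score(X, y, subset, x0):
    """Crude FIC-style score for the focus mu(x0) = x0 @ beta:
    squared bias proxy (submodel vs. full-model focus estimate)
    plus the submodel's estimated variance of the focus estimator."""
    n, p = X.shape
    # Full-model OLS fit: reference point for the bias proxy
    beta_full, *_ = np.linalg.lstsq(X, y, rcond=None)
    mu_full = x0 @ beta_full
    sigma2 = np.sum((y - X @ beta_full) ** 2) / (n - p)
    # Submodel OLS fit on the selected columns
    Xs, x0s = X[:, subset], x0[subset]
    beta_s, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    mu_s = x0s @ beta_s
    # Estimated variance of the submodel focus estimator
    var_s = sigma2 * x0s @ np.linalg.inv(Xs.T @ Xs) @ x0s
    return (mu_s - mu_full) ** 2 + var_s

rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
beta_true = np.array([1.0, 2.0, 0.0, 0.0])   # last two covariates are noise
y = X @ beta_true + rng.normal(size=n)
x0 = np.array([1.0, 0.5, 0.5, 0.5])          # covariate value of interest (the "focus")

# All candidate submodels that retain the intercept (column 0)
candidates = [(0,) + c for r in range(4) for c in combinations(range(1, 4), r)]
best = min(candidates, key=lambda s: fic_score(X, y, list(s), x0))
print("selected columns:", best)
```

Note the key design point: because the score depends on x0, a different focus can legitimately select a different model, which is what distinguishes the FIC from global criteria such as AIC or BIC.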

MSC:

62B10 Statistical aspects of information-theoretic topics
62A09 Graphical methods in statistics
65C60 Computational problems in statistics (MSC2010)
