×

Efficient learning of quadratic variance function directed acyclic graphs via topological layers. (English) Zbl 07633319

Summary: Directed acyclic graph (DAG) models are widely used to represent casual relationships among random variables in many application domains. This article studies a special class of non-Gaussian DAG models, where the conditional variance of each node given its parents is a quadratic function of its conditional mean. Such a class of non-Gaussian DAG models are fairly flexible and admit many popular distributions as special cases, including Poisson, Binomial, Geometric, Exponential, and Gamma. To facilitate learning, we introduce a novel concept of topological layers, and develop an efficient DAG learning algorithm. It first reconstructs the topological layers in a hierarchical fashion and then recovers the directed edges between nodes in different layers, which requires much less computational cost than most existing algorithms in literature. Its advantage is also demonstrated in a number of simulated examples, as well as its applications to two real-life datasets, including an NBA player statistics data and a cosmetic sales data collected by Alibaba. Supplementary materials for this article are available online.

MSC:

62-XX Statistics

References:

[1] Barabási, A.; Albert, R., “Emergence of Scaling in Random Networks, Science, 286, 509-512 (1999) · Zbl 1226.05223 · doi:10.1126/science.286.5439.509
[2] Brown, L. D.; Cai, T. T.; Zhou, H. H., “Nonparametric Regression in Exponential Families, The Annals of Statistics, 38, 2005-2046 (2010) · Zbl 1202.62050 · doi:10.1214/09-AOS762
[3] Bühlmann, P.; Peters, J.; Ernest, J., “CAM: Causal Additive Models, High-Dimensional Order Search and Penalized Regression, The Annals of Statistics, 42, 2526-2556 (2014) · Zbl 1309.62063 · doi:10.1214/14-AOS1260
[4] Chen, W. Y.; Drton, M.; Wang, Y. S., “On Causal Discovery with an Equal-Variance Assumption, Biometrika, 106, 973-980 (2019) · Zbl 1435.62166 · doi:10.1093/biomet/asz049
[5] Chickering, D. W., “Optimal Structure Identification with Greedy Search, The Journal of Machine Learning Research, 3, 507-554 (2003) · Zbl 1084.68519
[6] Cormen, T. H.; Leiserson, C. E.; Rivest, R. L.; Stein, C., Introduction to Algorithms (2009), Cambridge, MA: MIT Press, Cambridge, MA · Zbl 1187.68679
[7] Erdös, P.; Rényi, A., “On the Evolution of Random Graphs,”, Publications of the Mathematical Institute of the Hungarian Academy of Sciences, 5, 17-60 (1960) · Zbl 0103.16301
[8] Friedman, J.; Hastie, T.; Tibshirani, R., “Regularization Paths for Generalized Linear Models via Coordinate Descent, Journal of Statistical Software, 33, 1-22 (2010) · doi:10.18637/jss.v033.i01
[9] Heckerman, D.; Geiger, D.; Chickering, D. M., “Learning Bayesian Networks: The Combination of Knowledge and Statistical Data, Machine Learning, 20, 197-243 (1995) · Zbl 0831.68096 · doi:10.1007/BF00994016
[10] Kalisch, M.; Bühlmann, P., “Estimating High-Dimensional Directed Acyclic Graphs with the PC-Algorithm,”, The Journal of Machine Learning Research, 8, 613-636 (2007) · Zbl 1222.68229
[11] Kalisch, M.; Mächler, M.; Colombo, D.; Maathuis, M. H.; Bühlmann, P., “Causal Inference Using Graphical Models with the R package pcalg, Journal of Statistical Software, 47, 1-26 (2012) · doi:10.18637/jss.v047.i11
[12] Koller, D.; Friedman, N., Probabilistic Graphical Models: Principles and Techniques (2009), Cambridge, MA: MIT Press, Cambridge, MA · Zbl 1183.68483
[13] Meinshausen, N.; Bühlmann, P., “High-Dimensional Graphs and Cariable Selection with the Lasso, The Annals of Statistics, 34, 1436-1462 (2006) · Zbl 1113.62082 · doi:10.1214/009053606000000281
[14] Morris, C. N., “Natural Exponential Families with Quadratic Variance Functions, The Annals of Statistics, 10, 65-80 (1982) · Zbl 0498.62015 · doi:10.1214/aos/1176345690
[15] Nandy, P.; Hauser, A.; Maathuis, M. H., “High-Dimensional Consistency in Score-Based and Hybrid Structure Learning, The Annals of Statistics, 46, 3151-3183 (2018) · Zbl 1411.62144 · doi:10.1214/17-AOS1654
[16] Park, G.; Park, S., High-Dimensional Poisson Structural Equation Model Learning via \(####\)-Regularized Regression, The Journal of Machine Learning Research, 18, 1-41 (2019) · Zbl 1434.68445
[17] Park, G.; Raskutti, G., “Learning Quadratic Variance Function DAG Models via Overdispersion Scoring,”, The Journal of Machine Learning Research, 18, 1-44 (2018) · Zbl 1470.68156
[18] Peters, J.; Bühlmann, P., “Identifiability of Gaussian Structural Equation Models with Equal Error Variances, Biometrika, 101, 219-228 (2014) · Zbl 1285.62005 · doi:10.1093/biomet/ast043
[19] Peters, J.; Mooij, J. M.; Janzing, D.; Schölkopf, B., “Causal Discovery with Continuous Additive Noise Models,”, The Journal of Machine Learning Research, 15, 2009-2053 (2014) · Zbl 1318.68151
[20] Sachs, K.; Perez, O.; Peer, D.; Lauffenburger, D. A.; Nolan, G. P., “Causal Protein-Signaling Networks Derived from Multiparameter Single-Cell Data, Science, 308, 523-529 (2005) · doi:10.1126/science.1105809
[21] Sanford, A. D.; Moosa, I. A., “A Bayesian Network Structure for Operational Risk Modelling in Structured Finance Operations, Journal of the Operational Research Society, 63, 431-444 (2012) · doi:10.1057/jors.2011.7
[22] Scutari, M., “Learning Bayesian Networks with the bnlearn R package, Journal of Statistical Software, 35, 1-22 (2010) · doi:10.18637/jss.v035.i03
[23] Shimizu, S.; Hoyer, P. O.; Hyvärinen, A.; Kerminen, A. J., “A Linear Non-Gaussian Acyclic Model for Causal Discovery,”, The Journal of Machine Learning Research, 7, 2003-2030 (2006) · Zbl 1222.68304
[24] Shimizu, S.; Inazumi, T.; Sogawa, Y.; Hyvärinen, A.; Kawahara, Y.; Washio, T.; Hoyer, P. O.; Bollen, K., “DirectLiNGAM: A Direct Method for Learning a Linear Non-Gaussian Structural Equation Model,”, The Journal of Machine Learning Research, 12, 1225-1248 (2011) · Zbl 1280.68195
[25] Spirtes, P.; Glymour, C. N.; Scheines, R., Causation, Prediction, and Search (2000), Cambridge, MA: MIT Press, Cambridge, MA
[26] Sun, W. W.; Wang, J. H.; Fang, Y. X., “Consistent Selection of Tuning Parameters via Variable Selection Stability, The Journal of Machine Learning Research, 14, 3419-3440 (2013) · Zbl 1318.62241
[27] Tsamardinos, I.; Brown, L. E.; Aliferis, C. F., “The Max-Min Hill-Climbing Bayesian Network Structure Learning Algorithm, Machine Learning, 65, 31-78 (2006) · Zbl 1470.68192 · doi:10.1007/s10994-006-6889-7
[28] Wang, Y. S.; Drton, M., “High-Dimensional Causal Discovery Under Non-Gaussianity,”, Biometrika, 107, 41-59 (2020) · Zbl 1435.62213
[29] Yang, E.; Ravikumar, P.; Allen, G. I.; Liu, Z. D., “Graphical Models via Univariate Exponential Family Distributions,”, The Journal of Machine Learning Research, 16, 3813-3847 (2015) · Zbl 1351.62111
[30] Yuan, Y. P.; Shen, X. T.; Pan, W.; Wang, Z. Z., “Constrained Likelihood for Reconstructing a Directed Acyclic Gaussian Graph, Biometrika, 106, 109-125 (2019) · Zbl 1506.62298 · doi:10.1093/biomet/asy057
[31] Zheng, X.; Aragam, B.; Ravikumar, P. K.; Xing, E. P., DAGs with NO TEARS: Continuous Optimization for Structure Learning, Advances in Neural Information Processing Systems (NIPS, 9472-9483 (2018)
[32] Zhu, S. Y.; Ng, I.; Chen, Z. T., “Causal Discovery with Reinforcement Learning,” in International Conference on Learning Representations (ICLR) (2020)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.