A ring-based distributed algorithm for learning high-dimensional Bayesian networks. (English) Zbl 07897220

Bouraoui, Zied (ed.) et al., Symbolic and quantitative approaches to reasoning with uncertainty. 17th European conference, ECSQARU 2023, Arras, France, September 19–22, 2023. Proceedings. Cham: Springer. Lect. Notes Comput. Sci. 14294, 123-135 (2024).
Summary: Learning Bayesian Networks (BNs) from high-dimensional data is a complex and time-consuming task. Although there are approaches based on horizontal (instances) or vertical (variables) partitioning in the literature, none can guarantee the same theoretical properties as the Greedy Equivalence Search (GES) algorithm, except those based on the GES algorithm itself. In this paper, we propose a directed ring-based distributed method that uses GES as the local learning algorithm, ensuring the same theoretical properties as GES but requiring less CPU time. The method involves partitioning the set of possible edges and constraining each processor in the ring to work only with its received subset. The global learning process is an iterative algorithm that carries out several rounds until a convergence criterion is met. In each round, each processor receives a BN from its predecessor in the ring, fuses it with its own BN model, and uses the result as the starting solution for a local learning process constrained to its set of edges. Subsequently, it sends the model obtained to its successor in the ring. Experiments were carried out on three large domains (400–1000 variables), demonstrating our proposal’s effectiveness compared to GES and its fast version (fGES).
For the entire collection see [Zbl 1543.68012].
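The iterative ring scheme described in the summary can be sketched in a few lines. Everything below is an illustrative simplification: the worker count and function names are invented, BN fusion is reduced to an edge union, and a toy parity-based learner stands in for GES, which the paper uses as the local learning algorithm together with a proper structural fusion operator.

```python
from itertools import combinations

def ring_learn(variables, n_workers, local_learn, max_rounds=10):
    """Directed-ring scheme: partition the candidate edges, then iterate
    receive-fuse-learn-send rounds until no worker's model changes."""
    all_edges = list(combinations(variables, 2))
    # Partition the set of possible edges among the ring's workers.
    subsets = [set(all_edges[i::n_workers]) for i in range(n_workers)]
    models = [set() for _ in range(n_workers)]  # each worker's current edge set
    for _ in range(max_rounds):
        previous = [frozenset(m) for m in models]
        for i in range(n_workers):
            # Receive the predecessor's model and fuse it with our own
            # (fusion simplified here to an edge union).
            fused = models[i] | models[(i - 1) % n_workers]
            # Local learning constrained to this worker's edge subset.
            models[i] = local_learn(fused, subsets[i])
        if all(frozenset(m) == p for m, p in zip(models, previous)):
            break  # convergence: a full round with no change
    return set().union(*models)  # final fusion of all local models

# Toy local learner (hypothetical stand-in for GES): keep the fused edges
# and add the edges of the worker's own subset whose endpoints share parity.
def toy_learn(start, subset):
    return start | {(a, b) for (a, b) in subset if a % 2 == b % 2}

result = ring_learn(range(6), n_workers=3, local_learn=toy_learn, max_rounds=5)
```

With a monotone local learner and union fusion, edges learned by one worker propagate around the ring, so every worker's model stabilizes after a few rounds and the loop exits on the convergence check rather than the round limit.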

MSC:

68T37 Reasoning under uncertainty in the context of artificial intelligence

Software:

bnlearn
Full Text: DOI

References:

[1] Alonso-Barba, J.I., de la Ossa, L., Gámez, J.A., Puerta, J.M.: Scaling up the greedy equivalence search algorithm by constraining the search space of equivalence classes. Int. J. Approximate Reasoning 54(4), 429-451 (2013). doi:10.1016/j.ijar.2012.09.004 · Zbl 1264.68123
[2] Arias, J.; Gámez, JA; Puerta, JM, Structural learning of Bayesian networks via constrained hill climbing algorithms: adjusting trade-off between efficiency and accuracy, Int. J. Intell. Syst., 30, 3, 292-325, 2015 · doi:10.1002/int.21701
[3] de Campos, C.P., Ji, Q.: Efficient structure learning of Bayesian networks using constraints. J. Mach. Learn. Res. 12, 663-689 (2011). http://jmlr.org/papers/v12/decampos11a.html · Zbl 1280.68226
[4] de Campos, L.M.: A scoring function for learning Bayesian networks based on mutual information and conditional independence tests. J. Mach. Learn. Res. 7, 2149-2187 (2006). http://jmlr.org/papers/v7/decampos06a.html · Zbl 1222.62036
[5] Chickering, D.M.: Optimal structure identification with greedy search. J. Mach. Learn. Res. 3(Nov), 507-554 (2002). http://www.jmlr.org/papers/v3/chickering02b.html · Zbl 1084.68519
[6] Chickering, D.M., Heckerman, D., Meek, C.: Large-sample learning of Bayesian networks is NP-hard. J. Mach. Learn. Res. 5, 1287-1330 (2004). https://www.jmlr.org/papers/v5/chickering04a.html · Zbl 1222.68169
[7] Gámez, JA; Mateo, JL; Puerta, JM, Learning Bayesian networks by hill climbing: efficient methods based on progressive restriction of the neighborhood, Data Mining Knowl. Discov., 22, 1, 106-148, 2011 · Zbl 1235.68152 · doi:10.1007/s10618-010-0178-6
[8] Heckerman, D.; Geiger, D.; Chickering, D., Learning Bayesian networks: the combination of knowledge and statistical data, Mach. Learn., 20, 197-243, 1995 · Zbl 0831.68096 · doi:10.1007/BF00994016
[9] Jensen, FV; Nielsen, TD, Bayesian Networks and Decision Graphs, 2007, New York: Springer, New York · Zbl 1277.62007 · doi:10.1007/978-0-387-68282-2
[10] de Jongh, M., Druzdzel, M.J.: A comparison of structural distance measures for causal Bayesian network models. In: Klopotek, M., Przepiorkowski, A., Wierzchon, S.T., Trojanowski, K. (eds.) Recent Advances in Intelligent Information Systems, Challenging Problems of Science, Computer Science series, pp. 443-456. Academic Publishing House EXIT (2009). doi:10.1007/978-3-030-34152-7
[11] Kim, G-H; Kim, S-H, Marginal information for structure learning, Stat. Comput., 30, 2, 331-349, 2019 · Zbl 1436.62210 · doi:10.1007/s11222-019-09877-x
[12] Kjaerulff, UB; Madsen, AL, Bayesian Networks and Influence Diagrams: A Guide to Construction and Analysis, 2013, New York: Springer, New York · Zbl 1257.68008 · doi:10.1007/978-1-4614-5104-4
[13] Koller, D.; Friedman, N., Probabilistic Graphical Models: Principles and Techniques - Adaptive Computation and Machine Learning, 2009, Cambridge: The MIT Press, Cambridge · Zbl 1183.68483
[14] Krier, C., François, D., Rossi, F., Verleysen, M.: Feature clustering and mutual information for the selection of variables in spectral data, pp. 157-162 (2007)
[15] Li, T.; Sahu, AK; Talwalkar, A.; Smith, V., Federated learning: challenges, methods, and future directions, IEEE Signal Process. Mag., 37, 3, 50-60, 2020 · doi:10.1109/MSP.2020.2975749
[16] Lin, Y., Druzdzel, M.J.: Computational advantages of relevance reasoning in Bayesian belief networks. In: Proceedings of the Thirteenth Conference on Uncertainty in Artificial Intelligence, pp. 342-350. UAI 1997, Morgan Kaufmann Publishers Inc. (1997)
[17] Peña, J., Finding consensus Bayesian network structures, J. Artif. Intell. Res. (JAIR), 42, 661-687, 2011 · Zbl 1234.68344 · doi:10.1613/jair.3427
[18] Pearl, J., Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, 1988, San Francisco: Morgan Kaufmann Publishers Inc., San Francisco
[19] Puerta, JM; Aledo, JA; Gámez, JA; Laborda, JD, Efficient and accurate structural fusion of Bayesian networks, Inf. Fusion, 66, 155-169, 2021 · doi:10.1016/j.inffus.2020.09.003
[20] Ramsey, J.; Glymour, M.; Sanchez-Romero, R.; Glymour, C., A million variables and more: the fast greedy equivalence search algorithm for learning high-dimensional graphical causal models, with an application to functional magnetic resonance images, Int. J. Data Sci. Anal., 3, 2, 121-129, 2016 · doi:10.1007/s41060-016-0032-z
[21] Scanagatta, M., de Campos, C.P., Corani, G., Zaffalon, M.: Learning Bayesian networks with thousands of variables. In: Proceedings of the 28th International Conference on Neural Information Processing Systems (NIPS 2015), vol. 2, pp. 1864-1872. MIT Press (2015)
[22] Scanagatta, M.; Salmerón, A.; Stella, F., A survey on Bayesian network structure learning from data, Progress Artif. Intell., 8, 4, 425-439, 2019 · doi:10.1007/s13748-019-00194-y
[23] Scutari, M., Learning Bayesian networks with the bnlearn R package, J. Stat. Softw., 35, 3, 1-22, 2010 · doi:10.18637/jss.v035.i03
[24] Teyssier, M., Koller, D.: Ordering-based search: a simple and effective algorithm for learning Bayesian networks. In: UAI 2005, pp. 584-590. AUAI Press, Arlington, Virginia, USA (2005)
[25] Tsamardinos, I.; Brown, LE; Aliferis, CF, The max-min hill-climbing Bayesian network structure learning algorithm, Mach. Learn., 65, 1, 31-78, 2006 · Zbl 1470.68192 · doi:10.1007/s10994-006-6889-7