×

Marginal information for structure learning. (English) Zbl 1436.62210

Summary: Structure learning for Bayesian networks has been made in a heuristic mode in search of an optimal model to avoid an explosive computational burden. In the learning process, a structural error which occurred at a point of learning may deteriorate its subsequent learning. We proposed a remedial approach to this error-for-error process by using marginal model structures. The remedy is made by fixing local errors in structure in reference to the marginal structures. In this sense, we call the remedy a marginally corrective procedure. We devised a new score function for the procedure which consists of two components, the likelihood function of a model and a discrepancy measure in marginal structures. The proposed method compares favourably with a couple of the most popular algorithms as shown in experiments with benchmark data sets.

MSC:

62H17 Contingency tables
62-08 Computational methods for problems pertaining to statistics
68T05 Learning and adaptive systems in artificial intelligence

Software:

bnlearn; Hailfinder
Full Text: DOI

References:

[1] Abramson, B.; Brown, J.; Edwards, W.; Murphy, A.; Winkler, R., Hailfinder: a Bayesian system for forecasting severe weather, Int. J. Forecast., 12, 1, 57-71 (1996)
[2] Acid, S.; De Campos, L., Searching for Bayesian network structures in the space of restricted acyclic partially directed graphs, J. Artif. Intell. Res., 18, 445-490 (2003) · Zbl 1056.68142
[3] Amirkhani, H.; Rahmati, M.; Lucas, P.; Hommersom, A., Exploiting experts knowledge for structure learning of Bayesian networks, IEEE Trans. Pattern Anal. Mach. Intell., 39, 11, 2154-2170 (2017)
[4] Beinlich, I.; Suermondt, H.; Chavez, R.; Cooper, G., The alarm monitoring system: a case study with two probabilistic inference techniques for belief networks, Second European Conference on Artificial Intelligence in Medicine, 38, 247-256 (1989)
[5] Binder, J.; Koller, D.; Russell, S.; Kanazawa, K., Adaptive probabilistic networks with hidden variables, Mach. Learn., 29, 2-3, 213-244 (1997) · Zbl 0892.68079
[6] Birch, M., Maximum likelihood in three-way contingency tables, J. R. Stat. Soc., 25, 220-223 (1963) · Zbl 0121.14001
[7] Birch, M., The detection of partial association I: the \(2\times 2\) case, J. R. Stat. Soc., 26, 313-324 (1964) · Zbl 0127.10208
[8] Buntine, W., Theory refinement on Bayesian networks, Proc. Uncertain. Artif. Intell., 7, 52-60 (1991)
[9] Chen, X.; Anantha, G.; Wang, X., An effective structure learning method for constructing gene networks, Bioinformatics, 22, 1367-1374 (2006)
[10] Chickering, David Maxwell, Learning Bayesian Networks is NP-Complete, Learning from Data, 121-130 (1996), New York, NY: Springer New York, New York, NY
[11] Chickering, D., Optimal structure identification with greedy search, J. Mach. Learn. Res., 3, 507-554 (2002) · Zbl 1084.68519
[12] Chickering, D., Geiger, D., Heckerman, D.: Learning Bayesian networks: search methods and experimental results. In: Proceedings of the Fifth International Workshop on Artificial Intelligence and Statistics, pp. 112-128 (1995) · Zbl 0831.68096
[13] Cochran, W., The chi-square test of goodness of fit, Ann. Math. Stat., 23, 315-345 (1952) · Zbl 0047.13105
[14] Cooper, G.; Herskovitz, E., A Bayesian method for the induction of probabilistic networks from data, Mach. Learn., 9, 309-347 (1992) · Zbl 0766.68109
[15] Darroach, J.; Lauritzen, S.; Speed, T., Markov fields and log-linear interaction models for contingency tables, Ann. Stat., 8, 3, 522-539 (1980) · Zbl 0444.62064
[16] Dawid, A.; Lauritzen, S., Hyper Markov laws in the statistical analysis of decomposable graphical models, Ann. Stat., 21, 3, 1272-1317 (1993) · Zbl 0815.62038
[17] Dempster, A. P.; Laird, N. M.; Rubin, D. B., Maximum Likelihood from Incomplete Data Via theEMAlgorithm, Journal of the Royal Statistical Society: Series B (Methodological), 39, 1, 1-22 (1977) · Zbl 0364.62022
[18] Diez, F.; Mira, J.; Iturralde, E.; Zybillaga, S., DIAVAL, a Bayesian expert system for echocardiography, Artif Intell. Med., 10, 1, 59-73 (1997)
[19] Fienberg, S.; Kim, S., Combining conditional log-linear structures, J. Am. Stat. Assoc., 94, 455, 229-239 (1999) · Zbl 1072.62588
[20] Friedman, M., The use of ranks to avoid the assumption of normality implicit in the analysis of variance, J. Am. Stat. Assoc., 32, 200, 675-701 (1937) · JFM 63.1098.02
[21] Friedman, N.; Goldszmidt, M.; Wyner, A., Data analysis with Bayesian networks: a bootstrap approach, Proc. Uncertain. Artif. Intell., 15, 196-201 (1999)
[22] Friedman, N.; Linial, M.; Nachman, I.; Pe’Er, D., Using Bayesian networks to analyze expression data, J. Comput. Biol., 7, 3-4, 601-620 (2000)
[23] Gámez, J.; Mateo, J.; Puerta, J., Learning Bayesian networks by hill climbing: efficient methods based on progressive restriction of the neighborhood, Data Min. Knowl. Discov., 22, 1-2, 106-148 (2011) · Zbl 1235.68152
[24] Goh, K.; Cusick, M.; Valle, D.; Childs, B.; Vidal, M.; Barabasi, A., The human disease network, Proc. Natl. Acad. Sci., 104, 8685-8690 (2007)
[25] Heckerman, D.; Geiger, D.; Chickering, D., Learning Bayesian networks: the combination of knowledge and statistical data, Proc. Uncertain. Artif. Intell., 10, 293-301 (1994)
[26] Jiang, C., Leong, T., Poh, K.: PGMC: a framework for probabilistic graphical model combination. In: AMIA Annual Symposium Proceedings, pp. 370-374 (2005)
[27] Kim, S., Conditional log-linear structures for log-linear modelling, Comput. Stat. Data Anal., 50, 8, 2044-2064 (2006) · Zbl 1445.62124
[28] Kim, Sung-Ho, Properties of Markovian Subgraphs of a Decomposable Graph, Lecture Notes in Computer Science, 15-26 (2006), Berlin, Heidelberg: Springer Berlin Heidelberg, Berlin, Heidelberg
[29] Kim, Sung-Ho; Lee, Sangjin, Searching Model Structures Based on Marginal Model Structures, New Developments in Robotics Automation and Control (2008) · Zbl 1157.68405
[30] Koller, D.; Friedman, N., Probabilistic Graphical Models: Principles and Techniques (2009), Cambridge: MIT Press, Cambridge · Zbl 1183.68483
[31] Koller, D., Sahami, M.: Toward optimal feature selection. In: The 13th International Conference on Machine Learning, pp. 284-292 (1996)
[32] Koster, J., Marginalizing and conditioning in graphical models, Bernoulli, 8, 817-840 (2002) · Zbl 1011.60026
[33] Kullback, S.; Leibler, R., Information and sufficiency, Ann. Math. Stat., 22, 79-86 (1951) · Zbl 0042.38403
[34] Larrañaga, P.; Poza, M.; Yurramendi, Y.; Murga, R.; Kuijpers, C., Structure learning of Bayesian networks by genetic algorithms: a performance analysis of control parameters, IEEE Trans. Pattern Anal. Mach. Intell., 18, 9, 912-926 (1996)
[35] Lauritzen, S., Graphical Models (1996), Oxford: Clarendon Press, Oxford · Zbl 0907.62001
[36] Lauritzen, S., Spiegelhalter, D.: Local computation with probabilities on graphical structures and their application to expert systems (with discussion). J. R. Stat. Soc. Ser. B Stat. Methodol. 50(2), 157-224 (1988) · Zbl 0684.68106
[37] Margaritis, D.; Thrun, S., Bayesian network induction via local neighborhoods, Adv. Neural Inf. Process. Syst., 12, 505-511 (1999)
[38] Massa, M.; Lauritzen, S., Combining statistical models, Contemp. Math., 516, 239-259 (2010) · Zbl 1196.62004
[39] Pearl, J.: Bayesian networks: a model of self-activated memory for evidential reasoning. In: Proceedings of the 7th Conference of the Cognitive Science Society, vol. 7, pp. 329-334 (1985)
[40] Pearl, J., Probabilistic Reasoning in Intelligent Systems (1988), San Mateo, CA: Morgan Kaufmann, San Mateo, CA
[41] Richardson, M., Domingos, P.: Learning with knowledge from multiple experts. In: Proceedings of the 20th International Conference on Machine Learning, vol. 20, pp. 624-631 (2003)
[42] Richardson, T.; Spirtes, P., Ancestral graph Markov models, Ann. Stat., 30, 4, 962-1030 (2002) · Zbl 1033.60008
[43] Robinson, R.: Counting labeled acyclic digraphs. In: New Directions in the Theory of Graphs, pp. 239-273 (1973) · Zbl 0259.05116
[44] Scutari, M., Learning Bayesian networks with the bnlearn R package, J. Stat. Softw., 35, 3, 1-22 (2010) · doi:10.18637/jss.v035.i03
[45] Spirtes, P.; Glymour, C.; Scheines, R.; Tiles, J.; Mckee, G.; Dean, G., Causality from probability, Evolving Knowledge in the Natural and Behavioral Sciences, 181-199 (1990), London: Pitman, London
[46] Tillman, R., Danks, D., Glymour, C.: Integrating locally learned causal structures with overlapping variables. In: Advances in Neural Information Processing Systems (NIPS 2008), vol. 21, pp. 1-8 (2008)
[47] Trucco, P.; Cagno, E.; Ruggeri, F.; Grande, O., A Bayesian belief network modelling of organisational factors in risk analysis: a case study in maritime transportation, Reliab. Eng. Syst. Saf., 93, 6, 845-856 (2008)
[48] Tsamardinos, I., Aliferis, C., Statnikov, A.: Algorithms for large scale Markov blanket discovery. In: The 16th International FLAIRS Conference, pp. 376-381 (2003)
[49] Tsamardinos, I.; Brown, L.; Aliferis, C., The max-min hill-climbing Bayesian network structure learning algorithm, Mach. Learn., 65, 1, 31-78 (2006) · Zbl 1470.68192
[50] Tsamardinos, I.; Triantafillou, S.; Lagani, V., Towards integrative causal analysis of heterogeneous data sets and studies, J. Mach. Learn. Res., 13, 1097-1157 (2012) · Zbl 1283.62131
[51] Verma, T.; Pearl, J., Causal networks: semantics and expressiveness, Uncertain. Artif. Intell., 5, 69-76 (1990)
[52] Verma, T.; Pearl, J., Equivalence and synthesis of causal models, Uncertain. Artif. Intell., 6, 220-227 (1991)
[53] Whittaker, J., Graphical Models (1990), New York: Wiley, New York
[54] Wilcoxon, F., Individual comparisons by ranking methods, Biom. Bull., 1, 6, 80-83 (1945)
[55] Yang, L.; Lee, J., Bayesian belief network-based approach for diagnostics and prognostics of semiconductor manufacturing systems, Robot. Comput. Integr. Manuf., 28, 28, 66-74 (2012)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.