×

Discriminative learning of Bayesian network parameters by differential evolution. (English) Zbl 1481.68039

Summary: This work proposes Differential Evolution (DE) to train parameters of Bayesian Networks (BN) for optimizing the Conditional Log-Likelihood (Discriminative Learning) instead of the log-likelihood (Generative Learning). Any given BN structure encodes assumptions about conditional independencies among the attributes and will result in error if they do not hold in the data. Such an error includes the classification dimension. The main goal of Discriminative learning is minimize this type of error. In this sense, although BN Discriminative Parameter Learning algorithms have been proposed, to the best of the authors’ knowledge, a meta-heuristic approach has not been devised yet. Thus, this is our main contribution: to come up with this kind of solution and evaluate its behavior so that its feasibility in this domain can be determined. Regarding the proposed method, the bias provided by the best solution in the population improves DE’s performance. DE variants based on JADE, such as L-SHADE and C-JADE, are especially recommended when introducing adaptation mechanisms of mutation and crossover parameters thus reducing the dependence on their calibration. L-SHADE is computationally recommended over any other DE variant. DE approach works well in essentially every standard situation, so DE approach is robust and at least as good, and often better, than the state-of-art method for Discriminative Learning called WANBIA. Our results suggest a potential benefit for Discriminative parameter learning with strong independence assumptions among attributes and that it typically produces more accurate classifiers than generative learning.

MSC:

68T05 Learning and adaptive systems in artificial intelligence
62H22 Probabilistic graphical models
68T20 Problem solving in the context of artificial intelligence (heuristics, search strategies, etc.)

Software:

C4.5; JADE; JADE
Full Text: DOI

References:

[1] Friedman, N.; Geiger, D.; Goldszmidt, M., Bayesian network classifiers, Mach. Learn., 29, 131-163 (1997) · Zbl 0892.68077
[2] Zaidi, N.; Webb, G.; Carman, M.; Petitjean, F.; Buntine, W.; Hynes, M.; Sterck, H., Efficient parameter learning of bayesian network classifiers, Mach. Learn., 106, 1289-1329 (2017) · Zbl 1454.62183
[3] Su, J.; Zhang, H.; Ling, C.; Matwin, S., Discriminative parameter learning for bayesian networks, In Proc. of the 25th international conference on Machine learning. ACM, 1016-1023 (2008)
[4] Price, K.; Storn, R., Minimizing the real functions of the icec’96 contest by differential evolution, in Proc. of IEEE C. Evol. Computat., 842-844 (1996)
[5] Tarkhaneh, O.; Shen, H., An adaptive differential evolution algorithm to optimal multi-level thresholding for MRI brain image segmentation, Expert Syst. Appl., 138, 112820 (2019)
[6] Bilal; Pant, M.; Zaheer, H.; Garcia-Hernandez, L.; Abraham, A., Differential evolution: a review of more than two decades of research, Eng. Appl. Artif. Intell., 90, 103479 (2020)
[7] Minsky, M., Steps toward artificial intelligence, Proc. IRE, 49, 1, 8-30 (1961)
[8] Keogh, E. J.; Pazzani, M. J., Learning the structure of augmented bayesian classifiers, Int. J. Artif. Intell. Tools, 11, 04, 587-601 (2002)
[9] Russell, S.; Norvig, P., Artificial intelligence: A Modern approach (2003), Prentice-Hall, Englewood Cliffs, NJ
[10] Scanagatta, M.; Salmerón, A.; Stella, F., A survey on bayesian network structure learning from data, Prog. Artif. Intell., 8, 425-439 (2019)
[11] Chow, C.; Liu, C., Approximating discrete probability distributions with dependence trees, IEEE Trans. Inf. Theory, 14, 462-467 (1968) · Zbl 0165.22305
[12] Sundararajan, P.; Mengshoel, O., A genetic algorithm for learning parameters in bayesian networks using expectation maximization, in Proc. of the 8th International Conference on Probabilistic Graphical Models, 511-522 (2016)
[13] Greiner, R.; Zhou, W., Structural extension to logistic regression: Discriminative parameter learning of belief net classifiers, AAAI/IAAI, 167-173 (2002)
[14] Shen, B.; Su, X.; Greiner, R.; Musilek, P.; Cheng, C., Discriminative parameter learning of general bayesian network classifiers, in Proc. Int. C. Tools Art., 296-305 (2003)
[15] Greiner, R.; Su, X.; Shen, B.; Zhou, W., Structural extension to logistic regression: discriminative parameter learning of belief net classifiers, Mach. Learn., 59, 297-322 (2005) · Zbl 1101.68759
[16] Greiner, R.; Grove, A. J.; Schuurmans, D., Learning bayesian nets that perform well, Proceedings of the Thirteenth Conference on Uncertainty in Artificial Intelligence. Proceedings of the Thirteenth Conference on Uncertainty in Artificial Intelligence, UAI’97, 198-207 (1997), Morgan Kaufmann Publishers Inc.: Morgan Kaufmann Publishers Inc. San Francisco, CA, USA
[17] Pernkopf, F.; Wohlmayr, M., On discriminative parameter learning of bayesian network classifiers, in Proc. ECML PKDD, 221-237 (2009)
[18] Zaidi, N. A.; Cerquides, J.; Carman, M. J.; Webb, G. I., Alleviating naive bayes attribute independence assumption by attribute weighting, J. Mach. Learn. Res., 14, 1947-1988 (2013) · Zbl 1317.68199
[19] Zhang, J.; Sanderson, A., Jade: adaptive differential evolution with optional external archive, IEEE Trans. Evol. Comput., 13, 945-958 (2009)
[20] Mezura-Montes, E.; Velázquez-Reyes, J.; Coello, C. A.C., A comparative study of differential evolution variants for global optimization, Proceedings of the 8th annual conference on Genetic and evolutionary computation - GECCO ’06, 485-492 (2006), ACM Press
[21] Tanabe, R.; Fukunaga, A., Improving the search performance of shade using linear population size reduction, in Proc. of IEEE C. Evol. Computat., 1658-1665 (2014)
[22] Gao, S.; Yu, Y.; Wang, Y.; Wang, J.; Cheng, J.; Zhou, M., Chaotic local search-based differential evolution algorithms for optimization, IEEE Transactions on Systems, Man, and Cybernetics: Systems, 1-14 (2020)
[23] Kurgan, L.; Cios, K., CAIM Discretization algorithm, IEEE Trans. Knowl. Data Eng., 16, 2, 145-153 (2004)
[24] Breiman, L., Random forests, Mach. Learn., 45, 1, 5-32 (2001) · Zbl 1007.68152
[25] Ripley, B. D., Pattern recognition and neural networks (1996), Cambridge University Press · Zbl 0853.62046
[26] Quinlan, J. R., Improved use of continuous attributes in c4.5, Journal of Artificial Intelligence Research, 4, 77-90 (1996) · Zbl 0900.68112
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.