Combining gene expression data and prior knowledge for inferring gene regulatory networks via Bayesian networks using structural restrictions. (English) Zbl 1420.92019

Summary: Gene regulatory networks (GRNs) are regarded as the most adequate instrument for gaining clear insight into cellular systems. One of the most successful techniques for reconstructing GRNs from gene expression data is Bayesian networks (BNs), which have proven to be an ideal approach for integrating heterogeneous data in the learning process. So far, however, prior knowledge has been incorporated by using prior beliefs or by using networks as a starting point in the search process. This work considers the use of different kinds of structural restrictions within algorithms for learning BNs from gene expression data. These restrictions codify prior knowledge in such a way that a learned BN must satisfy them. One aim of this work is therefore to provide a detailed review of the use of prior knowledge and gene expression data for inferring GRNs with BNs; the main purpose, however, is to investigate whether structural learning algorithms for BNs can achieve better results from expression data by exploiting this prior knowledge through structural restrictions. The experimental study shows that this new way of incorporating prior knowledge leads to better reverse-engineered networks.
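
As an illustration of what it means for a BN structure to satisfy structural restrictions, the following minimal sketch checks a candidate graph against two kinds of restrictions. The summary does not say which kinds the paper uses, so the required-arc and forbidden-arc sets here are assumptions, and the function names and gene labels are hypothetical; this is not the authors' implementation.

```python
from typing import Dict, List, Set, Tuple

Arc = Tuple[str, str]  # (parent, child)


def is_acyclic(nodes: Set[str], arcs: Set[Arc]) -> bool:
    """Return True if the directed graph has no cycle (Kahn's algorithm)."""
    indegree: Dict[str, int] = {n: 0 for n in nodes}
    children: Dict[str, List[str]] = {n: [] for n in nodes}
    for parent, child in arcs:
        indegree[child] += 1
        children[parent].append(child)
    queue = [n for n in nodes if indegree[n] == 0]
    visited = 0
    while queue:
        node = queue.pop()
        visited += 1
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    # Every node is removed exactly once only if there is no cycle.
    return visited == len(nodes)


def satisfies_restrictions(
    nodes: Set[str],
    arcs: Set[Arc],
    required: Set[Arc],   # assumed restriction type: arcs that must be present
    forbidden: Set[Arc],  # assumed restriction type: arcs that must be absent
) -> bool:
    """Check that a candidate structure is a DAG and respects the prior knowledge."""
    return (
        required <= arcs
        and not (forbidden & arcs)
        and is_acyclic(nodes, arcs)
    )


# Hypothetical genes and prior knowledge, for illustration only.
genes = {"geneA", "geneB", "geneC"}
prior_required = {("geneA", "geneB")}
prior_forbidden = {("geneC", "geneA")}
candidate = {("geneA", "geneB"), ("geneB", "geneC")}
print(satisfies_restrictions(genes, candidate, prior_required, prior_forbidden))
```

In a score-based search, a check of this kind could be used to discard candidate structures that contradict the prior knowledge before they are scored against the expression data.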

MSC:

92C40 Biochemistry, molecular biology
92C42 Systems biology, networks
62P10 Applications of statistics to biology and medical sciences; meta analysis
62F15 Bayesian inference

References:

[1] Acid, S. and L. M. de Campos (2003): “Searching for Bayesian network structures in the space of restricted acyclic partially directed graphs,” J. Artif. Intell. Res., 18, 445-490. · Zbl 1056.68142
[2] Acid, S., L. M. de Campos, J. M. Fernández-Luna, S. Rodríguez, J. M. Rodríguez and J. L. Salcedo (2004): “A comparison of learning algorithms for Bayesian networks: a case study based on data from an emergency medical service,” Artif. Intell. Med., 30, 215-232.
[3] Acid, S., L. M. de Campos and M. Fernández (2013): “Score-based methods for learning Markov boundaries by searching in constrained spaces,” Data Min. Knowl. Disc., 26, 174-212. · Zbl 1260.68318
[4] Aderhold, A., D. Husmeier and M. Grzegorczyk (2014): “Statistical inference of regulatory networks for circadian regulation,” Stat. Appl. Genet. Mo. B., 13, 227-273. · Zbl 1296.92011
[5] Almasri, E., P. Larsen, G. Chen and Y. Dai (2008): “Incorporating literature knowledge in Bayesian network for inferring gene networks with gene expression data,” Lect. Notes Comput. Sc., 4983, 184-195.
[6] Banf, M. and S. Y. Rhee (2017): “Computational inference of gene regulatory networks: Approaches, limitations and opportunities,” Biochim. Biophys. Acta, 1860, 1, 41-52.
[7] Bansal, M., V. Belcastro, A. Ambesi-Impiombato and D. di Bernardo (2007): “How to infer gene networks from expression profiles,” Mol. Syst. Biol., 3, 1. · doi:10.1038/msb4100120
[8] Bellman, R. E. (1957): Dynamic programming, Princeton University Press, Princeton, New Jersey. · Zbl 0077.13605
[9] Buntine, W. (1991): “Theory refinement on Bayesian networks.” In: D’Ambrosio, Bruce D., Smets, Philippe & Bonissone, Piero P. (eds.), Proceedings Uncertainty in Artificial Intelligence, pp. 52-60. · doi:10.1016/B978-1-55860-203-8.50010-3
[10] Buntine, W. (1996): “A guide to the literature on learning probabilistic networks from data,” IEEE T. Knowl. Data En., 8, 195-210.
[11] Chai, L. E., S. K. Loh, S. T. Low, M. S. Mohamad, S. Deris and Z. Zakaria (2014): “A review on the computational approaches for gene regulatory network construction,” Comput. Biol. Med., 48, 55-65.
[12] Chen, G., M. J. Cairelli, H. Kilicoglu, D. Shin and T. C. Rindflesch (2014): “Augmenting microarray data with literature-based knowledge to enhance gene regulatory network inference,” PLoS Comput. Biol., 10, e1003666. · doi:10.1371/journal.pcbi.1003666
[13] Cheng, J., R. Greiner, J. Kelly, D. Bell and W. Liu (2002): “Learning Bayesian networks from data: An information-theory based approach,” Artif. Intell., 137, 43-90. · Zbl 0995.68114
[14] Chickering, D. M. (1995): “A transformational characterization of equivalent Bayesian network structures,” In: Besnard, Philippe & Hanks, Steve (eds.), Proceedings Uncertainty in Artificial Intelligence, pp. 87-98.
[15] Cho, R. J., M. J. Campbell, E. A. Winzeler, L. Steinmetz, A. Conway, L. Wodicka, T. G. Wolfsberg, A. E. Gabrielian, D. Landsman, D. J. Lockhart and R. W. Davis (1998): “A genome-wide transcriptional analysis of the mitotic cell cycle,” Mol. Cell., 2, 65-73.
[16] Chow, C. and C. Liu (1968): “Approximating discrete probability distributions with dependence trees,” IEEE T. Inform. Theory, 14, 462-467. · Zbl 0165.22305
[17] Cooper, G. F. and E. Herskovits (1992): “A Bayesian method for the induction of probabilistic networks from data,” Mach. Learn., 9, 309-347. · Zbl 0766.68109
[18] de Campos, L. M. and J. G. Castellano (2007): “Bayesian network learning algorithms using structural restrictions,” Int. J. Approx. Reason., 45, 2, 233-254. · Zbl 1122.68104
[19] de Campos, L. M. and J. F. Huete (2000): “A new approach for learning belief networks using independence criteria,” Int. J. Approx. Reason., 24, 11-37. · Zbl 0995.68109
[20] Djebbari, A. and J. Quackenbush (2008): “Seeded Bayesian networks: constructing genetic networks from microarray data,” BMC Syst. Biol., 2, 57.
[21] Elvira Consortium (2002): “Elvira: An environment for probabilistic graphical models.” In: Gámez, J. and A. Salmerón (eds.), Proceedings of the 1st European Workshop on Probabilistic Graphical Models, pp. 222-230.
[22] Esteves, G. H. and L. F. L. Reis (2018): “A statistical method for measuring activation of gene regulatory networks,” Stat. Appl. Genet. Mo. B., 17, 3. · Zbl 1398.92088 · doi:10.1515/sagmb-2016-0059
[23] Friedman, N. (2004): “Inferring cellular networks using probabilistic graphical models,” Science, 303, 5659, 799-805.
[24] Friedman, N., M. Linial, I. Nachman and D. Pe’er (2000): “Using Bayesian networks to analyze expression data,” J. Comput. Biol., 7, 601-620.
[25] Gámez, J. A., J. L. Mateo and J. M. Puerta (2011): “Learning Bayesian networks by hill climbing: efficient methods based on progressive restriction of the neighborhood,” Data Min. Knowl. Disc., 22, 106-148. · Zbl 1235.68152
[26] Gifford, D. K. (2001): “Blazing pathways through genetic mountains,” Science, 293, 2049-2051.
[27] Good, I. J. (1965): The estimation of probabilities, The MIT Press, Cambridge, MA. · Zbl 0168.39603
[28] Hartemink, A. J., D. K. Gifford, T. S. Jaakkola and R. A. Young (2001): “Using graphical models and genomic expression data to statistically validate models of genetic regulatory networks,” Pac. Symp. Biocomput., 422-433. · doi:10.1142/9789812799623_0041
[29] Hartemink, A. J., D. K. Gifford, T. S. Jaakkola and R. A. Young (2002a): “Bayesian methods for elucidating genetic regulatory networks,” IEEE Intell. Syst., 17, 37-43.
[30] Hartemink, A. J., D. K. Gifford, T. S. Jaakkola and R. A. Young (2002b): “Combining location and expression data for principled discovery of genetic regulatory network models,” Pac. Symp. Biocomput., 437-449. · doi:10.1142/9789812799623_0041
[31] Heckerman, D., D. Geiger and D. M. Chickering (1995): “Learning Bayesian networks: The combination of knowledge and statistical data,” Mach. Learn., 20, 197-243. · Zbl 0831.68096
[32] Hoefsloot, H. C., S. Smit and A. K. Smilde (2008): “A classification model for the Leiden proteomics competition,” Stat. Appl. Genet. Mo. B., 7, 2. · Zbl 1276.62075 · doi:10.2202/1544-6115.1351
[33] Hughes, T. R., M. J. Marton, A. R. Jones, C. J. Roberts, R. Stoughton, C. D. Armour, H. A. Bennett, E. Coffey, H. Dai, Y. D. He, M. J. Kidd, A. M. King, M. R. Meyer, D. Slade, P. Y. Lum, S. B. Stepaniants, D. D. Shoemaker, D. Gachotte, K. Chakraburtty, J. Simon, M. Bard and S. H. Friend (2000): “Functional discovery via a compendium of expression profiles,” Cell, 102, 109-126.
[34] Imoto, S., T. Higuchi, T. Goto, K. Tashiro, S. Kuhara and S. Miyano (2003a): “Combining microarrays and biological knowledge for estimating gene networks via Bayesian networks,” In: 2nd IEEE Computer Society Bioinformatics Conf., pp. 104-113.
[35] Imoto, S., S. Kim, T. Goto, S. Miyano, S. Aburatani, K. Tashiro and S. Kuhara (2003b): “Bayesian network and nonparametric heteroscedastic regression for nonlinear modeling of genetic network,” J. Bioinf. Comput. Biol., 1, 231-252.
[36] Imoto, S., T. Higuchi, T. Goto, K. Tashiro, S. Kuhara and S. Miyano (2004): “Combining microarrays and biological knowledge for estimating gene networks via Bayesian networks,” J. Bioinf. Comput. Biol., 2, 77-98.
[37] Isci, S., H. Dogan, C. Ozturk and H. H. Otu (2014): “Bayesian network prior: network analysis of biological data using external knowledge,” Bioinformatics, 30, 860-867.
[38] Kanehisa, M., M. Araki, S. Goto, M. Hattori, M. Hirakawa, M. Itoh, T. Katayama, S. Kawashima, S. Okuda, T. Tokimatsu and Y. Yamanishi (2008): “KEGG for linking genomes to life and the environment,” Nucleic Acids Res., 36, 480-484.
[39] Kim, H., G. H. Golub and H. Park (2005): “Missing value estimation for DNA microarray gene expression data: local least squares imputation,” Bioinformatics, 21, 187-198.
[40] Lam, W. and F. Bacchus (1994): “Learning Bayesian belief networks: An approach based on the MDL principle,” Comput. Intell., 10, 269-293.
[41] Larsen, P., E. Almasri, G. Chen and Y. Dai (2007): “A statistical method to incorporate biological knowledge for generating testable novel gene regulatory interactions from microarray experiments,” BMC Bioinformatics, 8, 317. · doi:10.1186/1471-2105-8-317
[42] Le Phillip, P., A. Bahl and L. H. Ungar (2004): “Using prior knowledge to improve genetic network reconstruction from microarray data,” In Silico Biology, 4, 335-353.
[43] Lee, W. P. and W. S. Tzou (2009): “Computational methods for discovering gene networks from expression data,” Brief. Bioinform., 10, 408-423.
[44] Li, S., L. Wu and Z. Zhang (2006): “Constructing biological networks through combined literature mining and microarray analysis: a LMMA approach,” Bioinformatics, 22, 2143-2150.
[45] Linde, J., S. Schulze, S. G. Henkel and R. Guthke (2015): “Data- and knowledge-based modeling of gene regulatory networks: an update,” Exp. and Clin. Sci., 14, 346-378.
[46] Markowetz, F. and R. Spang (2007): “Inferring cellular networks – a review,” BMC Bioinformatics, 8, 6. · doi:10.1186/1471-2105-8-S6-S5
[47] Mewes, H. W., D. Frishman, U. Güldener, G. Mannhaupt, K. F. X. Mayer, M. Mokrejs, B. Morgenstern, M. Münsterkötter, S. Rudd and B. Weil (2002): “MIPS: a database for genomes and protein sequences,” Nucleic Acids Res., 30, 31-34.
[48] Mukherjee, S. and T. P. Speed (2008): “Network inference using informative priors,” PNAS, 105, 14313-14318.
[49] Nariai, N., S. Kim, S. Imoto and S. Miyano (2004): “Using protein-protein interactions for refining gene networks estimated from microarray data by Bayesian networks,” Pac. Symp. Biocomput., 336-347. · doi:10.1142/9789812704856_0032
[50] Nariai, N., Y. Tamada, S. Imoto and S. Miyano (2005): “Estimating gene regulatory networks and protein-protein interactions of Saccharomyces cerevisiae from multiple genome-wide data,” Bioinformatics, 21, 206-212.
[51] Njah, H. and S. Jamoussi (2015): “Weighted ensemble learning of Bayesian network for gene regulatory networks,” Neurocomputing, 150, 404-416.
[52] Oates, C. J., R. Amos and S. E. Spencer (2014): “Quantifying the multi-scale performance of network inference algorithms,” Stat. Appl. Genet. Mo. B., 13, 611-631. · Zbl 1298.92043
[53] Pearl, J. (1988): Probabilistic reasoning in intelligent systems: Networks of plausible inference, Morgan Kaufmann Publishers Inc., San Francisco (CA).
[54] Sachs, K., O. Perez, D. Pe’er, D. A. Lauffenburger and G. P. Nolan (2005): “Causal protein-signaling networks derived from multiparameter single-cell data,” Science, 308, 523-529.
[55] Sauta, E., A. Demartini, F. Vitali, A. Riva and R. Bellazzi (2017): “Data Fusion Approach for Learning Transcriptional Bayesian Networks,” Conf. on Artificial Intelligence in Medicine in Europe, 76-80. · doi:10.1007/978-3-319-59758-4_8
[56] Schlitt, T. and A. Brazma (2007): “Current approaches to gene regulatory network modelling,” BMC Bioinformatics, 8, 6. · doi:10.1186/1471-2105-8-S6-S9
[57] Segal, E., H. Wang and D. Koller (2003): “Discovering molecular pathways from protein interaction and gene expression data,” Bioinformatics, 19, i264-i272.
[58] Spellman, P. T., G. Sherlock, M. Q. Zhang, V. R. Iyer, K. Anders, M. B. Eisen, P. O. Brown, D. Botstein and B. Futcher (1998): “Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization,” Mol. Biol. Cell, 9, 3273-3297.
[59] Spirtes, P., C. Glymour and R. Scheines (1993): Causation, Prediction, and Search, MIT Press, Cambridge, MA. · Zbl 0806.62001
[60] Spirtes, P., C. Glymour, R. Scheines, S. Kauffman, V. Aimale and F. Wimberly (2000): “Constructing Bayesian network models of gene expression networks from microarray data,” In: Proceedings of the Atlantic Symposium on Computational Biology, Genome Information Systems and Technology. · doi:10.1184/R1/6491291.v1
[61] Spirtes, P., C. Glymour and R. Scheines (2001): Causation, Prediction, and Search, 2nd Edition, MIT Press, Cambridge, MA. · Zbl 0981.62001
[62] Stacklies, W., H. Redestig, M. Scholz, D. Walther and J. Selbig (2007): “pcaMethods—a bioconductor package providing PCA methods for incomplete data,” Bioinformatics, 23, 1164-1167.
[63] Steele, E., A. Tucker, P. Hoen and M. Schuemie (2009): “Literature-based priors for gene regulatory networks,” Bioinformatics, 25, 1768-1774.
[64] Styczynski, M. P. and G. Stephanopoulos (2005): “Overview of computational methods for the inference of gene regulatory networks,” Comput. Chem. Eng., 29, 519-534.
[65] Tamada, Y., S. Kim, H. Bannai, S. Imoto, K. Tashiro, S. Kuhara and S. Miyano (2003a): “Combining gene expression data with DNA sequence information for estimating gene networks using Bayesian network model,” Genome Inform. Ser., 14, 352-353.
[66] Tamada, Y., S. Kim, H. Bannai, S. Imoto, K. Tashiro, S. Kuhara and S. Miyano (2003b): “Estimating gene networks from gene expression data by combining Bayesian network model with promoter element detection,” Bioinformatics, 19, 227-236.
[67] Wang, M., Z. Chen and S. Cloutier (2007): “A hybrid Bayesian network learning method for constructing gene networks,” Comput. Biol. Chem., 31, 361-372. · Zbl 1141.92319
[68] Wang, Y. R. and H. Huang (2014): “Review on statistical methods for gene network reconstruction using expression data,” J. Theor. Biol., 362, 53-61. · Zbl 1307.92099
[69] Werhli, A. V., M. Grzegorczyk and D. Husmeier (2006): “Comparative evaluation of reverse engineering gene regulatory networks with relevance networks, graphical Gaussian models and Bayesian networks,” Bioinformatics, 22, 2523-2531.
[70] Werhli, A. V. and D. Husmeier (2007a): “Reconstructing gene regulatory networks with Bayesian networks by combining expression data with multiple sources of prior knowledge,” Stat. Appl. Genet. Mo. B., 6, 1. · Zbl 1166.62373 · doi:10.2202/1544-6115.1282
[71] Werhli, A. V. and D. Husmeier (2007b): “Reverse engineering gene regulatory networks with Bayesian networks from expression data combined with multiple sources of biological prior knowledge,” Lect. N. Bioinformat., 49, 1-2.
[72] Wit, E. and J. McClure (2004): Statistics for microarrays, John Wiley & Sons, Chichester, UK. · Zbl 1049.62120
[73] Yu, J., V. A. Smith, P. P. Wang, A. J. Hartemink and E. D. Jarvis (2004): “Advances to Bayesian network inference for generating causal networks from observational biological data,” Bioinformatics, 20, 3594-3603.
[74] Zhou, H. and T. Zheng (2014): “Bayesian hierarchical graph-structured model for pathway analysis using gene expression data,” Stat. Appl. Genet. Mo. B., 12, 393-412.
[75] Zhu, J., B. Zhang, E. N. Smith, B. Drees, R. B. Brem, L. Kruglyak, R. E. Bumgarner and E. E. Schadt (2008): “Integrating large-scale functional genomic data to dissect the complexity of yeast regulatory networks,” Nat. Genet., 40, 854-861.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases these data have been complemented or enhanced by data from zbMATH Open. The list attempts to reflect the references in the original paper as accurately as possible, without claiming completeness or a perfect matching.