Loop-based conic multivariate adaptive regression splines is a novel method for advanced construction of complex biological networks. (English) Zbl 1403.92089

Summary: The Gaussian graphical model (GGM) and its Bayesian alternative, the Gaussian copula graphical model (GCGM), are two widely used approaches to constructing the undirected networks of biological systems. They define the interactions between species through conditional dependencies under the multivariate normality assumption. However, when the system's dimension is high, fitting these models becomes computationally demanding, and, in particular, the accuracy of GGM decreases when the observations are far from normality. Here, we suggest conic multivariate adaptive regression splines (CMARS) as an alternative to GGM and GCGM to ameliorate both problems. CMARS is a modified version of multivariate adaptive regression splines, a well-known modeling approach used in operational research (OR) to represent biological, environmental, and economic data. The main benefit of this model is its compatibility with high-dimensional, correlated, and strongly nonlinear measurements, which allows for a wide field of application. We adapted CMARS to describe biological systems and called it "LCMARS" due to its loop-based description. We then applied LCMARS to simulated and real datasets, where LCMARS gave more accurate results than GGM and GCGM. The ability to use LCMARS in the description of biological networks therefore has the potential to open up new avenues in the application of OR to computational biology and bioinformatics, and can thus help to better understand complex diseases like cancer and hepatitis.
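
As a rough illustration of the conditional-dependence idea behind GGM mentioned in the summary, the following minimal sketch reads an undirected network off the inverse sample covariance (precision) matrix. It is not the LCMARS method and is not taken from the paper. The function names, the toy three-species chain, and the 0.1 threshold are all assumptions chosen for illustration.

```python
import numpy as np


def partial_correlations(data):
    """Partial correlation matrix computed from the inverse sample covariance."""
    cov = np.cov(data, rowvar=False)      # columns are species, rows are observations
    precision = np.linalg.inv(cov)        # precision matrix (assumes cov is invertible)
    d = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(d, d)    # rho_ij = -K_ij / sqrt(K_ii * K_jj)
    np.fill_diagonal(pcor, 1.0)
    return pcor


def ggm_edges(data, threshold=0.1):
    """Keep an edge (i, j) when |partial correlation| exceeds a hypothetical threshold."""
    pcor = partial_correlations(data)
    p = pcor.shape[0]
    return [(i, j) for i in range(p) for j in range(i + 1, p)
            if abs(pcor[i, j]) > threshold]


# Toy Gaussian chain (illustrative data only): species 0 drives 1, and 1 drives 2.
rng = np.random.default_rng(0)
n = 5000
x0 = rng.normal(size=n)
x1 = 0.8 * x0 + rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(size=n)
data = np.column_stack([x0, x1, x2])

edges = ggm_edges(data, threshold=0.1)
print(edges)
```

In this toy chain, species 0 and 2 are correlated but conditionally independent given species 1, so their partial correlation is close to zero and no direct edge is drawn between them. Plain matrix inversion as used here breaks down when the number of species approaches or exceeds the number of observations, which is one face of the high-dimensional difficulty the summary refers to.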

MSC:

92C42 Systems biology, networks
62P10 Applications of statistics to biology and medical sciences; meta analysis
62G08 Nonparametric regression and quantile regression
62H99 Multivariate analysis
90B15 Stochastic network models in operations research
