×

Discovering gene association networks by multi-objective evolutionary quantitative association rules. (English) Zbl 1311.68140

Summary: In the last decade, the interest in microarray technology has exponentially increased due to its ability to monitor the expression of thousands of genes simultaneously. The reconstruction of gene association networks from gene expression profiles is a relevant task and several statistical techniques have been proposed to build them. The problem lies in the process to discover which genes are more relevant and to identify the direct regulatory relationships among them. We developed a multi-objective evolutionary algorithm for mining quantitative association rules to deal with this problem. We applied our methodology named GarNet to a well-known microarray data of yeast cell cycle. The performance analysis of GarNet was organized in three steps similarly to the study performed by Gallo et al. GarNet outperformed the benchmark methods in most cases in terms of quality metrics of the networks, such as accuracy and precision, which were measured using YeastNet database as true network. Furthermore, the results were consistent with previous biological knowledge.

MSC:

68T05 Learning and adaptive systems in artificial intelligence
92D10 Genetics and epigenetics
Full Text: DOI

References:

[1] Brown, P.; Botstein, D., Exploring the new world of the genome with DNA microarrays, Nat. Genet., 21, 33-37 (1999)
[2] Bansal, M.; Belcastro, V.; Ambesi-Impiombato, A.; di Bernardo, D., How to infer gene networks from expression profiles, Mol. Syst. Biol., 3 (2007)
[3] Nepomuceno-Chamorro, I. A.; Aguilar-Ruiz, J. S.; Riquelme, J. C., Inferring gene regression networks with model trees, BMC Bioinformatics, 11, 517-525 (2010)
[4] del Jesús, M. J.; Gámez, J. A.; González, P.; Puerta, J. M., On the discovery of association rules by means of evolutionary algorithms, Wiley Interdiscip. Rev.: Data Min. Knowl. Discov., 1, 397-415 (2011)
[5] Deb, K.; Pratap, A.; Agarwal, S.; Meyarivan, T., A fast and elitist multiobjective genetic algorithm: Nsga-ii, IEEE Trans. Evol. Comput., 6, 182-197 (2002)
[6] Zitzler, E.; Laumanns, M.; Thiele, L., Spea2: improving the strength Pareto evolutionary algorithm, EUROGEN, 3242, 95-100 (2001)
[7] Gacto, M. J.; Alcalá, R.; Herrera, F., Adaptation and application of multi-objective evolutionary algorithms for rule reduction and parameter tuning of fuzzy rule-based systems, Soft Comput., 13, 419-436 (2009)
[8] Du, F.; Rao, N.; Guo, J.; Yuan, Z.; Wang, R., Mining gene network by combined association rules and genetic algorithm, (Liao, W.; Fang, Y.; Zhu, C.; Au, O. C., Proceedings of the Int. Conf. on Communications, Circuits and Systems (2009), IEEE Computer Society: IEEE Computer Society Washington, DC)
[9] Soinov, L.; Krestyaninova, M.; Brazma, A., Towards reconstruction of gene networks from expression data by supervised learning, Genome Biol., 4, R6 (2003)
[10] Bulashevska, S.; Eils, R., Inferring genetic regulatory logic from expression data, Bioinformatics, 21, 2706-2713 (2005)
[11] Ponzoni, I.; Azuaje, F. A.; Juan Glass, D., Inferring adaptive regulation thresholds and association rules from gene expression data through combinatorial optimization learning, IEEE/ACM Trans. Comput. Biol. Bioinform., 4, 624-634 (2007)
[12] Gallo, C. A.; Carballido, J. A.; Ponzoni, I., Discovering time-lagged rules from microarray data using gene profile classifiers, BMC Bioinformatics, 12, 123-131 (2011)
[13] Spellman, P.; Sherlock, G.; Zhang, M.; Iyer, V.; Anders, K.; Eisen, M.; Brown, P.; Botstein, D.; Futcher, B., Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization, Mol. Biol. Cell., 9, 3273-3297 (1998)
[16] Zhou, X.; Kao, M.; Wong, W., From the cover: transitive functional annotation by shortest-path analysis of gene expression data, Proc. Natl. Acad. Sci., 99, 12783-12788 (2002)
[17] Lee, H. K.; Hsu, A. K.; Sajdak, J.; Qin, J.; Pavlidis, P., Coexpression analysis of human genes across many microarray data sets, Genome Res., 14, 1085-1094 (2004)
[18] Wolfe, C.; Kohane, I.; Butte, A., Systematic survey reveals general applicability of “guilt-by-association” within gene coexpression networks, BMC Bioinformatics, 6, 227 (2005)
[19] Obayashi, T.; Kinoshita, K., Rank of correlation coefficient as a comparable measure for biological significance of gene coexpression, DNA Res., 16, 249-260 (2009)
[20] de la Fuente, A.; Bing, N.; Hoeschele, I.; Mendes, P., Discovery of meaningful associations in genomic data using partial correlation coefficients, Bioinformatics, 20, 3565-3574 (2004)
[21] Markowetz, F.; Spang, R., Inferring cellular networks-a review, BMC Bioinformatics, 8, S5 (2007)
[22] Margolin, A.; Nemenman, I.; Basso, K.; Wiggins, C.; Stolovitzky, G.; Favera, R.; Califano, A., Aracne: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context, BMC Bioinformatics, 7, S7 (2006)
[23] Borgelt, C., A conditional independence algorithm for learning undirected graphical models, J. Comput. System Sci., 76, 21-33 (2010) · Zbl 1186.68356
[25] Kaya, M.; Alhajj, R., Utilizing genetic algorithms to optimize membership functions for fuzzy weighted association rules mining, Appl. Intell., 24, 7-152 (2006)
[26] Alatas, B.; Akin, E., An efficient genetic algorithm for automated mining of both positive and negative quantitative association rules, Soft Comput., 10, 230-237 (2006)
[27] Han, J.; Kamber, M., Data Mining: Concepts and Techniques (2006), Morgan Kaufmann · Zbl 1445.68004
[29] Kuo, R. J.; Lin, S. Y.; Shih, C. W., Mining association rules through integration of clustering analysis and ant colony system for health insurance database in Taiwan, Expert Syst. Appl., 33, 794-808 (2007)
[30] Mangat, V., A novel hybrid framework using evolutionary computing and swarm intelligence for rule mining in the medical domain, IJCA Proceedings on International Conference on Recent Advances and Future Trends in Information Technology (iRAFIT 2012). IJCA Proceedings on International Conference on Recent Advances and Future Trends in Information Technology (iRAFIT 2012), iRAFIT, 6, 7-13 (2012)
[31] Rameshkumar, K., Extracting association rules from HIV infected patientsʼ treatment dataset, Trends in Bioinformatics, 4, 35-46 (2011)
[32] Steinbrecher, M.; Kruse, R., Visualizing and fuzzy filtering for discovering temporal trajectories of association rules, J. Comput. System Sci., 76, 77-87 (2010) · Zbl 1186.68168
[33] Geng, L.; Hamilton, H. J., Interestingness measures for data mining: a survey, ACM Comput. Surv., 38, 9 (2006)
[37] Fukuda, T.; Morimoto, Y.; Morishita, S.; Tokuyama, T., Mining optimized association rules for numeric attributes, J. Comput. System Sci., 58, 1-12 (1999) · Zbl 0943.68050
[39] Alatas, B.; Akin, E., Rough particle swarm optimization and its applications in data mining, Soft Comput., 12, 1205-1218 (2008) · Zbl 1141.68551
[40] Yin, Y.; Zhong, Z.; Wang, Y., Mining quantitative association rules by interval clustering, J. Comput. Inf. Syst., 4, 609-616 (2008)
[41] Goldberg, E. D., Genetic Algorithms in Search, Optimization, and Machine Learning (1989), Addison-Wesley Publishing Company · Zbl 0721.68056
[42] Pachón Álvarez, V.; Mata Vázquez, J., An evolutionary algorithm to discover quantitative association rules from huge databases without the need for an a priori discretization, Expert Syst. Appl., 39, 585-593 (2012)
[43] Yan, X.; Zhang, C.; Zhang, S., Genetic algorithm-based strategy for identifying association rules without specifying actual minimum support, Expert Syst. Appl., 36, 3066-3076 (2009)
[44] Luna, J. M.; Romero, J. R.; Ventura, S., Design and behavior study of a grammar-guided genetic programming algorithm for mining association rules, Knowl. Inf. Syst., 32, 53-76 (2012)
[45] Romero, C.; Zafra, A.; Luna, J. M.; Ventura, S., Association rule mining using genetic programming to provide feedback to instructors from multiple-choice quiz data, Expert Syst. (2012)
[46] Alcalá-Fdez, J.; Flugy-Pape, N.; Bonarini, A.; Herrera, F., Analysis of the effectiveness of the genetic algorithms based on extraction of association rules, Fund. Inform., 98, 1001-1014 (2010)
[47] Deb, K., Multi-Objective Optimization Using Evolutionary Algorithms (2001), John Wiley & Sons, Inc. · Zbl 0970.90091
[48] del Jesús, M. J.; Gámez, J. A.; González, P.; Puerta, J. M., On the discovery of association rules by means of evolutionary algorithms, Wiley Interdiscip. Rev.: Data Min. Knowl. Discov., 1, 397-415 (2011)
[49] Alatas, B.; Akin, E.; Karci, A., MODENAR: multi-objective differential evolution algorithm for mining numeric association rules, Appl. Soft Comput., 8, 646-656 (2008)
[50] Qodmanan, H. R.; Nasiri, M.; Minaei-Bidgoli, B., Multi objective association rule mining with genetic algorithm without specifying minimum support and minimum confidence, Expert Syst. Appl., 38, 288-298 (2011)
[51] Martínez-Ballesteros, M.; Martínez-Álvarez, F.; Troncoso, A.; Riquelme, J. C., An evolutionary algorithm to discover quantitative association rules in multidimensional time series, Soft Comput., 15, 2065-2084 (2011)
[52] Martínez-Ballesteros, M.; Martínez-Álvarez, F.; Troncoso, A.; Riquelme, J. C., Mining quantitative association rules based on evolutionary computation and its application to atmospheric pollution, Integr. Comput.-Aided Eng., 17, 227-242 (2010)
[53] Martínez-Ballesteros, M.; Salcedo-Sanz, S.; Riquelme, J. C.; Casanova-Mateo, C.; Camacho, J. L., Evolutionary association rules for total ozone content modeling from satellite observations, Chemom. Intell. Lab. Syst., 109, 217-227 (2011)
[54] Miller, B. L.; Goldberg, D., Genetic algorithms, tournament selection, and the effects of noise, Complex Systems, 9, 193-212 (1995)
[56] Lee, I.; Li, Z.; Marcotte, E. M., An improved, bias-reduced probabilistic functional gene network of bakerʼs yeast, Saccharomyces cerevisiae, PLoS ONE, 2, e988 (2007)
[57] Cho, R. J.; Campbell, M. J.; Winzeler, E. A.; Steinmetz, L.; Conway, A.; Wodicka, L.; Wolfsberg, T. G.; Gabrielian, A. E.; Landsman, D.; Lockhart, D. J.; Davis, R. W., A genome-wide transcriptional analysis of the mitotic cell cycle, Mol. Cell., 2, 65-73 (1998)
[59] Dwight, S. S.; Harris, M. A.; Dolinski, K.; Ball, C. A.; Binkley, G.; Christie, K. R.; Fisk, D. G.; Issel-Tarver, L. F.; Schroeder, M.; Sherlock, G.; Sethuraman, A.; Weng, S.; Botstein, D.; Cherry, J. M., Saccharomyces genome database (SGD) provides secondary gene annotation using the gene ontology (GO), Nucleic Acids Res., 30, 69-72 (2002)
[60] Lee, I.; Date, S. V.; Adai, A. T.; Marcotte, E. M., A probabilistic functional network of yeast genes, Science, 306, 1555-1558 (2004)
[61] Berriz, G.; King, O.; Bryant, B.; Sander, C.; Roth, F., Characterizing gene sets with FuncAssociate, Bioinformatics, 19, 2502-2504 (2003)
[62] Chen, K.; Csikasz-Nagy, A.; Gyorffy, B.; Val, J.; Novak, B.; Tyson, J., Kinetic analysis of a molecular model of the budding yeast cell cycle, Mol. Biol. Cell., 11, 369-391 (2000)
[63] Althoefer, H.; Schleiffer, A.; Wassmann, K.; Nordheim, A.; Ammerer, G., Mcm1 is required to coordinate G2-specific transcription in Saccharomyces cerevisiae, Mol. Cell. Biol., 15, 5917-5928 (1995)
[64] Loy, C. J.; Lydall, D.; Surana, U., Ndd1, a high-dosage suppressor of cdc28-1N, is essential for expression of a subset of late-S-phase-specific genes in Saccharomyces cerevisiae, Mol. Cell. Biol., 19, 3312-3327 (1999)
[65] Toyn, J.; Johnson, A.; Donovan, J.; Toone, W.; Johnston, L., The Swi5 transcription factor of Saccharomyces cerevisiae has a role in exit from mitosis through induction of the cdk-inhibitor Sic1 in telophase, Genetics, 145, 85-96 (1997)
[66] Epstein, C. B.; Cross, F. R., Clb5: a novel B cyclin from budding yeast with a role in S phase, Genes Dev., 6, 1695-1706 (1992)
[67] Levine, K.; Huang, K.; Cross, F. R., Saccharomyces cerevisiae G1 cyclins differ in their intrinsic functional specificities, Mol. Cell. Biol., 16, 6794-6803 (1996)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.