A CNN-CBAM-BiGRU model for protein function prediction. (English) Zbl 1542.92095

MSC:

92D20 Protein sequences, DNA sequences
68T07 Artificial neural networks and deep learning
Full Text: DOI

References:

[1] Bepler, T. and Berger, B. (2021). Learning the protein language: evolution, structure, and function. Cell Syst. 12: 654-669. doi:10.1016/j.cels.2021.05.017. · doi:10.1016/j.cels.2021.05.017
[2] Bonetta, R. and Valentino, G. (2020). Machine learning techniques for protein function prediction. Proteins 88: 397-413. doi:10.1002/prot.25832. · doi:10.1002/prot.25832
[3] Branden, C.I. and Tooze, J. (2012). Introduction to protein structure. Garland Sci. 1-414.
[4] Cai, Y., Wang, J., and Deng, L. (2020). SDN2GO: an integrated deep learning model for protein function prediction. Front. Bioeng. Biotechnol. 8: 391. doi:10.3389/fbioe.2020.00391. · doi:10.3389/fbioe.2020.00391
[5] Cho, K., van Merrienboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., and Bengio, Y. (2014). Learning phrase representations using RNN encoder-decoder for statistical machine translation. In: Proceedings of the 2014 conference on empirical methods in natural language processing. Association for Computational Linguistics.
[6] Dallago, C., Mou, J., Johnston, K.E., Wittmann, B., Bhattacharya, N., Goldman, S., Madani, A., and Yang, K.K. (2021). FLIP: benchmark tasks in fitness landscape inference for proteins. Adv. Neural Inf. Process. Syst. 1.
[7] Dinler, O.B. and Aydin, N. (2020). An optimal feature parameter set based on gated recurrent unit recurrent neural networks for speech segment detection. Appl. Sci. 10: 1273. doi:10.3390/app10041273. · doi:10.3390/app10041273
[8] Elnaggar, A., Heinzinger, M., Dallago, C., Rihawi, G., Wang, Y., Jones, L., Gibbs, T., Feher, T., Angerer, C., Steinegger, M., et al. (2021). ProtTrans: towards cracking the language of life's code through self-supervised deep learning and high performance computing. IEEE Trans. Pattern Anal. Mach. Intell. 44: 7112-7127. doi:10.1109/tpami.2021.3095381. · doi:10.1109/tpami.2021.3095381
[9] Fan, K., Guan, Y., and Zhang, Y. (2020). Graph2GO: a multi-modal attributed network embedding method for inferring protein functions. GigaScience 9: 1-11. doi:10.1093/gigascience/giaa081. · doi:10.1093/gigascience/giaa081
[10] Fang, W., Love, P.E., Luo, H., and Ding, L. (2020). Computer vision for behaviour-based safety in construction: a review and future directions. Adv. Eng. Inf. 43: 100980. doi:10.1016/j.aei.2019.100980. · doi:10.1016/j.aei.2019.100980
[11] Gers, F.A., Schmidhuber, J., and Cummins, F. (2000). Learning to forget: continual prediction with LSTM. Neural Comput. 12: 2451-2471. doi:10.1162/089976600300015015. · doi:10.1162/089976600300015015
[12] Giri, S.J., Dutta, P., Halan, P., and Saha, S. (2020). MultiPredGO: deep multi-modal protein function prediction by amalgamating protein structure, sequence, and interaction information. IEEE J. Biomed. Health Inform. 25: 1832-1838.
[13] Gligorijevic, V., Renfrew, P.D., Kosciolek, T., Leman, J.K., Berenberg, D., Vatanen, T., Chandler, C., Taylor, B.C., Fisk, I.M., Vlamakis, H., et al. (2021). Structure-based protein function prediction using graph convolutional networks. Nat. Commun. 12: 3168. doi:10.1038/s41467-021-23303-9. · doi:10.1038/s41467-021-23303-9
[14] Heinzinger, M., Elnaggar, A., Wang, Y., Dallago, C., Nechaev, D., Matthes, F., and Rost, B. (2019). Modeling aspects of the language of life through transfer-learning protein sequences. BMC Bioinform. 20: 723. doi:10.1186/s12859-019-3220-8. · doi:10.1186/s12859-019-3220-8
[15] Hewamalage, H., Bergmeir, C., and Bandara, K. (2020). Recurrent neural networks for time series forecasting: current status and future directions. Int. J. Forecast. 37: 388-427. doi:10.1016/j.ijforecast.2020.06.008. · doi:10.1016/j.ijforecast.2020.06.008
[16] Huang, K., Fu, T., Glass, L.M., Zitnik, M., Xiao, C., and Sun, J. (2020). DeepPurpose: a deep learning library for drug-target interaction prediction. Bioinformatics 36: 5545-5547. doi:10.1093/bioinformatics/btaa1005. · doi:10.1093/bioinformatics/btaa1005
[17] Hunter, S., Apweiler, R., Attwood, T.K., Bairoch, A., Bateman, A., Binns, D., Bork, P., Das, U., Daugherty, L., Duquenne, L., et al. (2009). InterPro: the integrative protein signature database. Nucleic Acids Res. 37: D211-D215. doi:10.1093/nar/gkn785. · doi:10.1093/nar/gkn785
[18] Jagannatha, A.N. and Yu, H. (2016). Structured prediction models for RNN based sequence labeling in clinical text. In: Proceedings of the conference on empirical methods in natural language processing. Conference on empirical methods in natural language processing, Vol. 2016. NIH Public Access, p. 856.
[19] Jiang, Y., Oron, T.R., Clark, W.T., Bankapur, A.R., D’Andrea, D., Lepore, R., Funk, C.S., Kahanda, I., Verspoor, K.M., Ben-Hur, A., et al. (2016). An expanded evaluation of protein function prediction methods shows an improvement in accuracy. Genome Biol. 17: 184. doi:10.1186/s13059-016-1037-6. · doi:10.1186/s13059-016-1037-6
[20] Jones, S. and Thornton, J.M. (1996). Principles of protein-protein interactions. Proc. Natl. Acad. Sci. U. S. A. 93: 13-20. doi:10.1073/pnas.93.1.13. · doi:10.1073/pnas.93.1.13
[21] Kabir, A. and Shehu, A. (2022). GOProFormer: a multi-modal transformer method for gene ontology protein function prediction. Biomolecules 12: 1709.
[22] Kaleel, M., Zheng, Y., Chen, J., Feng, X., Simpson, J.C., Pollastri, G., and Mooney, C. (2020). SCLpred-EMS: subcellular localization prediction of endomembrane system and secretory pathway proteins by deep N-to-1 convolutional neural networks. Bioinformatics 36: 3343-3349. doi:10.1093/bioinformatics/btaa156. · doi:10.1093/bioinformatics/btaa156
[23] Kingma, D.P. and Ba, J. (2014). Adam: a method for stochastic optimization, arXiv preprint arXiv:1412.6980.
[24] Kulmanov, M., Khan, M.A., and Hoehndorf, R. (2018). DeepGO: predicting protein functions from sequence and interactions using a deep ontology-aware classifier. Bioinformatics 34: 660-668. doi:10.1093/bioinformatics/btx624. · doi:10.1093/bioinformatics/btx624
[25] Kulmanov, M., Zhapa-Camacho, F., and Hoehndorf, R. (2021). DeepGOWeb: fast and accurate protein function prediction on the (semantic) web. Nucleic Acids Res. 49: W140-W146. doi:10.1093/nar/gkab373. · doi:10.1093/nar/gkab373
[26] LeCun, Y., Boser, B., Denker, J.S., Henderson, D., Howard, R.E., Hubbard, W., and Jackel, L.D. (1989). Handwritten digit recognition with a back-propagation network. In: Proceedings of the advances in neural information processing systems (NIPS), pp. 396-404.
[27] Li, Y., Wang, S., Tian, Q., and Ding, X. (2015). Feature representation for statistical-learning-based object detection: a review. Pattern Recognit. 48: 3542-3559. doi:10.1016/j.patcog.2015.04.018. · doi:10.1016/j.patcog.2015.04.018
[28] Lopes, A.T., de Aguiar, E., De Souza, A.F., and Oliveira-Santos, T. (2017). Facial expression recognition with convolutional neural networks: coping with few data and the training sample order. Pattern Recognit. 61: 610-628. doi:10.1016/j.patcog.2016.07.026. · doi:10.1016/j.patcog.2016.07.026
[29] Ma, B., Li, X., Xia, Y., and Zhang, Y. (2020). Autonomous deep learning: a genetic DCNN designer for image classification. Neurocomputing 379: 152-161. doi:10.1016/j.neucom.2019.10.007. · doi:10.1016/j.neucom.2019.10.007
[30] Meier, J., Rao, R., Verkuil, R., Liu, J., Sercu, T., and Rives, A. (2021). Language models enable zero-shot prediction of the effects of mutations on protein function. Adv. Neural Inf. Process. Syst. 34: 29287-29303.
[31] Nogueira, K., Penatti, O.A., and dos Santos, J.A. (2017). Towards better exploiting convolutional neural networks for remote sensing scene classification. Pattern Recognit. 61: 539-556. doi:10.1016/j.patcog.2016.07.001. · doi:10.1016/j.patcog.2016.07.001
[32] Piovesan, D. and Tosatto, S.C.E. (2019). INGA 2.0: improving protein function prediction for the dark proteome. Nucleic Acids Res. 47: W373-W378. doi:10.1093/nar/gkz375. · doi:10.1093/nar/gkz375
[33] Qiu, X.-Y., Wu, H., and Shao, J. (2022). TALE-cmap: protein function prediction based on a TALE-based architecture and the structure information from contact map. Comput. Biol. Med. 149: 105938. doi:10.1016/j.compbiomed.2022.105938. · doi:10.1016/j.compbiomed.2022.105938
[34] Ranjan, A., Fahad, M.S., Fernandez-Baca, D., Deepak, A., and Tripathi, S. (2019). Deep robust framework for protein function prediction using variable-length protein sequences. IEEE/ACM Trans. Comput. Biol. Bioinf. 17: 1648-1659. doi:10.1109/tcbb.2019.2911609. · doi:10.1109/tcbb.2019.2911609
[35] Ranjan, A., Tiwari, A., and Deepak, A. (2023). A sub-sequence based approach to protein function prediction via multi-attention based multi-aspect network. Vol. 20, Issue 1, pp. 94-105.
[36] Rives, A., Meier, J., Sercu, T., Goyal, S., Lin, Z., Liu, J., Guo, D., Ott, M., Zitnick, C.L., Ma, J., et al. (2021). Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences. Proc. Natl. Acad. Sci. U. S. A. 118: 1-12. doi:10.1073/pnas.2016239118. · doi:10.1073/pnas.2016239118
[37] Sharma, L., Deepak, A., Ranjan, A., and Krishnasamy, G. (2023). A novel hybrid CNN and BiGRU-Attention based deep learning model for protein function prediction. Stat. Appl. Genet. Mol. Biol. 22: 20220057. doi:10.1515/sagmb-2022-0057. · Zbl 1530.92066 · doi:10.1515/sagmb-2022-0057
[38] Smaili, F.Z., Tian, S., Roy, A., Alazmi, M., Arold, S.T., Mukherjee, S., Hefty, P.S., Chen, W., and Gao, X. (2021). QAUST: protein function prediction using structure similarity, protein interaction, and functional motifs. Genomics Proteomics Bioinformatics 19: 998-1011. doi:10.1016/j.gpb.2021.02.001. · doi:10.1016/j.gpb.2021.02.001
[39] Sønderby, S.K. and Winther, O. (2014). Protein secondary structure prediction with long short term memory networks, arXiv preprint arXiv:1412.7828.
[40] Szklarczyk, D., Franceschini, A., Wyder, S., Forslund, K., Heller, D., Huerta-Cepas, J., Simonovic, M., Roth, A., Santos, A., Tsafou, K.P., et al. (2015). STRING v10: protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res. 43: D447-D452. doi:10.1093/nar/gku1003. · doi:10.1093/nar/gku1003
[41] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., and Polosukhin, I. (2017). Attention is all you need. Adv. Neural Inf. Process. Syst. 30: 6000-6010.
[42] Visin, F., Kastner, K., Courville, A., Bengio, Y., Matteucci, M., and Cho, K. (2015). ReSeg: a recurrent neural network for object segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR) workshops.
[43] Widiastuti, N.I. (2019). Convolution neural network for text mining and natural language processing. IOP Conf. Ser. Mater. Sci. Eng. 662: 052010. doi:10.1088/1757-899X/662/5/052010. · doi:10.1088/1757-899X/662/5/052010
[44] Woo, S., Park, J., Lee, J.Y., and So Kweon, I. (2018). CBAM: convolutional block attention module. In: Proceedings of the European conference on computer vision. ECCV, pp. 3-19.
[45] You, R., Yao, S., Xiong, Y., Huang, X., Sun, F., Mamitsuka, H., and Zhu, S. (2019). NetGO: improving large-scale protein function prediction with massive network information. Nucleic Acids Res. 47: W379-W387. doi:10.1093/nar/gkz388. · doi:10.1093/nar/gkz388
[46] Zhang, H., Ju, F., Zhu, J., He, L., Shao, B., Zheng, N., and Liu, T.-Y. (2021). Co-evolution transformer for protein contact prediction. Adv. Neural Inf. Process. Syst. 34: 14252-14263.
[47] Zhou, Y., Zhang, Y., Lian, X., Li, F., Wang, C., Zhu, F., Qiu, Y., and Chen, Y. (2022). Therapeutic target database update 2022: facilitating drug discovery with enriched comparative data of targeted agents. Nucleic Acids Res. 50: D1398-D1407. doi:10.1093/nar/gkab953. · doi:10.1093/nar/gkab953
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.