A 3D graphical representation of protein sequences based on the Gray code. (English) Zbl 1397.92528

Summary: Based on the order of the 6-bit binary Gray code, a cyclic order of the 20 amino acids is introduced. A novel 3D graphical representation of protein sequences is then proposed, modeled on the chaos game representation (CGR) of DNA sequences. Furthermore, a mathematical descriptor is suggested to characterize the resulting graphical representation curve. The efficiency of our approach is illustrated by comparing the similarities/dissimilarities among the ND5 protein sequences of nine different species. Using correlation and significance analysis, comparisons of both our results and the results of other graphical representations with the ClustalW results show the utility of our approach.
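
The two building blocks named in the summary can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it generates the standard binary reflected Gray code and the classical 2D chaos game representation of a DNA sequence. The function names (`gray_code`, `is_cyclic_gray`, `cgr_points`) and the corner assignment in `CORNERS` are choices made for this sketch. The paper's assignment of amino acids to Gray-code positions and its 3D construction are not given in the summary and are not reproduced here.

```python
def gray_code(bits):
    """Return the binary reflected Gray code sequence for the given bit width."""
    return [n ^ (n >> 1) for n in range(1 << bits)]


def is_cyclic_gray(codes):
    """Check that neighbouring codes, including last-to-first, differ in one bit."""
    shifted = codes[1:] + codes[:1]
    return all(bin(a ^ b).count("1") == 1 for a, b in zip(codes, shifted))


# Assumed corner convention for the DNA chaos game (a choice for this sketch).
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq, start=(0.5, 0.5)):
    """Classical 2D CGR: move halfway from the current point to each base's corner."""
    x, y = start
    points = []
    for base in seq:
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


codes = gray_code(6)
print([format(c, "06b") for c in codes[:4]])
print(is_cyclic_gray(codes))
print(cgr_points("ACGT"))
```

Because the last 6-bit code (100000) differs from the first (000000) in a single bit, the sequence closes into a cycle, which is the property a cyclic ordering relies on.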

MSC:

92D20 Protein sequences, DNA sequences

Software:

Clustal X; 2D-MH
Full Text: DOI

References:

[1] Bai, F.L.; Wang, T.M., On graphical and numerical representation of protein sequences, J. biomol. struct. dyn., 23, 537-545, (2006)
[2] Concu, R.; Dea-Ayuela, M.A.; Perez-Montoto, L.G.; Bolas-Fernández, F.; Prado-Prado, F.J.; Podda, G.; Uriarte, E.; Ubeira, F.M.; González-Díaz, H., Prediction of enzyme classes from 3D structure: a general model and examples of experimental-theoretic scoring of peptide mass fingerprints of Leishmania proteins, J. proteome res., 8, 4372-4382, (2009)
[3] el Maaty, M.I.A.; Abo-Elkhier, M.M.; Abd Elwahaab, M.A., 3D graphical representation of protein sequences and their statistical characterization, Physica A, 389, 4668-4676, (2010)
[4] Feng, J.; Wang, T.M., Characterization of protein primary sequences based on partial ordering, J. theor. biol., 254, 752-755, (2008) · Zbl 1400.92392
[5] González-Díaz, H.; Pérez-Montoto, L.G.; Duardo-Sanchez, A.; Paniagua, E.; Vázquez-Prieto, S.; Vilas, R.; Dea-Ayuela, M.A.; Bolas-Fernández, F.; Munteanu, C.R.; Dorado, J.; Costas, J.; Ubeira, F.M., Generalized lattice graphs for 2D-visualization of biological information, J. theor. biol., 261, 136-147, (2009) · Zbl 1403.92091
[6] He, P.A., A new graphical representation of similarity/dissimilarity studies of protein sequences, SAR QSAR environ. res., 21, 571-580, (2010)
[7] He, P.A.; Zhang, Y.P.; Yao, Y.H.; Tang, Y.F.; Nan, X.Y., The graphical representation of protein sequences based on the physicochemical properties and its applications, J. comput. chem., 31, 2136-2142, (2010)
[8] He, P.A.; Li, X.F.; Yang, J.L.; Wang, J., A novel descriptor for protein similarity analysis, MATCH commun. math. comput. chem., 65, 445-458, (2011)
[9] Jeffrey, H.J., Chaos game representation of gene structure, Nucleic acids res., 18, 2163-2170, (1990)
[10] Liao, B.; Liao, B.Y.; Sun, X.M.; Zeng, Q.G., A novel method for similarity analysis and protein sub-cellular localization prediction, Bioinformatics, 26, 2678-2683, (2010)
[11] Li, C.; Yu, X.Q.; Yang, L.; Zheng, X.Q.; Wang, Z.F., 3-D maps and coupling numbers for protein sequences, Physica A, 388, 1967-1972, (2009)
[12] Li, F.Q.; Huang, G.H.; Liao, B.; Liu, Z.B., H-L curve: a novel 2-D graphical representation of protein sequences, MATCH commun. math. comput. chem., 61, 519-532, (2009) · Zbl 1224.92004
[13] Larkin, M.A.; Blackshields, G.; Brown, N.P.; Chenna, R.; McGettigan, P.A.; McWilliam, H.; Valentin, F.; Wallace, I.M.; Wilm, A.; Lopez, R.; Thompson, J.D.; Gibson, T.J.; Higgins, D.G., Clustal W and Clustal X version 2.0, Bioinformatics, 23, 2947-2948, (2007)
[14] Nandy, A.; Harle, M.; Basak, S.C., Mathematical descriptors of DNA sequences: development and applications, Arkivoc, 9, 211-238, (2006)
[15] Randic, M.; Zupan, J.; Balaban, A.T.; Vikic-Topic, D.; Plavsic, D., Graphical representation of proteins, Chem. rev., 111, 790-862, (2011)
[16] Randic, M., 2-D graphical representation of proteins based on virtual genetic code, SAR QSAR environ. res., 15, 147-157, (2004)
[17] Randic, M.; Zupan, J.; Balaban, A.T., Unique graphical representation of protein sequences based on nucleotide triplet codons, Chem. phys. lett., 397, 247-252, (2004)
[18] Randic, M.; Butina, D.; Zupan, J., Novel 2-D graphical representation of proteins, Chem. phys. lett., 419, 528-532, (2006)
[19] Randic, M.; Balaban, A.T.; Novic, M.; Zaloznik, A.; Pisanski, T., A novel graphical representation of proteins, Period. biolog., 107, 403-414, (2005)
[20] Randic, M.; Vracko, M.; Novic, M.; Plavsic, D., Spectral representation of reduced protein models, SAR QSAR environ. res., 20, 415-427, (2009)
[21] Randic, M.; Mehulic, K.; Vukicevic, D.; Pisanski, T.; Vikic-Topic, D.; Plavsic, D., Graphical representation of proteins as four-color maps and their numerical characterization, J. mol. graph. model., 27, 637-641, (2009)
[22] Randic, M.; Zupan, J.; Vikic-Topic, D., On representation of proteins by star-like graphs, J. mol. graph. model., 26, 290-305, (2007)
[23] Randic, M., 2-D graphical representation of proteins based on physico-chemical properties of amino acids, Chem. phys. lett., 440, 291-295, (2007)
[24] Suparta, I.; Van Zanten, A., A construction of Gray codes inducing complete graphs, Discrete math., 308, 4124-4132, (2008) · Zbl 1144.05037
[25] Vinga, S.; Almeida, J., Alignment-free sequence comparison — a review, Bioinformatics, 19, 513-523, (2003)
[26] Wen, J.; Zhang, Y.Y., A 2D graphical representation of protein sequence and its numerical characterization, Chem. phys. lett., 476, 281-286, (2009)
[27] Wu, Z.C.; Xiao, X.; Chou, K.C., 2D-MH: a web-server for generating graphic representation of protein sequences based on the physicochemical properties of their constituent amino acids, J. theor. biol., 267, 29-34, (2010) · Zbl 1410.92089
[28] Xi, L.; Liao, B.; Zeng, Q.G.; Luo, J.W., Protein functional class prediction using global encoding of amino acid sequence, J. theor. biol., 261, 290-293, (2009) · Zbl 1403.92212
[29] Yao, Y.H.; Dai, Q.; Li, L.; Nan, X.Y.; He, P.A.; Zhang, Y.Z., Similarity/dissimilarity studies of protein sequences based on a new 2D graphical representation, J. comput. chem., 31, 1045-1052, (2010)
[30] Yao, Y.H.; Dai, Q.; Li, C.; He, P.A.; Nan, X.Y.; Zhang, Y.Z., Analysis of similarity/dissimilarity of protein sequences, Proteins, 73, 864-871, (2008)
[31] Yau, S.S.T.; Yu, C.L.; He, R., A protein map and its application, DNA cell biol., 27, 241-250, (2008)
[32] Zhang, L.; Liao, B.; Li, D.C.; Zhu, W., A novel representation for apoptosis protein subcellular localization prediction using support vector machine, J. theor. biol., 259, 361-365, (2009) · Zbl 1402.92163