Porpoise: a new approach for accurate prediction of RNA pseudouridine sites
- PMID: 34226915
- PMCID: PMC8575008
- DOI: 10.1093/bib/bbab245
Abstract
Pseudouridine is a ubiquitous RNA modification found in both eukaryotes and prokaryotes that plays a vital role in various biological processes. Almost all kinds of RNA are subject to this modification. However, identifying pseudouridine sites experimentally remains a great challenge because it requires expensive and time-consuming experiments. Computational approaches that can accurately identify pseudouridine sites in silico from the large amount of available RNA sequence data are therefore highly desirable and can aid in the functional elucidation of this critical modification. Here, we propose a new computational approach, termed Porpoise, to accurately identify pseudouridine sites from RNA sequence data. Porpoise builds upon a comprehensive evaluation of 18 frequently used feature encoding schemes, from which four types of features were selected: binary features, pseudo k-tuple composition, nucleotide chemical property and position-specific trinucleotide propensity based on single-strand (PSTNPss). The selected features are fed into a stacked ensemble learning framework to construct an effective stacked model. Both cross-validation tests on the benchmark dataset and independent tests show that Porpoise achieves better predictive performance than several state-of-the-art approaches. The application of model interpretation tools demonstrates the importance of PSTNPss for the performance of the trained models. This new method is anticipated to facilitate community-wide efforts to identify putative pseudouridine sites and formulate novel testable biological hypotheses.
Keywords: RNA pseudouridine site; bioinformatics; machine learning; sequence analysis; stacking ensemble learning.
© The Author(s) 2021. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
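As a rough illustration of two of the four feature types named in the abstract, the sketch below encodes a short RNA window with binary (one-hot) features and nucleotide chemical property (NCP) features. The lookup tables follow conventions commonly used in sequence-based predictors (NCP bits for ring structure, functional group and hydrogen-bond strength); the abstract does not specify Porpoise's exact mappings, so the tables, the function names `encode` and `encode_window`, and the example window are all assumptions made for illustration.

```python
# Illustrative sketch only: the mappings below are common conventions and are
# NOT confirmed details of the Porpoise implementation.

# Binary (one-hot) encoding: one position per nucleotide (order is an assumption).
BINARY = {
    "A": (1, 0, 0, 0),
    "C": (0, 1, 0, 0),
    "G": (0, 0, 1, 0),
    "U": (0, 0, 0, 1),
}

# Nucleotide chemical property encoding (assumed convention):
# (purine ring, amino group, weak hydrogen bonding)
NCP = {
    "A": (1, 1, 1),
    "C": (0, 1, 0),
    "G": (1, 0, 0),
    "U": (0, 0, 1),
}


def encode(seq, table):
    """Concatenate the per-nucleotide vectors from `table` for every base in `seq`."""
    seq = seq.upper().replace("T", "U")  # accept DNA-style input as well
    features = []
    for base in seq:
        if base not in table:
            raise ValueError(f"unsupported nucleotide: {base!r}")
        features.extend(table[base])
    return features


def encode_window(seq):
    """Combine binary and NCP features into one flat feature vector."""
    return encode(seq, BINARY) + encode(seq, NCP)


# Hypothetical 5-nt window centred on a candidate uridine.
window = "GAUCU"
features = encode_window(window)
print(len(features))  # 5 * 4 binary values + 5 * 3 NCP values
```

A vector built this way could then be passed to the base learners of a stacked ensemble; the choice of learners and of the meta-classifier is not described in the abstract and is left out here.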