UniProt: the universal protein knowledgebase
- PMID: 27899622
- PMCID: PMC5210571
- DOI: 10.1093/nar/gkw1099
Erratum in
- UniProt: the universal protein knowledgebase. Nucleic Acids Res. 2018 Mar 16;46(5):2699. doi: 10.1093/nar/gky092. PMID: 29425356.
Abstract
The UniProt knowledgebase is a large resource of protein sequences and associated detailed annotation. The database contains over 60 million sequences, of which over half a million sequences have been curated by experts who critically review experimental and predicted data for each protein. The remainder are automatically annotated based on rule systems that rely on the expert curated knowledge. Since our last update in 2014, we have more than doubled the number of reference proteomes to 5631, giving a greater coverage of taxonomic diversity. We implemented a pipeline to remove redundant highly similar proteomes that were causing excessive redundancy in UniProt. The initial run of this pipeline reduced the number of sequences in UniProt by 47 million. For our users interested in the accessory proteomes, we have made available sets of pan proteome sequences that cover the diversity of sequences for each species that is found in its strains and sub-strains. To help interpretation of genomic variants, we provide tracks of detailed protein information for the major genome browsers. We provide a SPARQL endpoint that allows complex queries of the more than 22 billion triples of data in UniProt (http://sparql.uniprot.org/). UniProt resources can be accessed via the website at http://www.uniprot.org/.
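As a rough illustration of what querying the SPARQL endpoint mentioned above might look like, the sketch below only builds a request URL and does not contact the service. The endpoint string is taken from the abstract as written, and the live service may expect a different path. The `query` parameter name follows the SPARQL 1.1 Protocol. The function name `build_sparql_url` and the example query are illustrative assumptions, not part of UniProt's documentation.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Endpoint as given in the abstract; the live service may expect a different path.
ENDPOINT = "http://sparql.uniprot.org/"

# Illustrative query (an assumption, not from the paper): list a few
# resources typed as up:Protein in the UniProt core vocabulary.
QUERY = """\
PREFIX up: <http://purl.uniprot.org/core/>
SELECT ?protein
WHERE { ?protein a up:Protein . }
LIMIT 5
"""


def build_sparql_url(endpoint: str, query: str) -> str:
    """Return a GET URL that carries the query in the `query` parameter."""
    return f"{endpoint}?{urlencode({'query': query})}"


url = build_sparql_url(ENDPOINT, QUERY)
print(url)
```

The resulting URL could then be passed to any HTTP client, typically with an `Accept` header requesting a SPARQL results format. Keeping a `LIMIT` clause in exploratory queries is a sensible precaution given the size of the dataset.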
© The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.