UniProt-GOA Dec 5-March 3

From GO Wiki
Revision as of 06:45, 4 March 2014 by Huntley (talk | contribs)



The UniProt GO Annotation project (UniProt-GOA) at the European Bioinformatics Institute has been a member of the GO Consortium since 2001. All UniProt curators annotate UniProtKB entries with Gene Ontology terms during the UniProt literature curation process, providing high-quality manual GO annotations and contributing to electronic GO annotation pipelines. Because UniProtKB is multi-species, the GO Annotation project is able to assist in the GO curation of proteins from over 450,000 taxonomic groups.

The core UniProt-GOA project staff are primarily responsible for supplying the GO Consortium with manual and electronic GO annotations for the human proteome. UniProt-GOA staff not only annotate manually but also coordinate and check the integration of GO annotations from other curation efforts at the EBI (including InterPro, IntAct and Reactome). The UniProt-GOA dataset is supplemented with manual annotations from 40 annotating groups, including all members of the GO Consortium, as well as a number of external groups that produce relevant functional data. Nine electronic annotation pipelines are incorporated into the UniProt-GOA dataset and provide the vast majority of annotations for non-model organisms. UniProt-GOA is therefore able to consolidate multiple sources of specialised knowledge, ensuring that the UniProt-GOA resource remains a key, up-to-date reference for a large number of research communities.


Claire O'Donovan

Maria Martin

Rachael Huntley*

Prudence Mutowo-Muellenet

Tony Sawford*

Aleksandra Shypitsyna

UniProt contributors (EBI, Hinxton, UK; SIB, Geneva, Switzerland; and PIR, Washington DC): Ioannis Xenarios, Lydie Bougueleret

Ghislaine Argoud-Puy, Andrea Auchincloss, Kristian Axelsen, Marie-Claude Blatter, Emmanuel Boutet, Lionel Breuza, Alan Bridge, Gayatri Chavali, Elena Cibrian-Uhalte, Elizabeth Coudert, Isabelle Cusin, Paula Duek Roggli, Anne Estreicher, Livia Famiglietti, Marc Feuermann, Arnaud Gos, Nadine Gruaz-Gumowski, Reija Hieta, Ursula Hinz, Chantal Hulo, Janet James, Florence Jungo, Guillaume Keller, Kati Laiho, Duncan Legge, Philippe Lemercier, Damien Lieberherr, Michele Magrane, Patrick Masson, Ivo Pedruzzi, Klemens Pichler, Diego Poggioli, Sylvain Poux, Catherine Rivoire, Bernd Roechert, Michel Schneider, Andre Stutz, Shyamala Sundaram, Michael Tognolli

* Funded entirely or partially by GO.

Annotation progress

Between 5 December 2013 and 3 March 2014, the UniProt-GOA project provided the GO Consortium with three annotation file releases. These included non-redundant sets of GO annotations for 13 specific proteomes, as well as data releases covering annotations for all proteins in UniProtKB.

UniProt incorporates manual annotations from other GO Consortium members and affiliates and displays these annotations in the relevant UniProtKB entries. Currently, the UniProt-GO Annotation project provides GO annotations for 65% of UniProt entries. Altogether, UniProt-GOA now provides over 246 million GO annotations for almost 35 million proteins in over 452,000 different taxonomic groups. UniProt-GOA provides 269,426 annotations for the 20,730 proteins in the human reference proteome.
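As a rough sanity check on the figures above, the mean number of annotations per protein can be worked out directly. This is only a sketch: the totals are the rounded values quoted in this report ("over 246 million", "almost 35 million"), and the variable and function names are illustrative rather than part of any UniProt-GOA tooling.

```python
# Figures as quoted in this report; the two totals are rounded there,
# so the overall mean is approximate.
human_annotations = 269_426
human_proteins = 20_730
total_annotations = 246_000_000  # "over 246 million"
total_proteins = 35_000_000      # "almost 35 million"


def mean_per_protein(annotations: int, proteins: int) -> float:
    """Return the mean number of annotations per protein."""
    return annotations / proteins


human_mean = mean_per_protein(human_annotations, human_proteins)
overall_mean = mean_per_protein(total_annotations, total_proteins)

print(f"human reference proteome: {human_mean:.1f} annotations per protein")
print(f"all of UniProt-GOA:       {overall_mean:.1f} annotations per protein")
```

On these numbers the human reference proteome carries roughly 13 annotations per protein, compared with roughly 7 across the whole dataset.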

Methods and strategies for annotation

a. Literature curation

In May 2013 we initiated the annotation of a list of approximately 400 human proteins that are targets of the Critical Assessment of Functional Annotation (CAFA) competition. We are curating the primary functions and processes of these proteins in order to populate these targets with functional annotations, which will assist in the assessment of the CAFA competition.

We continue to annotate proteins that are experimentally determined to be located in the extracellular vesicular exosome.

We have also re-curated annotations for necroptosis following ontology development in this area. This involved reviewing existing annotations and re-reading the publications to determine whether a more granular term could be used instead.

b. Computational annotation strategies

UniProt-GOA provides IEA annotations from the following methods:

  1. UniProt Keyword2GO (SPKW2GO) [1,2]
  2. UniProt Subcellular Location2GO (SPSL2GO) [1,2]
  3. UniPathway2GO [1,2]
  4. HAMAP2GO [1,2]
  5. InterPro2GO
  6. Ensembl Compara (vertebrates)
  7. Ensembl Genomes Compara (plants, fungi)


1: mapping tables created and maintained by UniProt

2: electronic annotations generated by UniProt

UniProt curators add information to entries that is subsequently used by electronic GO annotation pipelines such as UniProt Keyword2GO, UniProt Subcellular Location2GO and HAMAP2GO. Altogether, automatic annotation pipelines provide almost 245 million annotations for almost 35 million proteins.
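The split between electronic and manual annotations can be tallied from an annotation file by looking at the evidence code. The sketch below assumes a GAF 2.x tab-delimited file, in which the evidence code is the seventh column and comment lines start with "!". The function name and the sample records are invented for illustration; the accessions, GO IDs and references are placeholders, not real data. Treating every non-IEA code as "manual" is a simplification.

```python
from collections import Counter
from typing import Iterable

EVIDENCE_COLUMN = 6  # 0-based index of the evidence code column in GAF 2.x


def count_by_evidence(lines: Iterable[str]) -> Counter:
    """Count annotation lines as 'electronic' (IEA) or 'manual' (any other code)."""
    counts: Counter = Counter()
    for line in lines:
        # Skip blank lines and GAF header/comment lines.
        if not line.strip() or line.startswith("!"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= EVIDENCE_COLUMN:
            continue  # malformed line: too few columns
        kind = "electronic" if fields[EVIDENCE_COLUMN] == "IEA" else "manual"
        counts[kind] += 1
    return counts


def make_record(accession: str, evidence: str) -> str:
    """Build a placeholder GAF-style line (illustrative values only)."""
    return "\t".join([
        "UniProtKB", accession, "SYM", "", "GO:0000000", "REF:0",
        evidence, "", "F", "", "", "protein", "taxon:0", "20140303", "UniProt",
    ])


SAMPLE = [
    "!gaf-version: 2.0",
    make_record("X00001", "IEA"),
    make_record("X00002", "IDA"),
    make_record("X00003", "IEA"),
]

counts = count_by_evidence(SAMPLE)
print(counts["electronic"], "electronic;", counts["manual"], "manual")
```

To run the same tally over a real release, the file can be opened in text mode and the file object passed to `count_by_evidence` in place of `SAMPLE`.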

c. Priorities for annotation

1. Proteins associated with the exosome (Prudence, Aleksandra)

2. Proteins from the CAFA target list (all curators)

3. Requests from user community (all curators)

4. Proteins annotated during Swiss-Prot curation duties (all Swiss-Prot/UniProtKB curators at the EBI and SIB)

5. Annotation corrections based on quality control reports (all curators)

Presentations and Publications

a. Publications

b. Presentations including Talks, Tutorials and Teaching

c. Posters

Other highlights

A. Ontology development contributions

  • All curators continue to request new GO terms or updates to the ontology where necessary, using either TermGenie or the SourceForge tracker.

B. Annotation outreach and user advocacy efforts

C. Other highlights