GOA December 2012: Difference between revisions

From GO Wiki
Revision as of 08:08, 11 December 2012

Gene Ontology Annotation at UniProt Summary, December 2012

Overview

Staff:

Rolf Apweiler

Claire O'Donovan

Emily Dimmer(1)

Rachael Huntley*

Yasmin Alam-Faruque

Prudence Mutowo-Muellenet

Tony Sawford*

Swiss-Prot contributors (EBI, Hinxton, UK; SIB, Geneva, Switzerland; and PIR, Washington DC): Ioannis Xenarios, Lydie Bougueleret

Ghislaine Argoud-Puy, Andrea Auchinchloss, Kristian Axelsen, Marie-Claude Blatter, Emmanuel Boutet, Lionel Breuza, Alan Bridge, Wei Mun Chan, Gayatri Chavali, Elizabeth Coudert, Isabelle Cusin, Paula Duek Roggli, Anne Estreicher, Livia Famiglietti, Marc Feuermann, Arnaud Gos, Nadine Gruaz-Gumowski, Reija Hieta, Ursula Hinz, Chantal Hulo, Janet James, Florence Jungo, Guillaume Keller, Kati Laiho, Duncan Legge, Philippe Lemercier, Damien Lieberherr, Michele Magrane, Patrick Masson, Ivo Pedruzzi, Klemens Pichler, Diego Poggioli, Sylvain Poux, Catherine Rivoire, Bernd Roechert, Michel Schneider, Andre Stutz, Shyamala Sundaram, Michael Tognolli

* Funded entirely or partially by GO

(1) Left July 2012

Annotation Progress

Between January 2012 and November 2012, the UniProt-GOA project provided the GO Consortium with 11 annotation file releases. These included non-redundant sets of GO annotations for 13 specific proteomes, as well as data releases covering annotations for all proteins in UniProtKB.

UniProt incorporates manual annotations from other GO Consortium members and affiliates and displays these annotations in the relevant UniProtKB entries. Currently, the UniProt-GO Annotation project provides GO annotations for 68% of UniProt entries. Altogether, UniProt-GOA now provides almost 127 million GO annotations for almost 19 million proteins in over 370,000 different taxonomic groups. UniProt-GOA provides 354,486 annotations for the human proteome.


Methods and strategies for annotation

Literature curation:

The renal annotation project, funded by Kidney Research UK and under the direction of Yasmin Alam-Faruque, has been very successful. The project ended in April 2012 and resulted in 43,858 annotations for 2,810 proteins. As a result of the renal project, over 600 new GO terms were created, allowing curators to choose much more specific terms when annotating kidney function and process. A paper summarizing the project is in preparation.

During 2012, Prudence Mutowo-Muellenet completed a project to annotate all of the proteins in the human peroxisome. The project has enabled us to provide a list of 88 proteins that have been experimentally shown to localize to the peroxisome. These proteins have been given full functional annotation using the available literature, resulting in a total of 218 manual annotations for this set of proteins. A further 296 proteins were partly annotated during the process, leading to a total of 1,589 annotations. A paper describing this project has been submitted for publication. Prudence has now started a similar project to annotate proteins present in the exosome.

Computational annotation strategies:

UniProt-GOA provides IEA annotations from the following methods:

  1. UniProt Keyword2GO (SPKW2GO) [1,2]
  2. UniProt Subcellular Location2GO (SPSL2GO) [1,2]
  3. UniPathway2GO [1,2]
  4. HAMAP2GO [1,2]
  5. InterPro2GO
  6. Ensembl Compara (vertebrates)
  7. Ensembl Genomes Compara (plants, fungi)


Key

1: mapping tables created and maintained by UniProt

2: electronic annotations generated by UniProt

UniProt curators add information to entries that is subsequently used by electronic GO annotation pipelines such as UniProt Keyword2GO, UniProt Subcellular Location2GO and HAMAP2GO. Altogether, automatic annotation pipelines provide 125 million annotations to almost 19 million proteins.
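As a rough illustration of how a mapping-table pipeline of this kind can work, the sketch below turns curated keywords into IEA annotations. It is not the UniProt implementation: the table contents, the accessions `PROT_A` and `PROT_B`, and the function name `keyword_annotations` are all hypothetical, and the real mapping tables are far larger.

```python
# Illustrative keyword-to-GO table (assumption: NOT the actual UniProt mapping table).
KEYWORD2GO = {
    "Peroxisome": ["GO:0005777"],  # peroxisome
    "Kinase": ["GO:0016301"],      # kinase activity
}


def keyword_annotations(entry_keywords, mapping=KEYWORD2GO):
    """Return sorted (accession, go_id, evidence) tuples.

    entry_keywords maps a protein accession to the keywords a curator
    assigned to that entry. Keywords without a mapping yield nothing.
    Every generated annotation carries the evidence code IEA.
    """
    annotations = set()
    for accession, keywords in entry_keywords.items():
        for keyword in keywords:
            for go_id in mapping.get(keyword, []):
                annotations.add((accession, go_id, "IEA"))
    return sorted(annotations)


# Hypothetical entries: PROT_B has only an unmapped keyword.
entries = {
    "PROT_A": ["Peroxisome", "Kinase"],
    "PROT_B": ["Membrane"],
}
result = keyword_annotations(entries)
```

Because the annotations are derived from the keyword list, a curator's change to an entry flows through to its electronic GO annotations the next time the pipeline runs.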

Two new automatic pipelines were incorporated by UniProt in 2012. UniPathway2GO, a collaboration between UniProt, INRIA (Rhone-Alpes) and Laboratoire d'Ecologie Alpine (Grenoble), provides GO annotations describing the metabolic pathways in which proteins are involved; it was initiated in May 2012. The second pipeline, which uses orthology data from Ensembl Compara to project GO annotations between fungal proteins, was initiated by the Ensembl Genomes group and incorporated into the UniProt gene association file in July 2012.
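The idea of projecting annotations across orthologs can be sketched as follows. This is a simplified illustration only: the accessions and the function name `project_annotations` are hypothetical, and whatever filtering criteria the real Ensembl Genomes pipeline applies are not described in this summary and are not modelled here.

```python
def project_annotations(source_annotations, ortholog_pairs):
    """Copy GO identifiers from annotated proteins to their orthologs.

    source_annotations: dict mapping accession -> set of GO identifiers.
    ortholog_pairs: iterable of (source_accession, target_accession).
    Returns a dict mapping each target accession to the set of GO
    identifiers projected onto it. Sources with no annotations
    contribute nothing.
    """
    projected = {}
    for source, target in ortholog_pairs:
        for go_id in source_annotations.get(source, ()):
            projected.setdefault(target, set()).add(go_id)
    return projected


# Hypothetical data: two annotated source proteins, two targets.
annotated = {
    "SRC_1": {"GO:0005777"},
    "SRC_2": {"GO:0016301"},
}
pairs = [("SRC_1", "TGT_1"), ("SRC_1", "TGT_2"), ("SRC_2", "TGT_2")]
projected = project_annotations(annotated, pairs)
```

In this toy example `TGT_2` receives terms from both of its orthologs, while `TGT_1` receives only the term from `SRC_1`.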

UniProt-GOA now maintains an annotation blacklist, which lists UniProtKB accessions together with the GO identifiers they should not be associated with. This is especially useful for suppressing incorrect annotations made by electronic methods that predict GO terms for groups of proteins, where the term may not be correct for every member of the group. For example, some electronic annotation sources apply a cut-off such that if an annotation is correct for 95% of the proteins in a set, the GO annotation is added to the whole set.
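A minimal sketch of how such a blacklist could be applied is shown below. The data layout (a set of accession and GO identifier pairs), the accessions, and the function name `apply_blacklist` are assumptions made for illustration; the actual blacklist format used by UniProt-GOA is not described here.

```python
def apply_blacklist(annotations, blacklist):
    """Drop annotations whose (accession, go_id) pair is blacklisted.

    annotations: list of (accession, go_id, evidence) tuples.
    blacklist: set of (accession, go_id) pairs that must not be asserted.
    The order of the remaining annotations is preserved.
    """
    return [
        annotation
        for annotation in annotations
        if (annotation[0], annotation[1]) not in blacklist
    ]


# A group-level prediction assigns the same term to every member of a set,
# although it is known to be wrong for the hypothetical protein PROT_C.
group_annotations = [
    ("PROT_A", "GO:0016301", "IEA"),
    ("PROT_B", "GO:0016301", "IEA"),
    ("PROT_C", "GO:0016301", "IEA"),
]
blacklist = {("PROT_C", "GO:0016301")}
filtered = apply_blacklist(group_annotations, blacklist)
```

Keying the blacklist on the pair rather than on the accession alone means that other, correct annotations for `PROT_C` would be unaffected.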

Priorities for annotation

1. Proteins associated with the exosome (Prudence)

2. Requests from user community (all curators)

3. Proteins annotated during Swiss-Prot curation duties (all Swiss-Prot/UniProtKB curators at the EBI and SIB)

Presentations and Publications

a. Publications

The UniProt Consortium. Reorganizing the protein space at the Universal Protein Resource (UniProt) 2012 Nucleic Acids Res 40 (Database issue): D71-D75. PMCID: PMC3245120

Dimmer EC, Huntley RP, Alam-Faruque Y, Sawford T, O’Donovan C, Martin MJ et al. The UniProt-GO Annotation database in 2011. 2012 Nucleic Acids Res 40 (Database issue): D565-570. PMCID: PMC3245010

b. Presentations, including talks, tutorials and teaching

Huntley RP. Introduction to the Gene Ontology and GO annotation resources. 15 Jan 2012 Plant and Animal Genome Conference, San Diego, CA USA (Talk)

Mutowo-Muellenet P. UniProt-GOA activities update. 12 July 2012 Hinxton Scientific Forum, Wellcome Trust Genome Campus, Hinxton UK (Talk)

Mutowo-Muellenet P. Peroxisome annotation enrichment. 4 Sept 2012 EBI Ontology Workshop, Wellcome Trust Genome Campus, Hinxton UK (Talk)

c. Poster presentations

Alam-Faruque Y. The UniProt-GOA project. 2 Apr 2012 International Society for Biocuration Conference, Washington DC USA (Poster)

Mutowo-Muellenet P. The UniProt-GOA project. 8 Sept 2012 The first 10 years of UniProt ECCB12 Satellite Symposium (Poster)

Mutowo-Muellenet P. The UniProt-GOA project. 20 Nov 2012 EBI Open Day, Wellcome Trust Genome Campus, Hinxton UK (Poster)


Ontology Development Contributions:

Annotation Outreach and User Advocacy Efforts:

Other Highlights: