GOA - September 2009 and March 2010


Gene Ontology Annotation at UniProtKB, 2010

Report on the GOA team's activities between September 2009 and March 2010.


Rolf Apweiler

Claire O'Donovan

Emily Dimmer

Rachael Huntley

Yasmin Alam-Faruque

Daniel Barrell

David Binns

Tony Sawford

Swiss-Prot contributors (EBI, Hinxton, UK and SIB, Geneva, Switzerland)

Ioannis Xenarios, Lydie Bougueleret, Alan Bridge, Sylvain Poux, Serenella Ferro-Rojas

Ghislaine Argoud-Puy, Andrea Auchinchloss, Kristian Axelsen, Marie-Claude Blatter, Emmanuel Boutet, Silvia Braconi Quintaje, Lionel Breuza, Paul Browne, Wei Mun Chan, Elizabeth Coudert, Isabelle Cusin, Louise Daugherty, Paula Duek Roggli, Ruth Eberhardt, Anne Estreicher, Livia Famiglietti, Marc Feuermann, Rebecca Foulger, Arnaud Gos, Nadine Gruaz-Gumowski, Ursula Hinz, Chantal Hulo, Janet James, Silvia Jimenez, Florence Jungo, Guillaume Keller, Kati Laiho, Duncan Legge, Philippe Lemercier, Damien Lieberherr, Michele Magrane, Madelaine Moinat, Patrick Masson, Ivo Pedruzzi, Catherine Rivoire, Bernd Roechert, Michel Schneider, Eleanor Stanley, Andre Stutz, Shyamala Sundaram, Michael Tognolli

Annotation Progress

We continue to put emphasis on the annotation of those genes selected for the Reference Genome Project.

In addition, with the newly started kidney-centric annotation project, additional emphasis has been placed on certain genes associated with renal development and disease.

Currently the curators from the GOA and BHF-UCL projects have completely annotated 75.6% (673 out of 890) of supplied Reference Genome Targets.

Contributions from the UniProtKB curation group:

[Figure: UniProt Sept Mar.png]

  • Decreases in annotation sets have largely resulted from additional filters, implemented in November, that remove annotations applying obsolete or secondary UniProtKB accessions.

Methods and strategies for annotation

  1. Literature curation:

Literature curation continues to be the major focus of our annotation efforts, with an emphasis on the use of experimental evidence codes.

  2. Computational annotation strategies:

GOA provides IEA annotations from the following methods:

  1. Swiss-Prot Keyword 2GO (SPKW2GO) [1,2]
  2. Swiss-Prot Subcellular Locations2GO (SPSL2GO) [1,2]
  3. HAMAP2GO [2]
  4. InterPro2GO [2]
  5. EC2GO [2]
  6. Ensembl Compara

[1] Mapping tables created and maintained by the GOA group.

[2] Electronic annotations generated by the GOA group, using UniProtKB.

  3. Priorities for annotation:
       1. Genes assigned by Reference Genome Project (Rachael, Emily)
       2. Genes associated with renal processes (Yasmin)
       3. Requests from user community (all curators)
       4. Proteins annotated during Swiss-Prot curation duties (all Swiss-Prot curators at the EBI and SIB)

Presentations and Publications

a. Papers with substantial GO content

The Renal Gene Ontology Annotation Initiative, Yasmin Alam-Faruque, Emily C. Dimmer, Rachael P. Huntley, Claire O’Donovan, Peter Scambler and Rolf Apweiler Organogenesis 2010 6 (2)

The Gene Ontology in 2010: extensions and refinements. Gene Ontology Consortium. Nucleic Acids Res. 2010 Jan;38(Database issue):D331-5.

b. Presentations including Talks and Tutorials and Teaching

November 2009: Hands-on training at EBI, Joint EBI-Wellcome Trust Proteomics workshop 'Introduction to GO and GOA' (Emily; GO/GOA talk and tutorial)

12th November, 12th February: GO annotation training of Swiss-Prot curators at the Swiss Institute of Bioinformatics, Geneva, Switzerland (Emily and Rachael)

4th March: GOA stand at the EBI Open Day (Emily, Rachael and Yasmin)

17th March: Perspectives in Stem Cell Proteomics Training Workshop, EBI Cambridge UK (Emily; GO/GOA talk and tutorial)

31st March: Plant Bioinformatics course, EBI Cambridge UK (Rachael; GO/GOA talk and tutorial)

Other Highlights

A. Ontology Development Contributions:

25th January: Kidney Development Ontology Content Meeting

This content meeting was attended by the GOA curators involved in annotation of renal development-specific genes across species: Yasmin Alam-Faruque (Renal GOA), Emily and Rachael (GOA), Rebecca Foulger (UniProt/Xenopus), Susan Tweedie (FlyBase) and Doug Howe (ZFIN). Also present were members of the GO editorial team (Midori Harris, David Hill) and international biomedical experts whose main research focus is renal developmental biology: Dr. Duncan Davidson, Prof. Jamie Davies (GUDMAP Consortium, Edinburgh), Prof. Adrian Woolf (Nephro-Urology, ICH-UCL, London) and Prof. Randall Thomas (Renal Physiome Project, CNRS France). The aim of the meeting was to ensure that discussions with the experts led to new terms that accurately represent each stage of nephrogenesis, and that the agreed definitions are applicable across all species.

x terms have been generated as a direct result of this content meeting. A short report on this meeting is being prepared for publication.

B. Annotation Outreach and User Advocacy Efforts:

Tufts University: Human Fetal Development annotation collaboration. The UniProtKB-GOA group is providing annotation support to Heather Wick, a curator from Tufts University, who is working as part of an NIH grant investigating proteins implicated in human fetal development (PI: Donna Slonim). Heather will use the UniProtKB-GOA protein2go curation tool, and her manual annotations will be released via the UniProtKB-GOA release pipelines into the UniProtKB and Human gene association files.

C. Other

Renal GO annotation initiative funded by Kidney Research UK. Requires Short list of activities carried out by Yasmin

Verification of mappings to UniProtKB accessions in GO Consortium gp2protein files

The GOA group continues to provide groups in the GO Consortium with checks of the UniProtKB accessions applied in gp2protein mapping files. Annotation groups receive an email indicating where in their file a secondary or deleted UniProtKB accession has been used. Where possible, this email also suggests suitable replacement UniProtKB accessions. The checks are run, and the results emailed to annotation groups, on the first of each month.
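As a rough illustration of the kind of check described above, the following sketch scans gp2protein lines and reports secondary or deleted accessions together with any known replacements. It is not the GOA group's actual pipeline. The function name, the two-column tab-separated layout with `UniProtKB:` prefixed identifiers, and the lookup tables passed in are all assumptions made for the example, and the accessions used in the usage note are placeholders.

```python
def check_gp2protein(lines, secondary_to_primary, deleted):
    """Report retired UniProtKB accessions found in gp2protein lines.

    Assumed layout (hypothetical for this sketch): two tab-separated
    columns, the second holding one or more 'UniProtKB:ACC' identifiers
    separated by ';'. Lines starting with '!' are treated as comments.

    Returns a list of (line_number, accession, status, replacements).
    """
    problems = []
    for line_no, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line or line.startswith("!"):
            continue
        fields = line.split("\t")
        if len(fields) < 2:
            continue
        for xref in fields[1].split(";"):
            db, _, accession = xref.strip().partition(":")
            if db != "UniProtKB":
                continue
            if accession in secondary_to_primary:
                # Secondary accession: suggest the current primary accession(s).
                problems.append(
                    (line_no, accession, "secondary", secondary_to_primary[accession])
                )
            elif accession in deleted:
                # Deleted accession: no replacement can be suggested.
                problems.append((line_no, accession, "deleted", []))
    return problems
```

Calling `check_gp2protein(lines, {"Q00002": ["P00009"]}, {"Q00003"})` on a file that uses both placeholder accessions would return one tuple per offending line, which could then be formatted into the monthly email.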

Since the 30th of November 2009, gene association files released by the UniProtKB group have excluded annotation lines that apply secondary or obsolete UniProtKB accession numbers. As a result, 3,929 (4.5%) of the manual annotations integrated from external annotation groups have been removed from UniProtKB-GOA files.
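A filter of this kind can be sketched in a few lines. The version below is only an illustration under stated assumptions: the function name and the `retired_accessions` set are hypothetical, and the code assumes tab-separated gene association lines with the accession in column 2 and comment lines starting with `!`.

```python
def filter_retired_accessions(gaf_lines, retired_accessions):
    """Drop annotation lines whose column 2 accession is retired.

    'retired_accessions' is a hypothetical set of secondary or obsolete
    UniProtKB accessions. Comment lines (starting with '!') are kept.

    Returns (kept_lines, number_of_lines_removed).
    """
    kept = []
    removed = 0
    for line in gaf_lines:
        if line.startswith("!"):
            kept.append(line)
            continue
        columns = line.rstrip("\n").split("\t")
        if len(columns) > 1 and columns[1] in retired_accessions:
            removed += 1
        else:
            kept.append(line)
    return kept, removed
```

The removed count, divided by the number of annotation lines read, gives the kind of percentage quoted above.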

QuickGO browser

The browser has been substantially redeveloped over the last couple of months, with major changes to the interface and to the data and facilities offered by the tool.

Changes to UniProtKB-GOA gene association files

18th December 2009: UniProtKB-GOA gene association files included manual annotations created by the EcoCyc and EcoWiki groups for the first time.

PDB gene association file

Work has been carried out so that annotations can be provided to individual PDB chains from both manual and electronic annotation resources from InterPro and UniProtKB-GOA. The new PDB file will supply manual and electronic annotations to PDB chains where the chain matches 80% of the length of the UniProtKB accession. InterPro2GO annotations are supplied to all PDB chains based on matches between InterPro domains and PDB entries. This new association file will be released in April 2010. It will provide the primary source of GO annotations for PDB, and the annotations will be displayed on the PDBe website.
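The 80% length criterion could be expressed along the following lines. This is a minimal sketch, not the method actually used for the PDB file: it assumes the match is measured as a single contiguous residue range of the UniProtKB sequence covered by the chain, and that "matches 80%" means at least 80%. The function and parameter names are hypothetical.

```python
def chain_covers_protein(match_start, match_end, protein_length, threshold_percent=80):
    """Return True if a PDB chain covers enough of a UniProtKB sequence.

    match_start and match_end are 1-based, inclusive positions on the
    UniProtKB sequence (an assumed representation of the match).
    Integer arithmetic avoids floating-point rounding at the boundary.
    """
    if protein_length <= 0:
        raise ValueError("protein_length must be positive")
    if match_end < match_start:
        raise ValueError("match_end must not precede match_start")
    covered = match_end - match_start + 1
    return covered * 100 >= threshold_percent * protein_length
```

Under these assumptions, a chain covering residues 1 to 80 of a 100-residue protein would receive the UniProtKB annotations, while one covering residues 21 to 90 would not.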

Changes to column 1 contents

The contents of column 1 in UniProtKB-GOA gene association files are intended to change at our next monthly release (April 2010). Currently column 1 displays the value 'UniProtKB/TrEMBL' or 'UniProtKB/Swiss-Prot' to indicate the section of UniProtKB to which an accession belongs (please see our readme for further information on our annotation format: http://www.ebi.ac.uk/GOA/goaHelp.html). However, these two different UniProtKB descriptions have caused processing issues for GO Consortium tools. We therefore intend to change column 1 to display 'UniProtKB' consistently for all UniProtKB accessions, and to provide the information on whether a protein belongs to the UniProtKB Swiss-Prot or TrEMBL set in a tab-separated, supplementary gene product information file that will be released alongside the UniProtKB-GOA gene association files.
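For groups that want to adapt existing files or parsers ahead of the change, the transformation can be sketched as below. This is an illustration only: the function name is hypothetical, and the two-column layout used here for the supplementary information (accession, then section) is an assumption, since the format of the supplementary file is not specified above.

```python
def split_database_column(gaf_lines):
    """Rewrite column 1 to 'UniProtKB' and collect the section per accession.

    Returns (rewritten_lines, info_lines), where info_lines are
    'accession<TAB>section' strings in an assumed, illustrative layout.
    Comment lines (starting with '!') are passed through unchanged.
    """
    rewritten = []
    sections = {}
    for line in gaf_lines:
        line = line.rstrip("\n")
        if line.startswith("!"):
            rewritten.append(line)
            continue
        columns = line.split("\t")
        database, separator, section = columns[0].partition("/")
        if database == "UniProtKB" and separator and len(columns) > 1:
            # Remember Swiss-Prot or TrEMBL membership for column 2's accession.
            sections[columns[1]] = section
            columns[0] = "UniProtKB"
        rewritten.append("\t".join(columns))
    info_lines = [f"{acc}\t{sec}" for acc, sec in sorted(sections.items())]
    return rewritten, info_lines
```

Lines that already carry a plain 'UniProtKB' in column 1 pass through unchanged, so the function is safe to run on files produced before or after the switch.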

Changing to GAF2.0

Supplying a supplementary gene product information file