WormBase December 2013


Overview:

Staff:

[please include FTEs working on GOC tasks, designating as well how many FTEs are funded by the GOC NHGRI grant]

Paul Sternberg, PI, WormBase, GO

Juancarlos Chan, Developer, WormBase

James Done, Developer, Textpresso

Ranjana Kishore, Curator

Yuling Li, Developer, Textpresso

Hans Michael Mueller, PI, Textpresso

Daniela Raciti, Curator

Kimberly Van Auken, Curator


* Funded entirely or partially by GO

Annotation Progress

WormBase GO Annotation Statistics as of December 2013

Table 1: Number of Genes Annotated

Type of Annotation Genes Annotated, Dec 2013 % Change from Dec 2012 Number of Unique GO Terms % Change from Dec 2012 Total Number of Annotations % Change from Dec 2012
Manual Annotation 2,999 +15.3% 14,088 +5.2%
Phenotype2GO Mappings
IEA/Electronic
Total

Methods and strategies for annotation

Literature curation:

Curation of the primary literature continues to be the major focus of our manual annotation efforts.

Semi-automated curation using the Textpresso information retrieval system

We also routinely employ the Textpresso information retrieval system for semi-automated curation of GO Cellular Component and Molecular Function annotations.

Computational annotation strategies:

Our computational annotation strategies include mapping genes to GO terms using InterPro domains and mapping genes to Biological Process terms based upon parallel annotations to the Worm Phenotype Ontology (Phenotype2GO). These methods are performed automatically as part of the WormBase database build.
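The Phenotype2GO step described above can be pictured as a simple join between two mappings. The following is a minimal sketch only, not the WormBase build code: the function name, the in-memory dictionaries, and all identifiers are hypothetical placeholders, and the actual pipeline may apply additional filtering.

```python
from collections import defaultdict


def phenotype2go(gene_to_phenotypes, phenotype_to_go):
    """Infer GO terms for genes from their phenotype annotations.

    gene_to_phenotypes: {gene ID: [phenotype term IDs]}
    phenotype_to_go:    {phenotype term ID: [GO Biological Process term IDs]}
    Returns {gene ID: sorted list of inferred GO term IDs}.
    Genes whose phenotypes have no GO mapping are omitted.
    """
    inferred = defaultdict(set)
    for gene, phenotypes in gene_to_phenotypes.items():
        for phenotype in phenotypes:
            # A phenotype without a mapping contributes nothing.
            for go_id in phenotype_to_go.get(phenotype, ()):
                inferred[gene].add(go_id)
    return {gene: sorted(terms) for gene, terms in inferred.items()}


# Placeholder identifiers for illustration; these are not real annotations.
gene_to_phenotypes = {
    "gene-1": ["pheno-A", "pheno-B"],
    "gene-2": ["pheno-C"],
}
phenotype_to_go = {
    "pheno-A": ["GO:bp-x"],
    "pheno-B": ["GO:bp-x", "GO:bp-y"],
}

result = phenotype2go(gene_to_phenotypes, phenotype_to_go)
```

Using a set per gene means that a GO term reached through several phenotypes is recorded once, which mirrors the idea of one inferred annotation per gene and term pair.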

Note that during the past year, we stopped using an automated pipeline that mapped genes to GO:0016021, integral to membrane, based on the results of the transmembrane prediction algorithm TMHMM. These IEA annotations had no external database identifier in the With/From column and were therefore not consistent with GO annotation practices.
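A check for this kind of inconsistency could look like the sketch below. It assumes annotations are available as tab-delimited GAF lines, in which the evidence code is column 7 and With/From is column 8; the function name and the sample records (gene IDs, symbols, references, and the With/From value) are hypothetical placeholders.

```python
def iea_missing_with(gaf_lines):
    """Return (object ID, GO ID) pairs for IEA annotations with an empty With/From."""
    flagged = []
    for line in gaf_lines:
        # Skip GAF header/comment lines and blank lines.
        if line.startswith("!") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 8:
            continue  # too short to hold a With/From column
        evidence, with_from = cols[6], cols[7]
        if evidence == "IEA" and not with_from.strip():
            flagged.append((cols[1], cols[4]))
    return flagged


# Placeholder records for illustration; these are not real annotations.
sample = [
    "!gaf-version: 2.0",
    "\t".join(["WB", "gene-1", "sym-1", "", "GO:0016021",
               "REF:placeholder", "IEA", "", "C"]),
    "\t".join(["WB", "gene-2", "sym-2", "", "GO:0016021",
               "REF:placeholder", "IEA", "DB:placeholder", "C"]),
]

missing = iea_missing_with(sample)
```

Here only the first record is flagged, because its With/From column is empty while its evidence code is IEA.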

Curation strategies

Priorities for annotation

Selection of genes for annotation is guided by several criteria:

  • Publication of newly characterized genes
  • C. elegans genes orthologous to human disease genes
  • Genes identified in the Textpresso-based curation pipelines
  • Re-annotation of genes associated with now obsolete GO terms or new ontology developments
  • Annotation of gene sets involved in specific biological processes as part of a pilot project at WormBase to coordinate topic-based curation across all data types.
    • The first topic annotated in this manner was the endoplasmic reticulum unfolded protein response.

Presentations and Publications

a. Papers with substantial GO content

  • Balakrishnan R, Harris MA, Huntley R, Van Auken K, Cherry JM. A guide to best practices for Gene Ontology (GO) manual annotation. Database (Oxford). 2013 Jul 9;2013:bat054. doi: 10.1093/database/bat054. PMID:23842463

b. Presentations including Talks and Tutorials and Teaching

c. Poster presentations

Other Highlights:

A. Migration to UniProt's Protein2GO Tool

  • As part of the migration to a common annotation framework, WormBase completed its round-trip data migration from UniProt's Protein2GO annotation tool to the WormBase database.

B. Natural Language Processing Tools for GO Curation

  • WormBase completed development of a new Textpresso for Cellular Component Curation (CCC) tool that includes new features such as autocompletion of GO terms, mapping of gene names and synonyms in text to MOD and UniProtKB IDs, and enhanced search capabilities of sentence source files and annotations. Most importantly, the new CCC tool and Protein2GO are now fully integrated: annotations made with the CCC tool are automatically sent to Protein2GO via web services.
  • The Protein2GO tool now contains a Literature Search link that allows curators to perform keyword searches on nine different Textpresso corpora from within Protein2GO.

C. Ontology Development Contributions:

  • WB contributions to ontology development:
    • Cellular Component
      • lysosome-related organelle
      • gut granule
      • gut granule lumen
      • gut granule membrane
      • amino acid transport complex
    • Biological Process
      • receptor localization to nonmotile primary cilium
      • stress response to copper ion
      • stress response to cadmium ion
      • positive regulation of transcription from RNA polymerase II promoter in response to reactive oxygen species
      • positive regulation of transcription from RNA polymerase II promoter in response to superoxide
      • L-lysine transport
      • L-arginine transport
      • L-histidine transport
      • dense core granule transport
      • early endosome to recycling endosome transport

D. Annotation Outreach and User Advocacy Efforts:

  • Ranjana Kishore - AmiGO2 working group
  • Kimberly Van Auken continues to serve on the GO-help rota.

E. Other Highlights:

  • PAINT?
  • BioCreative?
