dictyBase December 2014
PI: Rex Chisholm
Annotators: Petra Fey, Robert Dodson
Developers: Siddhartha Basu, David Jimenez-Morales
All dictyBase staff contribute to GO activities. This currently amounts to a total of 3 FTE positions, of which the GO grant provides sufficient funding to support 0.6 FTE.
Gene Ontology annotation is integral to the curation process at dictyBase. Annotation of gene products to GO terms is done concurrently with curation of literature, strains, phenotypes, and general nomenclature. Both curators work to annotate gene products of the Dictyostelium genome.
At dictyBase we have used the Protein2GO tool from the EBI to annotate GO for Dictyostelium proteins since June 2012. Since February 2013, we have re-imported annotations monthly from GOA back into dictyBase, including electronic annotations. We then append our GO annotations to RNAs and send the GAF file to the GO consortium.
The use of Protein2GO continues to be of great advantage: it enables us to react to changing annotation practices without a direct impact on our database. For example, dictyBase curators regularly add annotation extensions in Protein2GO although they cannot yet be stored or displayed at dictyBase. We expect to be able to provide annotation extensions to our users in 2015.
Semi-automated annotation: We have been working towards using Textpresso to suggest GO cellular component terms for annotation (Van Auken et al., BMC Bioinformatics 2009, 10:228). We participated in the BioCreative Workshop Track III in 2012, for which we compared purely manually curated cellular component annotations with Textpresso-assisted annotations. However, implementation at dictyBase was delayed because of the GO consortium's wise decision to connect Textpresso annotations directly to Protein2GO. We tested the updated Textpresso annotation tool and provided feedback to Kimberly Van Auken. We also provide a GPI file, updated monthly, containing all dictyBase protein names and identifiers to support the text mining. This December, we started incorporating Textpresso into our curation workflow.
- Petra coordinates all dictyBase GO annotation issues concerning Protein2GO with the EBI and attends the GO meetings. She also provided detailed information on our experience with Protein2GO for the EBI case study conducted by Neal Beagrie.
- Robert and Petra are working with Kimberly Van Auken on the semi-automatic CC annotations using Textpresso.
- Siddhartha is working on implementing the GPAD and GPI files for the pipelines between the EBI, dictyBase and the GO consortium.
- David provides a GPI file to Textpresso, updated monthly.
Other dictyBase contributions to GO
Both dictyBase curators work with GO editors and other curators in the field to improve the GO, and contribute to discussions on the GO email list, in the bi-weekly annotation calls, and on SourceForge. When necessary, they use TermGenie or SourceForge to create new GO terms. In 2014, 14 new GO terms were added.
dictyBase is currently working on a complete database and software overhaul. Once that is completed, we will use GPAD/GPI files instead of GAF in our Protein2GO-dictyBase-GO consortium pipeline.
The total and non-IEA annotation numbers include annotations that are not from dictyBase, mostly from the PAINT project with a few from IntAct and Swiss-Prot. At this point we cannot easily query them by source. However, the experimental annotations are nearly all from dictyBase and the indicated annotation extensions (column 16) have been exclusively annotated by dictyBase curators (Table 3).
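As an illustration of how counts of this kind can be derived from a GAF 2.0 file, the following is a minimal Python sketch. It is not the pipeline used at dictyBase. The function name `summarize_gaf` is hypothetical, and treating "EXP annotations" as the whole group of experimental evidence codes (EXP, IDA, IPI, IMP, IGI, IEP) is an assumption. The column positions follow the GAF 2.0 layout: evidence code in column 7, "Assigned By" in column 15, annotation extension in column 16.

```python
from collections import Counter

# Assumption: "experimental" means this group of GO evidence codes,
# not only annotations carrying the literal EXP code.
EXPERIMENTAL_CODES = {"EXP", "IDA", "IPI", "IMP", "IGI", "IEP"}


def summarize_gaf(lines):
    """Count annotations in GAF 2.0 text given as an iterable of lines.

    Returns (counts, by_source): overall tallies and a tally per
    "Assigned By" value (GAF column 15).
    """
    counts = Counter()
    by_source = Counter()
    for line in lines:
        # Skip blank lines and header/comment lines, which start with "!".
        if not line.strip() or line.startswith("!"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 15:
            continue  # not a complete annotation row
        evidence = cols[6]       # column 7: evidence code
        assigned_by = cols[14]   # column 15: source of the annotation
        extension = cols[15] if len(cols) > 15 else ""  # column 16
        counts["total"] += 1
        if evidence != "IEA":
            counts["non_iea"] += 1
        if evidence in EXPERIMENTAL_CODES:
            counts["experimental"] += 1
        if extension.strip():
            counts["with_extension"] += 1
        by_source[assigned_by] += 1
    return counts, by_source
```

Called on an open file handle (for example `summarize_gaf(open(path))`, where `path` points to a local copy of the GAF file), the second return value gives a rough per-source breakdown based on the "Assigned By" column.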
Table 1: Number of Annotations
|Total number of annotations||57925||62961||+ 8.7%|
Table 2: Number of non-IEA Annotations
|Total number of annotations||20367||24983||+ 22.7%|
Table 3: Additional Numbers
|Total EXP annotations||