DictyBase December 2014

From GO Wiki
Revision as of 11:50, 11 December 2014 by Pfey03

dictyBase, December 2014

Staff:

PI: Rex Chisholm

Annotators: Petra Fey, Robert Dodson

Developers: Siddhartha Basu, David Jimenez-Morales

All dictyBase staff contribute to GO activities. This is currently a total of 3 FTE positions. Of these, the GO grant provides sufficient funding to support 0.6 FTEs.

Annotation:

Gene Ontology annotation is integral to the curation process at dictyBase. Annotation of gene products to GO terms is done concurrently with curation of literature, strains, phenotypes, and general nomenclature. Both curators work to annotate gene products of the Dictyostelium genome.

At dictyBase we have used the Protein2GO tool from the EBI to annotate GO for Dictyostelium proteins since June 2012. Since February 2013, we have re-imported annotations monthly from GOA back into dictyBase, including electronic annotations. We then append our GO annotations to RNAs and send the GAF file to the GO consortium.

The use of Protein2GO continues to be of great advantage: it enables us to react to changing annotation practices without a direct impact on our database. For example, dictyBase curators regularly add annotation extensions in Protein2GO although they cannot yet be stored or displayed at dictyBase. We expect to be able to provide annotation extensions to our users in 2015.

Semi-automated annotation: We have been working towards using Textpresso to suggest GO terms for annotation to cellular component terms (Van Auken et al., BMC Bioinformatics 2009, 10:228). We participated in the BioCreative Workshop Track III in 2012, for which we compared purely manually curated cell component annotations with Textpresso-assisted annotations. However, implementation at dictyBase was delayed because of the GO consortium's wise decision to connect Textpresso annotations directly to Protein2GO. We tested the updated Textpresso annotation tool and provided feedback to Kimberly Van Auken. We also provide a monthly updated GPI file containing all dictyBase protein names and identifiers to support the text mining. This December, we started incorporating Textpresso into our curation workflow.

  • Petra coordinates all dictyBase GO annotation issues regarding Protein2GO with the EBI and attends the GO meetings. She also provided detailed information on our experience with Protein2GO for the EBI case study conducted by Neal Beagrie.
  • Robert and Petra are working with Kimberly Van Auken on the semi-automatic CC annotations using Textpresso.
  • Siddhartha is working on implementing the GPAD and GPI files for the pipelines between the EBI, dictyBase and the GO consortium.
  • David provides a GPI file to Textpresso, updated monthly.

Other dictyBase contributions to GO

Both dictyBase curators work to improve the GO with GO editors and other curators in the field, and contribute to discussions on the GO email list, in the bi-weekly annotation calls, and on Sourceforge. When necessary, they use TermGenie or Sourceforge to create new GO terms. In 2014, 14 new GO terms were added.

dictyBase is currently working on a complete database and software overhaul. Once that is completed we will use GPAD/GPI files instead of GAF in our Protein2GO-dictyBase-GO consortium pipeline.

Annotation Progress

The total and non-IEA annotation numbers include annotations that are not from dictyBase: most of these come from the PAINT project, with a few from INTACT and SwissProt. At this point we cannot easily query them by source. However, the experimental annotations are nearly all from dictyBase, and the indicated annotation extensions (column 16) have been exclusively annotated by dictyBase curators (Table 3).
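As an illustration of how counts like those in the tables below could be derived from a GAF file, here is a minimal Python sketch. It is not the dictyBase pipeline. It assumes the GAF 2.x tab-delimited layout (evidence code in column 7, aspect in column 9, Assigned_By in column 15, annotation extension in column 16), and it assumes that "experimental" means the evidence codes EXP, IDA, IPI, IMP, IGI and IEP. The function name `tally_gaf` and the helper `make_row` are hypothetical.

```python
from collections import Counter

# 0-based indexes into a GAF 2.x row (assumed layout, see lead-in).
EVIDENCE_COL = 6       # column 7: evidence code, e.g. IEA, IDA, IMP
ASPECT_COL = 8         # column 9: aspect, one of F, P, C
ASSIGNED_BY_COL = 14   # column 15: group that made the annotation
EXTENSION_COL = 15     # column 16: annotation extension (may be empty)

# Assumption: these codes are what the report counts as "EXP".
EXPERIMENTAL = {"EXP", "IDA", "IPI", "IMP", "IGI", "IEP"}


def tally_gaf(lines):
    """Count annotations by aspect and source, plus EXP and extension totals."""
    counts = {
        "total": Counter(),      # all annotations, keyed by aspect
        "non_iea": Counter(),    # annotations whose evidence is not IEA
        "by_source": Counter(),  # annotations keyed by Assigned_By
        "experimental": 0,
        "with_extension": 0,
    }
    for line in lines:
        # Skip blank lines and '!' header/comment lines.
        if not line.strip() or line.startswith("!"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) <= ASSIGNED_BY_COL:
            continue  # too few columns to be a usable annotation row
        aspect = cols[ASPECT_COL]
        evidence = cols[EVIDENCE_COL]
        counts["total"][aspect] += 1
        if evidence != "IEA":
            counts["non_iea"][aspect] += 1
        if evidence in EXPERIMENTAL:
            counts["experimental"] += 1
        counts["by_source"][cols[ASSIGNED_BY_COL]] += 1
        if len(cols) > EXTENSION_COL and cols[EXTENSION_COL]:
            counts["with_extension"] += 1
    return counts


def make_row(evidence, aspect, source, extension=""):
    """Build a 17-column row with only the fields used above filled in."""
    cols = [""] * 17
    cols[EVIDENCE_COL] = evidence
    cols[ASPECT_COL] = aspect
    cols[ASSIGNED_BY_COL] = source
    cols[EXTENSION_COL] = extension
    return "\t".join(cols)
```

With a real file, `tally_gaf(open(path))` would do the counting; `path` stands for whichever GAF export is at hand. The `by_source` counter is the part that addresses querying by source, since it groups rows by the Assigned_By column.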

Table 1: Number of Annotations

                               2012     2013   % Change
Total number of annotations   56475    57925     + 2.6%
Function                      22166    21968     - 0.9%
Process                       18600    19581     + 5.3%
Component                     15709    16376     + 4.2%


Table 2: Number of non-IEA Annotations

                               2012     2013   % Change
Total number of annotations   18604    20367     + 9.5%
Function                       4566     4961     + 8.7%
Process                        8471     9248     + 9.2%
Component                      5567     6158    + 10.6%


Table 3: Additional Numbers

                               2012     2013   % Change
Total EXP annotations          4756     5598      + 18%
Annotation extensions            66      187     + 183%
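
The "% Change" columns in Tables 1 to 3 follow from the two yearly counts. As a small worked example, the Python sketch below recomputes the change for a few rows; the function name `percent_change` is ours, and the counts are copied from the tables.

```python
def percent_change(old, new):
    """Relative change from old to new, expressed in percent."""
    return (new - old) / old * 100.0


# (first-year count, second-year count) pairs copied from Tables 1 to 3.
rows = {
    "Total annotations": (56475, 57925),
    "Function": (22166, 21968),
    "Total non-IEA annotations": (18604, 20367),
    "Total EXP annotations": (4756, 5598),
    "Annotation extensions": (66, 187),
}

for label, (old, new) in rows.items():
    print(f"{label}: {percent_change(old, new):+.1f}%")
```

Rounded to the precision used in the tables, these rows reproduce the reported values, including the small decrease for Function annotations.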