PomBase December 2014: Difference between revisions

From GO Wiki
Revision as of 08:23, 11 December 2014

PomBase Summary, 2013/2014

Staff

Three PomBase curation staff (V. Wood, M. Harris, A. Lock) are working on GO-related tasks; none is funded by the GO grant.

Annotation Progress

Of the 5052 protein-coding genes, non-root-node annotations are assigned for:

* GO:0003674 - molecular_function 3682/5052
* GO:0005575 - cellular_component 4894/5052
* GO:0008150 - biological_process 4316/5052
* Only 89 proteins have ND annotations for all 3 aspects
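
For readers who prefer percentages, the counts above can be converted with a few lines of Python. This is only an illustrative sketch: the numbers are copied from the list, while the variable and function names are invented here and are not part of any PomBase tooling.

```python
# Counts copied from the list above; names and layout are illustrative only.
TOTAL_PROTEIN_CODING = 5052
annotated = {
    "molecular_function": 3682,
    "cellular_component": 4894,
    "biological_process": 4316,
}


def coverage_percent(count, total=TOTAL_PROTEIN_CODING):
    """Return the share of genes with a non-root annotation, as a percentage."""
    return 100.0 * count / total


# Round to one decimal place for reporting.
coverage = {aspect: round(coverage_percent(n), 1) for aspect, n in annotated.items()}
for aspect, pct in coverage.items():
    print(f"{aspect}: {pct}%")
```

On these figures, roughly 73% of protein-coding genes have a molecular function annotation, 97% a cellular component annotation and 85% a biological process annotation.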

PomBase currently provides 39233 annotations in total. PomBase filters the GAF to remove redundant IEA annotations: if an IEA annotation is also represented by an annotation with a manual evidence code, the IEA annotation is removed, reducing the number of IEAs from *** to 5342.
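
The filtering idea can be sketched in Python as follows. This is a minimal illustration, not the PomBase pipeline: it assumes that "represented by" means an exact match on gene product, qualifier and GO term (the real filter may also take more specific terms into account), and it treats every evidence code other than IEA as manual. The function name, gene identifiers and references in the sample are hypothetical.

```python
def filter_redundant_iea(gaf_lines):
    """Drop IEA rows whose (gene, qualifier, GO term) also has a non-IEA row.

    Assumption: exact-match redundancy only; any non-IEA code counts as manual.
    """
    # Skip blank lines and GAF header/comment lines, which start with "!".
    rows = [
        line.rstrip("\n").split("\t")
        for line in gaf_lines
        if line.strip() and not line.startswith("!")
    ]
    # GAF columns (0-based): 1 = DB Object ID, 3 = Qualifier,
    # 4 = GO ID, 6 = Evidence Code.
    manual_keys = {(r[1], r[3], r[4]) for r in rows if r[6] != "IEA"}
    return [
        r for r in rows
        if r[6] != "IEA" or (r[1], r[3], r[4]) not in manual_keys
    ]


# Hypothetical sample: GENE_A has both a manual (IDA) and an IEA annotation
# to the same term; GENE_B has only an IEA annotation.
sample = [
    "!gaf-version: 2.0",
    "PomBase\tGENE_A\tgeneA\t\tGO:0005634\tREF:example1\tIDA\t\tC",
    "PomBase\tGENE_A\tgeneA\t\tGO:0005634\tREF:example2\tIEA\t\tC",
    "PomBase\tGENE_B\tgeneB\t\tGO:0005737\tREF:example2\tIEA\t\tC",
]
kept = filter_redundant_iea(sample)
for row in kept:
    print(row[1], row[4], row[6])
```

In the sample, the IEA row for GENE_A is dropped because a manual annotation to the same term exists, while the IEA row for GENE_B is kept because nothing else covers it.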

ncRNAs are fully annotated using available literature (other than rRNA, tRNA, snRNA and snoRNA, most have no functional information).

The fission yeast literature corpus comprises ~5000 curatable publications, of which 1808 have been fully curated.

Methods and strategies for annotation (please note % effort on literature curation vs. computational annotation methods)

a. Literature curation: > 95%

b. Computational annotation strategies: < 5% (for the 2911 currently unpublished gene products)

c. Priorities for annotation

At present, PomBase curators are mainly curating historical papers. All new papers are assigned to their authors for first-pass curation (currently 234 publications have been community curated).

The highest priority is to capture annotations where new information is available for an Uncharacterized gene product. These papers are identified during the literature triage process.

We are moving towards process- and gene-based annotation for efficiency. Effort is currently focussed on cell cycle, cytokinesis, chromosome segregation, DNA metabolism, chromatin modification and sporulation. We hope to be able to refine annotations for the well-studied genes so that they can be used to automatically generate LEGO diagrams.

PomBase curators are routinely creating terms via the new TermGenie interface to speed up the process of annotation.

PomBase curators are capturing more specificity for annotations by adding substrates, targets, etc. in the Annotation Extension column (aka col-16) using the protein2GO interface. ADD NUMBERS

PomBase curators are up to date with the GAF checks on Jenkins and the protein2GO error checks.

PomBase compact GO view: New Feature

Presentations and Publications

Other Highlights

  • PomBase curators participate in Annotation conference calls