PomBase December 2014

From GO Wiki

PomBase Summary, 2013/2014

Staff

Three PomBase curation staff (V. Wood, M. Harris, A. Lock) are working on GO-related tasks; none are funded by the GO grant.

Annotation Progress

Of the 5052 protein-coding genes, non-root-node annotations are assigned for:

* GO:0003674 - molecular_function 3682/5052
* GO:0005575 - cellular_component 4894/5052
* GO:0008150 - biological_process 4316/5052
* Only 89 proteins have ND annotation for all 3 aspects

PomBase currently provides 39233 annotations in total. PomBase filters redundant IEA annotations out of the GAF: if an IEA annotation is also represented by an annotation with a manual evidence code, the IEA annotation is removed. This reduces the number of IEAs from *** to 5342.
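The filtering rule above can be sketched roughly as follows. This is a minimal illustration, not PomBase's actual pipeline: it assumes that "represented by" means the same gene product and the same GO term, whereas a production filter may also take more specific (descendant) terms into account. The function name, the set of evidence codes treated as manual, and the sample identifiers are assumptions made for the example; the column positions follow the GAF format.

```python
from typing import Iterable, List

# GAF rows are tab-separated; the indexes below are 0-based.
COL_OBJECT_ID = 1  # GAF column 2: DB Object ID
COL_GO_ID = 4      # GAF column 5: GO ID
COL_EVIDENCE = 6   # GAF column 7: evidence code

# Assumption for this sketch: codes counted as "manual" evidence.
MANUAL_CODES = {"EXP", "IDA", "IPI", "IMP", "IGI", "IEP",
                "ISS", "ISO", "ISA", "ISM", "IC", "TAS", "NAS"}


def filter_redundant_iea(lines: Iterable[str]) -> List[str]:
    """Drop IEA rows whose (gene, GO term) pair also has a manual annotation."""
    rows = [line.rstrip("\n") for line in lines]

    # First pass: collect every (gene, term) pair supported by manual evidence.
    manual_pairs = set()
    for row in rows:
        if not row or row.startswith("!"):
            continue  # skip blank lines and GAF header comments
        fields = row.split("\t")
        if fields[COL_EVIDENCE] in MANUAL_CODES:
            manual_pairs.add((fields[COL_OBJECT_ID], fields[COL_GO_ID]))

    # Second pass: keep everything except IEA rows duplicated by a manual pair.
    kept = []
    for row in rows:
        if row and not row.startswith("!"):
            fields = row.split("\t")
            pair = (fields[COL_OBJECT_ID], fields[COL_GO_ID])
            if fields[COL_EVIDENCE] == "IEA" and pair in manual_pairs:
                continue
        kept.append(row)
    return kept
```

Header lines and IEA annotations with no manual counterpart pass through unchanged, so the output remains a valid GAF.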

ncRNAs are fully annotated using available literature (other than rRNA, tRNA, snRNA and snoRNA, most have no functional information).

The fission yeast literature corpus comprises ~5000 curatable publications, of which 1808 have been fully curated.

2315 annotations have annotation extensions

Methods and strategies for annotation

a. Literature curation: > 95%

b. Computational annotation strategies: < 5% (for the 2911 currently unpublished gene products)

c. Priorities for annotation

At present, PomBase curators are mainly curating historical papers. All new papers are assigned to the authors for first-pass curation (currently 234 publications have been community curated).

The highest priority is to capture annotations where new information is available for an Uncharacterized gene product. These papers are identified during the literature triage process.

We are moving towards process- and gene-based annotation for efficiency. Effort is currently focussed on cell cycle, cytokinesis, chromosome segregation, DNA metabolism, chromatin modification and sporulation. We hope to be able to refine annotations for the well-studied genes so that they can be used to automatically generate LEGO diagrams.

PomBase curators are routinely creating terms via the new TermGenie interface to speed up the process of annotation.

PomBase curators are capturing more specificity for annotations by adding substrates, targets, etc. in the Annotation Extension column (aka col-16) using the protein2GO interface. ADD NUMBERS

PomBase curators are up to date with the GAF checks on Jenkins and the protein2GO error checks.

Presentations and Publications

a. Papers with substantial GO content


McDowall MD, Harris MA, Lock A, Rutherford K, Staines DM, Bähler J, Kersey PJ, Oliver SG, Wood V. PomBase 2015: updates to the fission yeast database. Nucleic Acids Res. 2014 Oct 31

Dikicioglu D, Wood V, Rutherford KM, McDowall MD, Oliver SG. Improving functional annotation for industrial microbes: a case study with Pichia pastoris. Trends Biotechnol. 2014 Aug;32(8):396-9.

Huntley RP, Harris MA, Alam-Faruque Y, Blake JA, Carbon S, Dietze H, Dimmer EC, Foulger RE, Hill DP, Khodiyar VK, Lock A, Lomax J, Lovering RC, Mutowo-Meullenet P, Sawford T, Van Auken K, Wood V, Mungall CJ. A method for increasing expressivity of Gene Ontology annotations using a compositional approach. BMC Bioinformatics. 2014 May 21;15(1):155.

Rutherford KM, Harris MA, Lock A, Oliver SG, Wood V. Canto: An online tool for community literature curation. Bioinformatics. 2014

Hayles J, Wood V, Jeffery L, Hoe K-L, Kim D-U, Park H-O, Salas-Pino S, Heichinger C, Nurse P. A genome-wide resource of cell cycle and cell shape genes of fission yeast. Open Biol 2013 3: 130053. (GO used for analysis)


b. Presentations including Talks and Tutorials and Teaching


April 2013

Workshop: Title: Gene Ontology Annotation Workshop University College London (Antonia Lock, for Ruth Lovering)


Aug 2013

Talk: Title: Fission Yeast as a Model for Systems Biology (Covering GO, QC using annotation intersections, annotation extensions, and cell cycle ontology content meeting) University of Trondheim, Norway (Valerie Wood)


Oct 2013

Lecture: Title: Biological Databases: Data Types and Connections Computational Biology MPhil students, University of Cambridge (Midori Harris)


Nov 2013

Curation Workshop: Title: Genome Annotation Cambridge University Part III Biochemistry (Valerie Wood, Midori Harris, Antonia Lock)

Lecture: Title: Databases and Genome Annotation (Semantic Systems Biology) Cambridge University Part II Systems Biology (Valerie Wood)


Dec 2013

Lecture and Practical: Title: An introduction to Data analysis and curation using PomBase Wellcome Trust Fission Yeast Workshop, Hinxton, Cambridge (Valerie Wood)

Talk/ Demo: Title: Evaluating the functional annotation status of K. pastoris and implementing Canto for community curation BIOLEDGE Consortium Meeting, Wolfson College, Cambridge (Valerie Wood)


April 2014

Demonstration/ presentation: Title: Canto: A tool for ontology based curation by expert curators and the scientific community Community curation workshop, Biocurator 2014, Toronto


May 2014

Workshop: Title: Gene Ontology Annotation Workshop University College London (Antonia Lock, for Ruth Lovering)


July 2014

Lecture and Workshop: Title: Using PomBase: Exploring Fission Yeast Data Online EMBO course “Molecular genetics with fission yeast”, Institut Pasteur, Paris (Midori Harris)


Oct 2014

Talk/Demo: Title: Canto Curation Tool WormBase Scientific Advisory Board, Hinxton Hall, Cambridge (Valerie Wood)


Nov 2014

Curation Workshop: Title: Genome Annotation Cambridge University Part III Biochemistry (Valerie Wood, Midori Harris, Antonia Lock)

Lecture: Title: Databases and Genome Annotation (Semantic Systems Biology) Cambridge University Part II Systems Biology (Valerie Wood)


c. Poster presentations


April 2013

Biocurator 2013, Cambridge, UK

  • i) Using biological process co-annotation for ontology and annotation quality control
  • ii) A configurable tool for community literature curation

June 2013

EMBO Conference on Fission Yeast: Pombe 2013, UCL, London, UK

  • i) A configurable tool for community literature curation
  • ii) PomBase.org
  • iii) Using Gene Ontology biological process co-annotation for ontology and annotation quality control

April 2014

British Yeast Group, Exeter, U.K. 2014

  • i) PomBase, the scientific resource for fission yeast

Biocurator 2014, Toronto, Canada

  • i) Using Gene Ontology biological process co-annotation for ontology and annotation quality control
  • ii) PomBase: The Model Organism Database for fission yeast


Ontology Development Contributions

Cell Cycle Overhaul

Co-organized the Cell Cycle ontology meeting at Hinxton (Feb 2013). This overhaul generated 181 SourceForge tickets, of which 30 are still open. Continuing effort:

* i) Developing curation guidelines for cell cycle annotation
* ii) Reannotation to more specific terms based on the revised ontology and annotation guidelines
* iii) Writing up the cell cycle report for publication

General Ontology Development

Most new terms required by PomBase are now requested via TermGenie templates or TermGenie free form. Many other types of ontology issues (missing parents, true path violations, term merge requests, etc.) are reported on the GO SourceForge tracker (~200 tickets in 2013–2014, 115 still open).

Outreach

Work with GO users

First opponent for the Ph.D. thesis “Laying the Foundations for Gastrin Systems Biology - Conceptual models and knowledge resources to enhance research on gastrin mediated intracellular signaling and gene regulation” by Sushil Tripathi, University of Trondheim, Norway.

Work with PomBase community

PomBase curators work continually with the fission yeast scientific community during literature curation. Most community curation submissions result in a dialogue with the paper author/community curator. Often the PomBase curator suggests additional terms which the author approves or rebuts. This exchange results in comprehensive and accurate curation as each paper is effectively co-curated by a curation expert with ontology knowledge and a biological expert.

Other Highlights

Annotation Quality control (Annotation intersections)

PomBase has developed many in-house QC procedures for GO annotation. We are slowly documenting these and making them more widely available. QC using BP co-annotations has been presented in posters and on GO annotation calls. The first rules have been generated for inclusion in the GO Jenkins checks, and have been used to correct annotation errors in UniProt and most MODs. These “annotation intersection” checks have also resulted in the correction of many incorrect InterPro and UniProt (IEA) mappings.
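To make the idea of an "annotation intersection" check concrete, here is a minimal sketch. It assumes a rule is simply a pair of biological process terms that are not expected to share gene products, and it flags any gene annotated to both. The function name, the data layout and the placeholder gene and term labels are hypothetical; the real PomBase and GO Jenkins rules may be expressed quite differently and would normally also consider annotations inherited through the ontology.

```python
from typing import Dict, Iterable, List, Set, Tuple


def find_intersections(
    annotations: Dict[str, Set[str]],
    rules: Iterable[Tuple[str, str]],
) -> List[Tuple[str, str, str]]:
    """Return (gene, term_a, term_b) for every gene annotated to both terms of a rule.

    annotations maps a gene identifier to the set of process terms it is
    annotated to (direct annotations only in this sketch).
    """
    hits = []
    for term_a, term_b in rules:
        for gene in sorted(annotations):
            terms = annotations[gene]
            if term_a in terms and term_b in terms:
                # Co-annotation that the rule says should not occur:
                # a candidate for curator review, not an automatic error.
                hits.append((gene, term_a, term_b))
    return hits


# Placeholder data: gene and term names are invented for illustration.
example_annotations = {
    "geneA": {"process_1", "process_2"},
    "geneB": {"process_1"},
    "geneC": {"process_2", "process_3"},
}
example_rules = [("process_1", "process_2")]
flagged = find_intersections(example_annotations, example_rules)
```

Each flagged gene is only a prompt for review: the co-annotation may reflect an annotation error, an incorrect mapping, an ontology problem, or genuine biology that means the rule needs refining.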

PomBase Display of annotation extensions, and new "Compact GO View" on gene pages: New Feature

PomBase was the first MOD to display annotation extensions on gene pages, and has recently introduced a new "Compact View" for GO. Further enhancements are in progress to remove redundancy from the summary view.

Examples:

* http://www.pombase.org/spombe/result/SPBC11B10.09#go-molecular_function
* http://stage.pombase.org/spombe/result/SPAC24B11.06c#go-molecular_function
* http://stage.pombase.org/spombe/result/SPAC24B11.06c#go-biological_process

Using GO data to build interaction networks

PomBase is using GO data (physical interactions, complexes and molecular function “has_direct_input” annotation extensions) to generate interaction networks on the PomBase website using esyN (http://www.esyn.org/).

These can be viewed for GO slim processes here:

http://www.pombase.org/browse-curation/fission-yeast-go-slim-terms
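As a rough illustration of how “has_direct_input” annotation extensions could be turned into network edges, the sketch below reads GAF rows and emits (gene, target) pairs. It covers only the annotation extension column and ignores the physical interaction and complex data mentioned above. It does not produce esyN's input format, and the function name and sample identifiers are assumptions for the example rather than part of the PomBase pipeline.

```python
import re
from typing import Iterable, List, Tuple

COL_OBJECT_ID = 1   # GAF column 2: DB Object ID (0-based index)
COL_EXTENSION = 15  # GAF column 16: annotation extension (0-based index)

# Matches relation(target) expressions such as has_direct_input(PomBase:SPBC2.02).
_DIRECT_INPUT = re.compile(r"has_direct_input\(([^)]+)\)")


def direct_input_edges(gaf_lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Collect unique (annotated gene, direct input target) pairs from GAF rows."""
    edges = set()
    for line in gaf_lines:
        if not line.strip() or line.startswith("!"):
            continue  # skip blank lines and header comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= COL_EXTENSION:
            continue  # row has no annotation extension column
        # Extensions are joined with "," (and) or "|" (or); the regular
        # expression picks out every has_direct_input target regardless.
        for target in _DIRECT_INPUT.findall(fields[COL_EXTENSION]):
            edges.add((fields[COL_OBJECT_ID], target))
    return sorted(edges)
```

The resulting edge list can then be converted to whatever format the chosen network viewer expects.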