PomBase December 2014: Difference between revisions

From GO Wiki
(18 intermediate revisions by 2 users not shown)
Line 1: Line 1:
[[Category:Reports]]
= PomBase Summary, 2013/2014 =
= PomBase Summary, 2013/2014 =


Line 14: Line 15:
  * Only 89 proteins have ND annotation for all 3 aspects
  * Only 89 proteins have ND annotation for all 3 aspects


PomBase  currently provide  39233 annotation in total. PomBase uses filtering to remove redundant IEA annotation from the GAF. If an IEA annotation is represented by an annotation with a manual evidence code,  the IEA annotation is filtered reducing the number of IEAs from *** to 5342.
PomBase  currently provide  39233 annotation in total. PomBase uses filtering to remove redundant IEA annotation from the GAF. If an IEA annotation is represented by an annotation with a manual evidence code,  the IEA annotation is filtered reducing the number of IEAs from 38647 to 5342.


ncRNAs are fully annotated using available literature (other than rRNA, tRNA, snRNA and snoRNA, most have no functional information).  
ncRNAs are fully annotated using available literature (other than rRNA, tRNA, snRNA and snoRNA, most have no functional information).  


The fission yeast literature corpus comprises ~5000 curatable publications, of which 1808 have been fully curated.
The fission yeast literature corpus comprises ~5000 curatable publications, of which 1808 have been fully curated.
2315 annotations have annotation extensions


==Methods and strategies for annotation ==
==Methods and strategies for annotation ==
Line 28: Line 31:
c. Priorities for annotation
c. Priorities for annotation


At present, PomBase curators are mainly curating historical papers. All new papers are assigned to the authors for first pass curation (currently 234 publications have been community curated).
At present, PomBase curators are mainly curating historical papers. All new papers are assigned to the authors for first pass curation.


The highest priority is to capture annotations where new information is available for an Uncharacterized gene product.  These papers are identified during the literature triage process.
The highest priority is to capture annotations where new information is available for an Uncharacterized gene product.  These papers are identified during the literature triage process.
Line 43: Line 46:


a.  Papers with substantial GO content
a.  Papers with substantial GO content


McDowall MD, Harris MA, Lock A, Rutherford K, Staines DM, Bähler J, Kersey PJ, Oliver SG, Wood V. PomBase 2015: updates to the fission yeast database. Nucleic Acids Res. 2014 Oct 31
McDowall MD, Harris MA, Lock A, Rutherford K, Staines DM, Bähler J, Kersey PJ, Oliver SG, Wood V. PomBase 2015: updates to the fission yeast database. Nucleic Acids Res. 2014 Oct 31
Line 57: Line 61:
b.  Presentations including Talks and Tutorials and Teaching
b.  Presentations including Talks and Tutorials and Teaching


2013 


April


April 2013
Workshop: Title: Gene Ontology Annotation Workshop
Workshop: Title: Gene Ontology Annotation Workshop
University College London (Antonia Lock, for Ruth Lovering)
University College London (Antonia Lock, for Ruth Lovering)


Aug
 
Aug 2013
 
 
Talk: Title:  Fission Yeast as a Model for Systems Biology  (Covering GO, QC using annotation intersections, annotation extensions, and cell cycle ontology content meeting)
Talk: Title:  Fission Yeast as a Model for Systems Biology  (Covering GO, QC using annotation intersections, annotation extensions, and cell cycle ontology content meeting)
University of Trondheim, Norway  (Valerie Wood)
University of Trondheim, Norway  (Valerie Wood)


Oct
 
Oct 2013


Lecture: Title: Biological Databases: Data Types and Connections
Lecture: Title: Biological Databases: Data Types and Connections
Computational Biology MPhil students, University of Cambridge (Midori Harris)
Computational Biology MPhil students, University of Cambridge (Midori Harris)


Nov
 
Nov 2013


Curation Workshop: Title: Genome Annotation
Curation Workshop: Title: Genome Annotation
Line 82: Line 89:
Cambridge University Part II Systems Biology  (Valerie Wood)
Cambridge University Part II Systems Biology  (Valerie Wood)


Dec
 
Dec 2013


Lecture and Practical: Title: An introduction to Data analysis and curation using PomBase
Lecture and Practical: Title: An introduction to Data analysis and curation using PomBase
Line 91: Line 99:




2014


April
April 2014


Demonstration/ presentation: Title: Canto: A tool for ontology based curation by expert curators and the scientific community
Demonstration/ presentation: Title: Canto: A tool for ontology based curation by expert curators and the scientific community
Community curation workshop, Biocurator 2014, Toronto
Community curation workshop, Biocurator 2014, Toronto
July
 
July 2014


Lecture and Workshop: Title: Using PomBase: Exploring Fission Yeast Data Online
Lecture and Workshop: Title: Using PomBase: Exploring Fission Yeast Data Online
EMBO course “Molecular genetics with fission yeast”, Institut Pasteur, Paris (Midori Harris)
EMBO course “Molecular genetics with fission yeast”, Institut Pasteur, Paris (Midori Harris)


May
 
May 2014


Workshop: Title: Gene Ontology Annotation Workshop  
Workshop: Title: Gene Ontology Annotation Workshop  
University College London (Antonia Lock, for Ruth Lovering)
University College London (Antonia Lock, for Ruth Lovering)
Oct
 
Oct 2014


Talk/Demo: Title: Canto Curation Tool
Talk/Demo: Title: Canto Curation Tool
WormBase Scientific Advisory Board, Hinxton Hall, Cambridge  (Valerie Wood)
WormBase Scientific Advisory Board, Hinxton Hall, Cambridge  (Valerie Wood)


Nov
 
Nov 2014


Curation Workshop: Title: Genome Annotation
Curation Workshop: Title: Genome Annotation
Line 124: Line 135:
c. Poster presentations
c. Poster presentations


2013


April
April 2013


Biocurator 2013, Cambridge,  UK
Biocurator 2013, Cambridge,  UK
  * i) Using biological process co-annotation for ontology and annotation quality control
* i) Using biological process co-annotation for ontology and annotation quality control
  * ii) A configurable tool for community literature curation
* ii) A configurable tool for community literature curation
 
June  2013


June EMBO Conference on Fission Yeast: Pombe 2013, UCL, London, UK
EMBO Conference on Fission Yeast: Pombe 2013, UCL, London, UK
  * i) A configurable tool for community literature curation  
* i) A configurable tool for community literature curation  
  * ii) PomBase.org
* ii) PomBase.org
  * iii) Using Gene Ontology biological process co-annotation for ontology and annotation quality control
* iii) Using Gene Ontology biological process co-annotation for ontology and annotation quality control


2014
April 2014


April British Yeast Group, Exeter, U.K. 2014
British Yeast Group, Exeter, U.K. 2014
  * i) PomBase, the scientific resource for fission yeast
* i) PomBase, the scientific resource for fission yeast


April Biocurator 2014, Toronto, Canada
Biocurator 2014, Toronto, Canada
  * i) Using Gene Ontology biological process co-annotation for ontology and annotation quality control
* i) Using Gene Ontology biological process co-annotation for ontology and annotation quality control
  * ii) PomBase: The Model Organism Database for fission yeast
* ii) PomBase: The Model Organism Database for fission yeast


==Other Highlights==


===Ontology Development Contributions===
==Ontology Development Contributions==


====Cell Cycle Overhaul====
===Cell Cycle Overhaul===


Co-organized the Cell Cycle ontology meeting at Hinxton (Feb 2013). This overhaul generated 181 Source Forge tickets, of which 30 are still open. Continuing effort:
Co-organized the Cell Cycle ontology meeting at Hinxton (Feb 2013). This overhaul generated 181 Source Forge tickets, of which 30 are still open. Continuing effort:
Line 157: Line 169:
  * iii) Writing up Cell cycle report for publication
  * iii) Writing up Cell cycle report for publication


==== General Ontology Development ====
=== General Ontology Development ===


Most new terms required by PomBase are now requested via TermGenie templates, or TermGenie free form. Many other types of ontology issues (missing parent, true path violations, term merge requests, etc.) are reported on the GO SourceForge tracker (~200 tickets in 2013–2014,  115 still open)
Most new terms required by PomBase are now requested via TermGenie templates, or TermGenie free form. Many other types of ontology issues (missing parent, true path violations, term merge requests, etc.) are reported on the GO SourceForge tracker (~200 tickets in 2013–2014,  115 still open)
== Outreach ==
=== Work with GO users ===
First opponent for Ph.D. Thesis “Laying the Foundations for Gastrin Systems Biology -Conceptual models and knowledge resources to enhance research on gastrin mediated intracellular signaling and gene regulation”. Sushil Tripathi University of Trondheim, Norway
=== Work with PomBase community ===
PomBase curators work continually with the fission yeast scientific community during literature curation. Most community curation submissions result in a dialogue with the paper author/community curator. Often the PomBase curator suggests additional terms which the  author approves or rebuts. This exchange results in comprehensive and accurate curation as each paper is effectively co-curated by a curation expert with ontology knowledge and a biological expert.
== Other Highlights==
=== Fission Yeast community curation officially  launched ===
The fission yeast  community curation initiative  was officially launched at pombe 2013 conference. Currently 234 publications have been community curated.


=== Annotation Quality control (Annotation intersections) ===
=== Annotation Quality control (Annotation intersections) ===
Line 165: Line 193:
PomBase has developed many in-house QC procedures for GO annotation. We are slowly documenting these and making them more widely  
PomBase has developed many in-house QC procedures for GO annotation. We are slowly documenting these and making them more widely  
available. QC using BP co-annotations have been presented in posters, and on GO annotation calls. The first rules have been generated for inclusion in GO Jenkins checks, and have been used to correct annotation errors in UniProt and most MODs. These “annotation intersection” checks have also resulted in the correction of many incorrect mappings from InterPro and Uniprot (IEA) reported here
available. QC using BP co-annotations have been presented in posters, and on GO annotation calls. The first rules have been generated for inclusion in GO Jenkins checks, and have been used to correct annotation errors in UniProt and most MODs. These “annotation intersection” checks have also resulted in the correction of many incorrect mappings from InterPro and Uniprot (IEA) reported here
https://sourceforge.net/p/geneontology/annotation-issues/milestone/InterPro/
* https://sourceforge.net/p/geneontology/annotation-issues/milestone/InterPro/
https://sourceforge.net/p/geneontology/annotation-issues/milestone/GOA/
* https://sourceforge.net/p/geneontology/annotation-issues/milestone/GOA/
 
=== PomBase Display of annotation extensions, and new "Compact GO View" on gene pages: New Feature ===
 
PomBase was the first MOD to display annotation extensions on gene pages, and has recently introduced a new "Compact View" for GO. Further enhancements  are in progress to remove redundancy from the summary view
 
Examples:
* http://www.pombase.org/spombe/result/SPBC11B10.09#go-molecular_function
* http://stage.pombase.org/spombe/result/SPAC24B11.06c#go-molecular_function
* http://stage.pombase.org/spombe/result/SPAC24B11.06c#go-biological_process


=== PomBase Display of annotation extensions : New Feature ===
=== Using GO data to build interaction networks ===


=== PomBase compact GO view : New Feature ===
PomBase are using GO data (physical interactions, complexes and molecular function “has_direct_input” annotation extensions) to generate interaction networks on the PomBase website using esyN (http://www.esyn.org/).


These can be viewed for GO slim processes here:


* PomBase curators participate in Annotation conference calls
http://www.pombase.org/browse-curation/fission-yeast-go-slim-terms

Latest revision as of 15:38, 12 December 2014

PomBase Summary, 2013/2014

Staff

Three PomBase curation staff (V. Wood, M. Harris, A. Lock) are working on GO-related tasks, none funded by the GO grant.

Annotation Progress

Of the 5052 protein-coding genes, non-root node annotations are assigned for:

* GO:0003674 - molecular_function 3682/5052
* GO:0005575 - cellular_component 4894/5052
* GO:0008150 - biological_process 4316/5052
* Only 89 proteins have ND annotation for all 3 aspects

PomBase currently provides 39233 annotations in total. PomBase uses filtering to remove redundant IEA annotations from the GAF. If an IEA annotation is represented by an annotation with a manual evidence code, the IEA annotation is filtered out, reducing the number of IEAs from 38647 to 5342.
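The filtering rule described above can be sketched roughly as follows. This is a minimal illustration, not PomBase's actual pipeline: the `Annotation` record, the `filter_redundant_iea` function and the optional `ancestors` map are hypothetical names, and the sketch assumes that every non-IEA evidence code counts as manual and that "represented by" means the same gene annotated to the same term (or, when `ancestors` is supplied, to a more specific term).

```python
from typing import NamedTuple


class Annotation(NamedTuple):
    gene: str      # systematic gene ID
    go_id: str     # GO term ID
    evidence: str  # evidence code, e.g. "IEA" or "IDA"


def filter_redundant_iea(annotations, ancestors=None):
    """Drop IEA annotations already covered by a manual annotation.

    `ancestors` (hypothetical input) maps a GO ID to the set of its
    ancestor IDs; when given, a manual annotation to a specific term
    also covers IEA annotations to that term's ancestors.
    """
    ancestors = ancestors or {}
    covered = set()
    for ann in annotations:
        # Assumption: any evidence code other than IEA counts as manual.
        if ann.evidence != "IEA":
            covered.add((ann.gene, ann.go_id))
            for parent in ancestors.get(ann.go_id, ()):
                covered.add((ann.gene, parent))
    # Keep manual annotations, plus IEA annotations that nothing covers.
    return [
        ann for ann in annotations
        if ann.evidence != "IEA" or (ann.gene, ann.go_id) not in covered
    ]
```

Calling the function without `ancestors` removes only exact duplicates; passing an ancestor map additionally removes IEA annotations to less specific terms.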

ncRNAs are fully annotated using available literature (other than rRNA, tRNA, snRNA and snoRNA, most have no functional information).

The fission yeast literature corpus comprises ~5000 curatable publications, of which 1808 have been fully curated.

2315 annotations have annotation extensions.

Methods and strategies for annotation

a. Literature curation: > 95%

b. Computational annotation strategies: < 5% (for the 2911 currently unpublished gene products)

c. Priorities for annotation

At present, PomBase curators are mainly curating historical papers. All new papers are assigned to the authors for first pass curation.

The highest priority is to capture annotations where new information is available for an Uncharacterized gene product. These papers are identified during the literature triage process.

We are moving towards process- and gene-based annotation for efficiency. Effort is currently focussed on cell cycle, cytokinesis, chromosome segregation, DNA metabolism, chromatin modification and sporulation. We hope to be able to refine annotations for the well-studied genes so that they can be used to automatically generate LEGO diagrams.

PomBase curators are routinely creating terms via the new TermGenie interface to speed up the process of annotation.

PomBase curators are capturing more specificity for annotations by adding substrates, targets, etc. in the Annotation Extension column (aka col-16) using the protein2GO interface.

PomBase curators are up to date with the GAF checks on Jenkins and the protein2GO error checks.

Presentations and Publications

a. Papers with substantial GO content


McDowall MD, Harris MA, Lock A, Rutherford K, Staines DM, Bähler J, Kersey PJ, Oliver SG, Wood V. PomBase 2015: updates to the fission yeast database. Nucleic Acids Res. 2014 Oct 31

Dikicioglu D, Wood V, Rutherford KM, McDowall MD, Oliver SG. Improving functional annotation for industrial microbes: a case study with Pichia pastoris. Trends Biotechnol. 2014 Aug;32(8):396-9.

Huntley RP, Harris MA, Alam-Faruque Y, Blake JA, Carbon S, Dietze H, Dimmer EC, Foulger RE, Hill DP, Khodiyar VK, Lock A, Lomax J, Lovering RC, Mutowo-Meullenet P, Sawford T, Van Auken K, Wood V, Mungall CJ. A method for increasing expressivity of Gene Ontology annotations using a compositional approach. BMC Bioinformatics. 2014 May 21;15(1):155.

Rutherford KM, Harris MA, Lock A, Oliver SG, Wood V. Canto: An online tool for community literature curation. Bioinformatics. 2014

Hayles J, Wood V, Jeffery L, Hoe K-L, Kim D-U, Park H-O, Salas-Pino S, Heichinger C, Nurse P. A genome-wide resource of cell cycle and cell shape genes of fission yeast. Open Biol 2013 3: 130053. (GO used for analysis)


b. Presentations including Talks and Tutorials and Teaching


April 2013

Workshop: Title: Gene Ontology Annotation Workshop. University College London (Antonia Lock, for Ruth Lovering)


Aug 2013

Talk: Title: Fission Yeast as a Model for Systems Biology (Covering GO, QC using annotation intersections, annotation extensions, and cell cycle ontology content meeting). University of Trondheim, Norway (Valerie Wood)


Oct 2013

Lecture: Title: Biological Databases: Data Types and Connections. Computational Biology MPhil students, University of Cambridge (Midori Harris)


Nov 2013

Curation Workshop: Title: Genome Annotation. Cambridge University Part III Biochemistry (Valerie Wood, Midori Harris, Antonia Lock)

Lecture: Title: Databases and Genome Annotation (Semantic Systems Biology). Cambridge University Part II Systems Biology (Valerie Wood)


Dec 2013

Lecture and Practical: Title: An introduction to Data analysis and curation using PomBase. Wellcome Trust Fission Yeast Workshop, Hinxton, Cambridge (Valerie Wood)

Talk/Demo: Title: Evaluating the functional annotation status of K. pastoris and implementing Canto for community curation. BIOLEDGE Consortium Meeting, Wolfson College, Cambridge (Valerie Wood)


April 2014

Demonstration/presentation: Title: Canto: A tool for ontology based curation by expert curators and the scientific community. Community curation workshop, Biocurator 2014, Toronto


May 2014

Workshop: Title: Gene Ontology Annotation Workshop. University College London (Antonia Lock, for Ruth Lovering)


July 2014

Lecture and Workshop: Title: Using PomBase: Exploring Fission Yeast Data Online. EMBO course “Molecular genetics with fission yeast”, Institut Pasteur, Paris (Midori Harris)


Oct 2014

Talk/Demo: Title: Canto Curation Tool. WormBase Scientific Advisory Board, Hinxton Hall, Cambridge (Valerie Wood)


Nov 2014

Curation Workshop: Title: Genome Annotation. Cambridge University Part III Biochemistry (Valerie Wood, Midori Harris, Antonia Lock)

Lecture: Title: Databases and Genome Annotation (Semantic Systems Biology). Cambridge University Part II Systems Biology (Valerie Wood)


c. Poster presentations


April 2013

Biocurator 2013, Cambridge, UK

  • i) Using biological process co-annotation for ontology and annotation quality control
  • ii) A configurable tool for community literature curation

June 2013

EMBO Conference on Fission Yeast: Pombe 2013, UCL, London, UK

  • i) A configurable tool for community literature curation
  • ii) PomBase.org
  • iii) Using Gene Ontology biological process co-annotation for ontology and annotation quality control

April 2014

British Yeast Group, Exeter, U.K. 2014

  • i) PomBase, the scientific resource for fission yeast

Biocurator 2014, Toronto, Canada

  • i) Using Gene Ontology biological process co-annotation for ontology and annotation quality control
  • ii) PomBase: The Model Organism Database for fission yeast


Ontology Development Contributions

Cell Cycle Overhaul

Co-organized the Cell Cycle ontology meeting at Hinxton (Feb 2013). This overhaul generated 181 SourceForge tickets, of which 30 are still open. Continuing effort:

* i)	Developing curation guidelines for cell cycle annotation
* ii)	Reannotation to more specific terms based on revised ontology and annotation guidelines
* iii)	Writing up Cell cycle report for publication

General Ontology Development

Most new terms required by PomBase are now requested via TermGenie templates or TermGenie free form. Many other types of ontology issues (missing parent, true path violations, term merge requests, etc.) are reported on the GO SourceForge tracker (~200 tickets in 2013–2014, 115 still open).

Outreach

Work with GO users

First opponent for the Ph.D. thesis “Laying the Foundations for Gastrin Systems Biology -Conceptual models and knowledge resources to enhance research on gastrin mediated intracellular signaling and gene regulation” (Sushil Tripathi, University of Trondheim, Norway).

Work with PomBase community

PomBase curators work continually with the fission yeast scientific community during literature curation. Most community curation submissions result in a dialogue with the paper author/community curator. Often the PomBase curator suggests additional terms which the author approves or rebuts. This exchange results in comprehensive and accurate curation as each paper is effectively co-curated by a curation expert with ontology knowledge and a biological expert.

Other Highlights

Fission Yeast community curation officially launched

The fission yeast community curation initiative was officially launched at the Pombe 2013 conference. Currently 234 publications have been community curated.

Annotation Quality control (Annotation intersections)

PomBase has developed many in-house QC procedures for GO annotation. We are slowly documenting these and making them more widely available. QC using BP co-annotations has been presented in posters and on GO annotation calls. The first rules have been generated for inclusion in GO Jenkins checks, and have been used to correct annotation errors in UniProt and most MODs. These “annotation intersection” checks have also resulted in the correction of many incorrect mappings from InterPro and UniProt (IEA), reported here:

* https://sourceforge.net/p/geneontology/annotation-issues/milestone/InterPro/
* https://sourceforge.net/p/geneontology/annotation-issues/milestone/GOA/
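As a rough illustration of the idea behind an annotation intersection check, the sketch below flags genes that carry two biological process terms which a rule says should not be co-annotated. It is only a sketch under stated assumptions: the function name, the input shapes and the rule pairs are hypothetical, and it compares terms directly rather than propagating annotations up the ontology as a real check would need to.

```python
from collections import defaultdict
from itertools import combinations


def find_suspect_intersections(annotations, forbidden_pairs):
    """Return {gene: [(term_a, term_b), ...]} for rule-breaking co-annotations.

    annotations: iterable of (gene, process_term) pairs.
    forbidden_pairs: iterable of 2-tuples of terms that are not expected
    to be annotated to the same gene (hypothetical rules).
    """
    rules = {frozenset(pair) for pair in forbidden_pairs}

    # Group the annotated terms by gene.
    terms_by_gene = defaultdict(set)
    for gene, term in annotations:
        terms_by_gene[gene].add(term)

    flagged = {}
    for gene, terms in terms_by_gene.items():
        # Check every unordered pair of terms on this gene against the rules.
        hits = [
            (a, b)
            for a, b in combinations(sorted(terms), 2)
            if frozenset((a, b)) in rules
        ]
        if hits:
            flagged[gene] = hits
    return flagged
```

Each flagged gene is a candidate for manual review: the intersection may reveal an annotation error, an incorrect mapping, or a rule that needs an exception.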

PomBase Display of annotation extensions, and new "Compact GO View" on gene pages: New Feature

PomBase was the first MOD to display annotation extensions on gene pages, and has recently introduced a new "Compact View" for GO. Further enhancements are in progress to remove redundancy from the summary view.

Examples:

* http://www.pombase.org/spombe/result/SPBC11B10.09#go-molecular_function
* http://stage.pombase.org/spombe/result/SPAC24B11.06c#go-molecular_function
* http://stage.pombase.org/spombe/result/SPAC24B11.06c#go-biological_process

Using GO data to build interaction networks

PomBase is using GO data (physical interactions, complexes and molecular function “has_direct_input” annotation extensions) to generate interaction networks on the PomBase website using esyN (http://www.esyn.org/).

These can be viewed for GO slim processes here:

http://www.pombase.org/browse-curation/fission-yeast-go-slim-terms
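To make the network-building step more concrete, the following sketch turns the three GO-derived data types mentioned above into a list of undirected gene-gene edges. The record layout and the `build_edges` function are hypothetical and chosen only for illustration; this is not the esyN input format or the PomBase implementation, and treating every pair of complex members as connected is a simplifying assumption.

```python
from itertools import combinations


def build_edges(records):
    """Collect undirected gene-gene edges from GO-derived records.

    Each record is a dict with a hypothetical layout:
      {"kind": "physical_interaction", "a": gene, "b": gene}
      {"kind": "complex", "members": [gene, ...]}
      {"kind": "has_direct_input", "gene": gene, "target": gene}
    Records of any other kind are ignored.
    """
    edges = set()
    for rec in records:
        if rec["kind"] == "physical_interaction":
            pairs = [(rec["a"], rec["b"])]
        elif rec["kind"] == "complex":
            # Assumption: all members of a complex are linked pairwise.
            pairs = combinations(rec["members"], 2)
        elif rec["kind"] == "has_direct_input":
            pairs = [(rec["gene"], rec["target"])]
        else:
            continue
        for a, b in pairs:
            if a != b:  # skip self-loops
                edges.add(tuple(sorted((a, b))))
    return sorted(edges)
```

Storing each edge as a sorted tuple in a set means that the same pair reported by several data types appears only once in the result.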