Annotation Advocacy progress report for 2014

Latest revision as of 16:19, 19 December 2014

Management

Rachael Huntley (UniProt-GOA)

Rama Balakrishnan (SGD)

Annotation Consistency

Annotation calls

We continue to discuss annotation issues on our biweekly annotation calls. Subjects range from new annotation guidelines, quality control checks, and ontology or evidence code-related discussions to tool development and updates from annotating groups.

Discussions we have had this year include:

Jenkins reports: Jenkins validates GAF files for both hard and soft QC checks and generates reports. Groups are reminded via email to check the reports for their GAF files and to correct any errors.

GO Wiki and SVN changes: The GO repository and wiki were moved to a new machine.

Annotation extension: Discussion on the correct use of relationships in col-16. Consistency exercises have been provided on several calls where we discuss specific annotations using extensions with the aim of reaching a consensus on how to annotate the example.

Discussion of the use of pipe vs comma in the With/From field for IPI and IGI evidence codes

Protein oligomerization terms: Use of these terms is inconsistent. We would like to provide guidelines on how to annotate using these terms. Options include using PRO IDs in the annotation extension field or annotating directly to a complex ID (when this facility is in place). The ontology is being overhauled in this area.

Apoptosis

E1, E2, E3 Ub ligase terms: Overhaul of these terms and discussion on how to reannotate

Enzyme Regulator activity terms

Guidelines for annotating complexes as objects

Use of ECO terms for evidence

Discussion on tracking unpublished references: Groups are requested to create GO_REFs for unpublished references instead of MOD-specific internal references. This will help in QC.

Symbiotic relationships in the ontology and how to annotate host-cell interactions.

Extracellular branch of ontology

Phase terms in Process ontology: The ontology team has introduced phase terms in the ontology to represent distinct periods when processes occur. These terms do not have an is_a relationship to any other process terms, and they all carry a do_not_annotate tag.

Other topics: We also had Sandra Orchard and David Osumi-Sutherland talk about the IntAct Complex Portal and the OWL language that is being used to model the ontologies.

New Annotation Datasets

  • Data update from the PseudoCAP group (use of ISS to transfer annotations to genes from other strains of Pseudomonas)
  • Assist the Synapse group at the Broad Institute to make annotations for synapse genes

Software Infrastructure (Berkeley)

Support of the automatic annotation Quality Control pipeline

  • Created a Continuous Integration System for monitoring, quality control, and publication of annotations (using Jenkins). The monitoring site includes statistics and metrics of data quality. We continue to support and extend the pipeline on an ongoing basis.
  • Documented patterns used for extended annotations together with members of GO annotation group.
  • Developed a specification for publishing GO annotations as linked data (OBAN) in collaboration with the Parkinson team / CCTV project at EBI.
  • Supported the display of annotation extensions in AmiGO
  • Created a system for verification of annotation extension relations
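As a rough illustration of what verifying annotation extension relations can involve, the sketch below parses a col-16 value of the form relation(ID) and reports relations that are not in an allowed set. This is not the pipeline's actual implementation: the function names and the ALLOWED_RELATIONS set are hypothetical stand-ins, and a real check would take its list of relations from the ontology.

```python
import re
from typing import List, Set, Tuple

# Hypothetical allowed set, for illustration only.
ALLOWED_RELATIONS = {"part_of", "occurs_in", "has_input"}

# One extension expression looks like relation(ID), e.g. part_of(CL:0000084).
EXTENSION_RE = re.compile(r"^([A-Za-z_]+)\(([^()]+)\)$")


def parse_extension(value: str) -> List[List[Tuple[str, str]]]:
    """Parse a col-16 value into pipe-separated groups of (relation, ID) pairs."""
    if not value:
        return []
    groups = []
    for alternative in value.split("|"):
        group = []
        for item in alternative.split(","):
            match = EXTENSION_RE.match(item.strip())
            if match is None:
                raise ValueError(f"malformed extension: {item!r}")
            group.append((match.group(1), match.group(2)))
        groups.append(group)
    return groups


def unknown_relations(value: str, allowed: Set[str]) -> List[str]:
    """Return the sorted relations in a col-16 value that are not allowed."""
    found = {rel for group in parse_extension(value) for rel, _ in group}
    return sorted(found - allowed)


print(unknown_relations("part_of(CL:0000084),occurs_in(GO:0005634)", ALLOWED_RELATIONS))
```

A malformed expression raises ValueError rather than being skipped, so that syntax problems surface separately from unknown relations.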

Documentation

  • We finished porting annotation guidance webpages to the Drupal site. We are in the process of cleaning up and updating the content.

Annotation to Complexes as Objects

We are working on guidelines to annotate and produce annotations for complexes.

Introduction of comma and pipe in With column

We will introduce comma and pipe in the With column for IPI and IGI to represent AND and OR respectively. This usage will be consistent with col-16 data.
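The AND/OR reading of the With column can be sketched as follows. This is a minimal illustration, assuming the pipe separates alternatives and the comma joins identifiers that apply together; the function name and the example identifiers are hypothetical.

```python
from typing import List


def parse_with_field(value: str) -> List[List[str]]:
    """Split a With/From value into OR-alternatives, each a list of AND-ed IDs."""
    groups = []
    for alternative in value.split("|"):  # pipe = OR
        ids = [part.strip() for part in alternative.split(",") if part.strip()]  # comma = AND
        if ids:
            groups.append(ids)
    return groups


# Hypothetical identifiers: (first AND second) OR third.
example = "UniProtKB:P00001,UniProtKB:P00002|SGD:S000000001"
print(parse_with_field(example))
```

Each inner list holds identifiers that apply together, and the outer list holds the alternatives.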

NEW QC checks

  • We now require that for each annotation, there be either a PMID or a GO_REF as reference.
  • RCA annotations that are older than 1 year should be removed.
  • Check for annotations to terms that have the do_not_annotate tag.
  • binding (GO:0005488) received the do_not_annotate tag.
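The checks listed above could be expressed roughly as in the sketch below. It is not the Jenkins pipeline's code: the dictionary keys, the function name, and the reading of "1 year" as 365 days are assumptions made for illustration. The date is parsed as YYYYMMDD and multiple references are taken to be pipe-separated, as in GAF files.

```python
from datetime import date, datetime, timedelta
from typing import Dict, List, Set


def qc_problems(annotation: Dict[str, str], today: date,
                do_not_annotate: Set[str]) -> List[str]:
    """Return a list of QC problems found for one annotation (hypothetical keys)."""
    problems = []

    # Each annotation needs at least one PMID or GO_REF reference.
    refs = [r for r in annotation["reference"].split("|") if r]
    if not any(r.startswith(("PMID:", "GO_REF:")) for r in refs):
        problems.append("no PMID or GO_REF reference")

    # RCA annotations older than one year (assumed here to mean 365 days).
    if annotation["evidence"] == "RCA":
        annotated_on = datetime.strptime(annotation["date"], "%Y%m%d").date()
        if today - annotated_on > timedelta(days=365):
            problems.append("RCA annotation older than 1 year")

    # Terms tagged do_not_annotate should not receive direct annotations.
    if annotation["go_id"] in do_not_annotate:
        problems.append("term tagged do_not_annotate")

    return problems


row = {"reference": "SGD_REF:S000000001", "evidence": "RCA",
       "date": "20120101", "go_id": "GO:0005488"}
print(qc_problems(row, date(2014, 12, 19), {"GO:0005488"}))
```

Returning a list of messages rather than a single pass/fail value lets one report line describe every problem found for an annotation.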

Plans for 2015

  • Release new GAF file format to accommodate pipe and comma in the With/From column
  • Continue to assist external annotating groups who wish to provide annotation sets to the GO Consortium by providing them with the GO Consortium's annotation policies and guidelines and reviewing their annotations as necessary
  • Continue to promote annotation consistency through regular annotation calls and jamborees and through up-to-date guideline documentation
  • Develop guidelines for annotating complexes as objects
  • Enforce annotation guidelines by implementing further quality control checks as needed
  • Ensure that the new Evidence Code Ontology is adopted by all annotating groups within the GO Consortium by promoting its use and providing up-to-date documentation
  • Continue to keep curators informed of updates to the ontology by providing documentation and holding conference calls