Annotation Advocacy progress report for 2014
Rachael Huntley (UniProt-GOA)
Rama Balakrishnan (SGD)
We continue to discuss annotation issues on our biweekly annotation calls. Subjects range from new annotation guidelines, quality control checks, and ontology or evidence code-related discussions to tool development and updates from annotating groups.
Some of the discussions we have had this year include:
Jenkins reports: Jenkins validates GAF files for both hard and soft QC checks and generates reports. Groups are reminded via email to check the reports for their GAF files and to correct any errors.
GO Wiki and SVN changes: The GO repository and wiki were moved to a new machine.
Annotation extension: Discussion on the correct use of relationships in col-16. Consistency exercises have been provided on several calls where we discuss specific annotations using extensions with the aim of reaching a consensus on how to annotate the example.
Discussion of the use of pipe vs. comma in the With/From field for IPI and IGI evidence codes
Protein oligomerization terms: Use of these terms is inconsistent. We would like to provide guidelines on how to annotate using these terms. Options include using PRO IDs in the annotation extension field or annotating directly to a complex ID (when this facility is in place). The ontology is being overhauled in this area.
E1, E2, E3 Ub ligase terms: Overhaul of these terms and discussion on how to reannotate
Enzyme Regulator activity terms
Guidelines for annotating complexes as objects
Use of ECO terms for evidence
Discussion on tracking unpublished references: Groups are requested to create GO_REFs for unpublished references instead of MOD-specific internal references. This will help in QC.
Symbiotic relationships in the ontology and how to annotate host-cell interactions.
Extracellular branch of ontology
Phase terms in the Process ontology: The ontology team has introduced phase terms to represent distinct periods when processes occur. These terms do not have an is_a relationship to any other process terms, and they all carry a do_not_annotate tag.
Other topics: We also had Sandra Orchard and David OS talk about the IntAct Complex Portal and the OWL language that is being used to model the ontologies.
New Annotation Datasets
- Data update from the PseudoCAP group (use of ISS to transfer annotations to genes from other strains of Pseudomonas)
- Assisted the Synapse group at the Broad Institute in making annotations for synapse genes
Software Infrastructure (Berkeley)
Support of the automatic annotation Quality Control pipeline
- Created a Continuous Integration System for monitoring, quality control, and publication of annotations (using Jenkins). The monitoring site includes statistics and metrics of data quality. We continue to support and extend the pipeline on an ongoing basis.
- Documented patterns used for extended annotations together with members of the GO annotation group.
- Developed a specification for publishing GO annotations as linked data (OBAN) in collaboration with the Parkinson team / CCTV project at EBI
- Supported the display of annotation extensions in AmiGO
- Created a system for verification of annotation extension relations
- We finished porting annotation guidance webpages to the Drupal site. We are in the process of cleaning up and updating the content.
Annotation to Complexes as Objects
We are working on guidelines to annotate and produce annotations for complexes.
Introduction of comma and pipe in With column
We will introduce comma and pipe in the With column for IPI and IGI to represent AND and OR respectively. This usage will be consistent with col-16 data.
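The convention described above can be sketched in a few lines of Python. This is only an illustration, not the Consortium's parser: the function name `parse_with_field` and the identifiers in the example are hypothetical. It assumes the pipe separates alternatives (OR) and the comma joins identifiers that apply together (AND), as stated in the text.

```python
def parse_with_field(value):
    """Split a With/From value into OR-groups of AND-ed identifiers.

    Returns a list of groups; identifiers within one group are joined
    by AND (comma), and the groups themselves are alternatives (pipe).
    """
    groups = []
    for alternative in value.split("|"):      # pipe separates OR alternatives
        ids = [part.strip() for part in alternative.split(",") if part.strip()]
        if ids:                               # skip empty alternatives
            groups.append(ids)
    return groups


# Hypothetical identifiers, used only to show the shape of the result.
example = "UniProtKB:P12345,UniProtKB:Q67890|SGD:S000000001"
for group in parse_with_field(example):
    print(" AND ".join(group))
```

Read this way, the example value means "both UniProtKB entries together, or the SGD entry on its own".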
NEW QC checks
- We now require that for each annotation, there be either a PMID or a GO_REF as reference.
- RCA annotations that are older than 1 year should be removed.
- Check for annotations to terms that have the do_not_annotate tag.
- binding (GO:0005488) was given the do_not_annotate tag.
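As a rough illustration of how the checks listed above could be applied to a single tab-separated GAF line, here is a minimal Python sketch. It is not the Jenkins pipeline's implementation. The function name `check_gaf_line` and the sample line are hypothetical, and the sketch assumes the GAF column layout (GO ID in column 5, reference in column 6, evidence code in column 7, date as YYYYMMDD in column 14) and treats "1 year" as 365 days.

```python
from datetime import date, datetime, timedelta

# Terms tagged do_not_annotate; only "binding" is listed here (assumption:
# a real check would read the full set from the ontology).
DO_NOT_ANNOTATE = {"GO:0005488"}


def check_gaf_line(line, today, do_not_annotate=DO_NOT_ANNOTATE):
    """Return a list of QC problems found on one GAF annotation line."""
    cols = line.rstrip("\n").split("\t")
    go_id, refs, evidence, ann_date = cols[4], cols[5], cols[6], cols[13]
    problems = []

    # Each annotation needs at least one PMID or GO_REF reference.
    if not any(r.startswith(("PMID:", "GO_REF:")) for r in refs.split("|")):
        problems.append("no PMID or GO_REF reference")

    # RCA annotations older than one year should be removed.
    if evidence == "RCA":
        annotated = datetime.strptime(ann_date, "%Y%m%d").date()
        if today - annotated > timedelta(days=365):
            problems.append("RCA annotation older than 1 year")

    # No annotations to terms carrying the do_not_annotate tag.
    if go_id in do_not_annotate:
        problems.append("annotation to do_not_annotate term")

    return problems


# Hypothetical annotation line, built only to exercise the checks.
sample = "\t".join([
    "SGD", "S000000001", "ABC1", "", "GO:0005488", "SGD_REF:S1", "RCA",
    "", "F", "", "", "gene", "taxon:559292", "20120101", "SGD",
])
for problem in check_gaf_line(sample, today=date(2014, 12, 1)):
    print(problem)
```

Passing `today` explicitly keeps the age check reproducible when the same file is re-validated later.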
Plans for 2015
- Release a new GAF file format to accommodate pipe and comma in the With/From column
- Continue to assist external annotating groups who wish to provide annotation sets to the GO Consortium by providing them with the GO Consortium's annotation policies and guidelines and reviewing their annotations as necessary
- Continue to promote annotation consistency through regular annotation calls and jamborees and through up-to-date guideline documentation
- Develop guidelines for annotating complexes as objects
- Enforce annotation guidelines by implementing further quality control checks as needed
- Ensure that the new Evidence Code Ontology is adopted by all annotating groups within the GO Consortium by promoting its use and providing up-to-date documentation
- Continue to keep curators informed of updates to the ontology by providing documentation and holding conference calls