Annotation Advocacy progress report for 2013

From GO Wiki

Revision as of 00:51, 12 December 2013

Management

Rachael Huntley (UniProt-GOA)

Rama Balakrishnan (SGD)

Annotation Consistency

Annotation calls

We continue to discuss annotation issues on our biweekly annotation calls. Subjects range from new annotation guidelines, quality control checks, and ontology or evidence code-related discussions to tool development and updates from annotating groups.

Some of the discussions we have had this year include:

IEP evidenced annotations: If a group annotates using IEP for gene expression, then the annotation is moved to Protein2GO/UniProt and displayed as a protein accession, does this affect the interpretation of the annotation? Do users make inferences about whether the expression is protein vs. gene? After discussion, the consensus was that no inference should be made about what entity was expressed. The documentation for IEP evidence has been updated to reflect this detail. However, to help clarify the situation, we could make more granular IEP evidence codes in ECO to describe protein expression and gene expression.

Apoptosis recuration: Following the ontology development of the apoptosis branch of GO, several groups recurated their annotations to the top-level terms. The types of terms to use were discussed on several annotation calls.

Jenkins reports: Jenkins validates GAF files for both hard and soft QC checks and generates reports. Groups are reminded via email to check the reports for their GAF files and to correct any errors.

Annotation extensions: Topics included a) the suitability of transferring annotation extensions by ISS; b) subsets of relations for display in curator tools; and c) consistency exercises, held on several calls, in which we discuss specific annotations that use extensions with the aim of reaching a consensus on how to annotate each example.

Discussion of Database Identifiers Used in the With/From Field

Protein oligomerization terms: Use of these terms is inconsistent. We would like to provide guidelines on how to annotate using these terms. Options include using PRO IDs in the annotation extension field or annotating directly to a complex ID (when this facility is in place).

Protein binding: We discussed whether to move protein binding annotations to a separate file. We also discussed the possibility of automatically deepening the 'protein binding' term to a more granular one, depending on the type of protein being bound.

Predicted Annotations: We are developing a system to incorporate annotations that are predicted using a combination of methods (usually annotated with the RCA evidence code).

New Annotation Datasets

We have been providing annotation guidance to the MoonProt group, who want to annotate moonlighting (multi-function) proteins, and GeneDB, who annotate pathogen gene products.

We are currently working with GeneDB and PseudoCAP to assist them in providing an updated annotation file.

We also worked with Astrid Lægreid's group (Norway) on annotating mammalian transcription factors (PMID: 23981286).

Annotation Guidelines

New GO_REF for annotating apoptosis using IC, indicating that background knowledge from review papers was also used in the judgement.

The apoptosis curation manual (written by Paola Roncaglia and Pablo Porras Millan) is available here.

Documentation

We are currently in the process of porting, and updating, annotation guidance webpages from the current GO Consortium website to the Drupal site.

Annotation Quality Control

Automatic repair of annotations to merged or obsoleted terms: Ontology operations such as term merges and obsoletions may be out of sync with annotation releases. Each GO entry T in the GAF is checked to see if it corresponds to a valid (non-obsolete) term in the ontology. If not, metadata for other terms is checked. If the term has been merged into a term S (i.e. S has alt_id of T) then T is replaced by S in the GAF line. Optionally, if T is obsoleted and there is 1 or more replaced_by tags to S1, S2,... Sn, then the GAF line is replaced by n lines with the GO entry changed to S1..Sn.
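The repair rule described above can be sketched in a few lines of Python. This is only an illustration, not the code used in the production checks: the function name, the three lookup structures (`valid_terms`, `merged_into`, `replaced_by`) and the `expand_obsolete` flag are hypothetical, and the sketch assumes a tab-separated GAF line, without a trailing newline, with the GO ID in column 5.

```python
from typing import Dict, List, Set

GO_ID_COLUMN = 4  # GAF column 5 (0-based index 4) holds the GO ID


def repair_gaf_line(
    line: str,
    valid_terms: Set[str],              # hypothetical: IDs of non-obsolete terms
    merged_into: Dict[str, str],        # hypothetical: T -> S where S has alt_id T
    replaced_by: Dict[str, List[str]],  # hypothetical: obsolete T -> [S1, ..., Sn]
    expand_obsolete: bool = False,      # the "optional" step in the description
) -> List[str]:
    """Return the repaired line(s) for one GAF line."""
    if line.startswith("!"):
        return [line]  # comment or header line, nothing to repair
    cols = line.split("\t")
    if len(cols) <= GO_ID_COLUMN:
        return [line]  # malformed line, left for other QC checks
    term = cols[GO_ID_COLUMN]
    if term in valid_terms:
        return [line]  # GO entry is already valid
    if term in merged_into:
        # T was merged into S: replace T by S in the same line
        cols[GO_ID_COLUMN] = merged_into[term]
        return ["\t".join(cols)]
    if expand_obsolete and replaced_by.get(term):
        # T is obsolete with n replaced_by tags: emit n lines, one per Si
        repaired = []
        for successor in replaced_by[term]:
            new_cols = list(cols)
            new_cols[GO_ID_COLUMN] = successor
            repaired.append("\t".join(new_cols))
        return repaired
    return [line]  # no repair available, line is passed through unchanged
```

In this sketch a line whose term is obsolete but has no replaced_by tag is passed through unchanged, so that a separate check can report it.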

Discussion of Database Identifiers Used in the With/From Field: We reviewed the types of identifiers being used in the with/from field in all groups' annotation files. We had two aims: 1) to determine which identifier types are suitable for use in the with/from field and 2) to determine which identifier types should or should not be used with each evidence code. We identified several identifiers that are not appropriate for use because they do not indicate a specific entity or are not a stable/current ID, e.g. 'CBS:TMHMM', 'TIGR_TIGRFAMS'. We also decided which identifier types should be used with each evidence code and whether or not they are mandatory. This information will be encoded into ECO so that sanity checks can be designed.
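As a rough illustration of the kind of sanity check this review could lead to, the Python sketch below flags with/from entries that begin with one of the two identifiers named above. The function name and the `DISALLOWED` tuple are hypothetical, the sketch assumes that entries in the with/from field are separated by pipes or commas, and it does not attempt the per-evidence-code rules, which are not listed here.

```python
from typing import List

# The two identifiers named in the review as unsuitable for the with/from field.
DISALLOWED = ("CBS:TMHMM", "TIGR_TIGRFAMS")


def flag_with_from(with_from: str) -> List[str]:
    """Return the with/from entries that start with a disallowed identifier."""
    entries = [
        entry.strip()
        for group in with_from.split("|")  # assumed separator between alternatives
        for entry in group.split(",")      # assumed separator within a group
    ]
    # str.startswith accepts a tuple and matches any of its members
    return [entry for entry in entries if entry and entry.startswith(DISALLOWED)]
```

A check of this kind could be run over the with/from column of each GAF line, with any flagged entries reported back to the annotating group.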

GO Helpdesk

GO helpdesk staff

Rama Balakrishnan, Rachael Huntley, Harold Drabkin, Jane Lomax, Kimberly Van Auken, Tanya Berardini, Rebecca Foulger, Prudence Mutowo-Meullenet, Paola Roncaglia, David Osumi-Sutherland.

Change of JIRA hosting

Since July 2013, all queries to GO have been directed to the new GOJIRA system hosted at TAIR. To divide up the work of answering gohelp queries, we use a gohelp rota in which each person is 'on duty' for one week at a time. The person on duty is responsible for making sure that all queries are answered, either by answering them directly or by assigning the issue to someone else. The on-duty person is now also encouraged to 'tweet' interesting GO-related stories to keep up the public profile of GO.

Other

Plans for 2014

In the upcoming year we will:

  • Continue to promote annotation consistency through regular annotation calls and jamborees and through up-to-date guideline documentation
  • Enforce annotation guidelines by implementing further quality control checks as needed
  • Complete the re-organization of the annotation documentation provided by the GO Consortium through its website
  • Continue to assist external annotating groups who wish to provide annotation sets to the GO Consortium by providing them with the GO Consortium's annotation policies and guidelines and reviewing their annotations as necessary
  • Develop methods to track provenance of evidence and have them formulated correctly in the Evidence Code Ontology (ECO)
  • Ensure that the new Evidence Code Ontology is adopted by all annotating groups within the GO Consortium by promoting its use and providing up-to-date documentation
  • Continue to keep curators informed of updates to the ontology by providing documentation and holding conference calls