Annotation Conf. Call 2016-05-24


Join us using BlueJeans


Fall 2016 GOC Meeting at USC

  • Reminder: the fall GOC meeting will be held at USC in Los Angeles, CA, from November 4th to 6th, 2016.
  • Please indicate on the meeting logistics wiki page whether you can attend

Annotation Consistency Exercise - mRNA expression data

General information that should be discussed and agreed upon for the guidelines

For a long time scientists have agreed that the level of an mRNA does not directly correlate with the level of a protein and this is, for the most part, reflected in the way that curators annotate experimental data. Similarly, the level of mRNA expression (which can be modulated e.g. by microRNAs) is not necessarily equivalent to the level of transcription; however, curators sometimes do not capture this distinction, especially when they take into consideration authors’ interpretation of results, i.e. the authors’ intent. With this in mind we would like to have an open discussion about how mRNA expression data is annotated.


(1) To understand how curators annotate non-high-throughput mRNA expression data

(2) To discuss how to improve consistency when annotating mRNA expression experiments

Word document with annotation exercises: File:Expression Transcription Examples 17 05 16 BC2.docx.


Example 1

Exemple 1.png

Example 2

Example 2.png

Example 3

Example 3.png

Discussion points

Things to consider when making annotations to regulation of gene expression or regulation of transcription and capturing a regulated target:

• Does the entity you are annotating bind DNA/chromatin?

• Is the entity you are annotating predicted to bind the promoter of the gene being regulated?

• The regulation of expression may not be direct, e.g. the entity you are annotating may have an effect on levels of miRNAs, which can silence other genes that may be responsible for the transcription of the target gene.

What evidence should we look for in order to confidently say that the entity is regulating transcription?

• Luciferase assay demonstrating that the entity binds the promoter of the gene being regulated?

• Promoter binding studies?

• DNA/Chromatin binding?

• qRT-PCR + TAS/database info re: experimental evidence of promoter binding?

Published guidelines on curation of transcription factors;

Previous discussion on using ChIP assay evidence;

Automated inference pipeline

DOS: Berkeley runs an inference pipeline that generates inferred annotations using logical inference from extended annotations. (This has sometimes been referred to as folding. It should not be confused with simple inference across part relations between major GO branches.) These inferences are not yet being consumed outside of AmiGO. I'd like to present a few examples of these inferences and investigate what the blockers are to disseminating them more widely.


  • On call:
    • EBI - Melanie, Paola, Penelope
    • FB - Giulia, Helen
    • MGI - Li
    • RGD - Shur-Jen
    • SGD - Edith, Stacia
    • TAIR - Tanya
    • UCL - Barbara, Rachael
    • WB - Kimberly
    • ZFIN - Sabrina

Annotation Consistency Exercise - Gene Expression vs Gene Transcription

  • We reviewed the three curation examples provided by UCL to discuss how GO curators annotate experiments where loss of, or treatment with, a particular gene product results in changes to mRNA transcript levels
  • There were three different gene products in question:
    • T-bet transcription factor
    • Angiotensin II ligand
    • CYR61 extracellular protein
  • One of the main issues is what evidence is sufficient for curators to choose a regulation of transcription term over the less granular regulation of gene expression term
    • Does term selection depend on the predicted or demonstrated function of the gene product?
      • It seems that many curators take this into consideration
      • Evidence for function may come from sequence similarity or from experiments in another publication
    • What types of experiments are reported for investigating promoter-binding activity and/or effects on transcript levels?
      • RT-PCR
      • Luciferase assays
      • Nuclear run-on
      • Northern blots
      • Physical interactions
        • EMSA
        • One-hybrid
        • ChIP

AI: UCL will draft a decision tree for annotating gene expression experiments. The proposal will be circulated for review, and we will discuss it on a future annotation call. The long-term plan is to add the decision tree to the curation documentation on the GO web site.

Inferred Annotations from Annotation Extensions

  • David OS presented a review of how new inferred annotations (folded annotations) can be generated from existing annotations with annotation extensions
  • This is the beginning of an ongoing discussion we will have about how these annotations are generated, how they fit into the ontology design patterns, and how they can be incorporated into GAFs
  • There is sometimes a fuzzy boundary between when new terms are created in the ontology and when annotation extensions are used
  • The inferred annotations are generated as part of the Jenkins pipeline and are available for groups to incorporate into their GAFs
    • The Jenkins job will add a new annotation to the GAF, using either an existing ontology term that exactly matches the extended annotation or, failing that, the closest existing parent term
      • e.g. cell migration, results in movement of (ventral cord motor neuron) -> motor neuron migration
    • Can there be partially folded annotations where some of the annotation extensions can fold into a new term, but others can't? Yes, it seems this can happen.
  • Regulates relations are now allowed in annotation extensions, so these may result in additional inferred annotations
  • What are the provenance and evidence codes for inferred annotations?
    • Provenance should remain the original group that made the annotation; evidence code should also be the same unless people feel that IEA better reflects how these annotations are generated
  • We have been trying to clean up annotation extensions to fix incorrect usage
    • Would it be better to wait until more of the inference pipeline is worked out or should we continue to clean them up?
    • It seems it is still worthwhile to continue cleaning up annotation extensions (AEs)
  • It would be really nice to have immediate feedback to curators in a curation tool about what the folded annotation would look like
    • This would help curators understand if they've used annotation extension relations correctly
  • David OS is working with Chris and Heiko to work out and document the details of this pipeline
  • We will try to sort out any bottlenecks for including these annotations in the GAFs
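
To make the folding step discussed above more concrete, here is a minimal Python sketch. It is not the Berkeley/Jenkins pipeline, only an illustration under stated assumptions: the lookup table of logical definitions, the toy is_a hierarchy, the `Annotation` fields, and the example gene product, evidence code, and group are all hypothetical, and term names are used in place of real ontology identifiers. The sketch tries an exact match for the extended annotation first, then walks up the parents of the extension filler to find the closest existing term, and keeps the evidence code and assigning group of the original annotation.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple

# Hypothetical toy data (names only, no real identifiers).
# Logical definitions: (annotated term, relation, filler) -> defined term.
LOGICAL_DEFS = {
    ("cell migration", "results in movement of", "motor neuron"): "motor neuron migration",
    ("cell migration", "results in movement of", "neuron"): "neuron migration",
}

# Hypothetical is_a parents for the entities used as extension fillers.
PARENTS = {
    "ventral cord motor neuron": "motor neuron",
    "motor neuron": "neuron",
    "neuron": "cell",
}


@dataclass(frozen=True)
class Annotation:
    gene_product: str
    term: str
    evidence: str
    assigned_by: str
    extension: Optional[Tuple[str, str]] = None  # (relation, filler)


def fold_term(term: str, relation: str, filler: str) -> Optional[str]:
    """Return the exact-match term, else the closest parent match, else None."""
    current: Optional[str] = filler
    while current is not None:
        folded = LOGICAL_DEFS.get((term, relation, current))
        if folded is not None:
            return folded
        current = PARENTS.get(current)  # step up to the filler's parent
    return None


def infer(annotation: Annotation) -> Optional[Annotation]:
    """Build the inferred (folded) annotation, or None if nothing folds."""
    if annotation.extension is None:
        return None
    relation, filler = annotation.extension
    folded = fold_term(annotation.term, relation, filler)
    if folded is None:
        return None
    # Evidence and assigning group are carried over from the original annotation.
    return replace(annotation, term=folded, extension=None)


# Hypothetical original annotation, mirroring the example in the minutes.
original = Annotation(
    gene_product="gene-x",
    term="cell migration",
    evidence="IMP",
    assigned_by="WB",
    extension=("results in movement of", "ventral cord motor neuron"),
)
inferred = infer(original)
```

In this toy setup `inferred.term` is "motor neuron migration", reached through the closest parent of the filler rather than an exact match. An extension whose filler has no matching definition anywhere up the hierarchy returns `None`, which is one simple way to picture the partially folded case raised on the call.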