Annotation Conf. Call 2016-04-26

Conference call Format

Standard Bluejeans link from meeting home page: https://bluejeans.com/993661940

Agenda

IDA or IEP for an isoform annotation

Recently Karen was annotating a paper where an isoform-agnostic antibody was used to detect the presence of a protein in a cell. Subsequent experiments showed that an RNA coding for a specific isoform had the same localization, supporting that the protein isoform was localized to that component. The question came up as to whether we should use IDA or IEP for evidence to support this. Since the experiment did not address the protein isoform itself, is it an IDA?

Note also that if we decide this is IEP, we will need to change the rule in the validation script that checks GAFs, since IEP is currently allowed only for Process annotations.
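
As an illustration of the rule in question, here is a minimal sketch of an "IEP only for Process" check. It is not the actual validation script: the function name `iep_aspect_violations` is hypothetical, and the only assumption about the data is the standard GAF 2.x tab-delimited layout (evidence code in column 7, aspect in column 9).

```python
# Sketch only: not the real GAF validation script.
# Column positions follow the GAF 2.x layout (1-based in the spec, 0-based here).
EVIDENCE_COL = 6  # column 7: evidence code
ASPECT_COL = 8    # column 9: aspect (P, F or C)


def iep_aspect_violations(gaf_lines):
    """Return (line_number, line) pairs where IEP is used outside Process."""
    violations = []
    for number, line in enumerate(gaf_lines, start=1):
        # Skip blank lines and "!" header/comment lines.
        if not line.strip() or line.startswith("!"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= ASPECT_COL:
            continue  # malformed row; a real checker would report this too
        if fields[EVIDENCE_COL] == "IEP" and fields[ASPECT_COL] != "P":
            violations.append((number, line.rstrip("\n")))
    return violations
```

Allowing IEP for Component annotations would amount to relaxing the final condition, for example by comparing the aspect against a set of permitted values instead of the single value "P".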

Switching over automated pipelines for high-throughput studies and bulk annotations

We will work with Marcos to get an ECO code to use for GO annotations that derive from high-throughput experiments. Once we have this, annotators should apply it moving forward, and also retroactively to existing HT annotations. We are hoping to design ways to help groups identify existing annotations from high-throughput studies.

How difficult would it be for submitters of GAF files to put “InterPro Consortium” in the “assigned by” field, for IEA annotations derived from that source? Likewise, for other IEA methods, we should try to identify an appropriate assigner, so we give them appropriate credit for the annotations.
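
To give a sense of the effort involved for submitters, here is a rough sketch of re-attributing InterPro-derived IEA rows in a GAF. It assumes the GAF 2.x layout (evidence code in column 7, With/From in column 8, Assigned_by in column 15) and, as a heuristic that each group would need to confirm for its own pipeline, that InterPro-derived rows carry an "InterPro:" identifier in With/From. The function name `credit_interpro` is hypothetical, and the default assigner string is simply the wording used above.

```python
import re

# Sketch only: column positions follow the GAF 2.x layout (0-based here).
EVIDENCE_COL = 6      # column 7: evidence code
WITH_COL = 7          # column 8: With/From
ASSIGNED_BY_COL = 14  # column 15: Assigned_by


def credit_interpro(line, assigner="InterPro Consortium"):
    """Return the GAF line with Assigned_by rewritten for InterPro-based IEAs.

    Assumption: a row counts as InterPro-derived when its evidence code is
    IEA and any With/From entry starts with "InterPro:".
    """
    text = line.rstrip("\n")
    fields = text.split("\t")
    # Leave headers, comments and short/malformed rows untouched.
    if text.startswith("!") or len(fields) <= ASSIGNED_BY_COL:
        return text
    with_entries = re.split(r"[|,]", fields[WITH_COL])
    from_interpro = any(e.strip().startswith("InterPro:") for e in with_entries)
    if fields[EVIDENCE_COL] == "IEA" and from_interpro:
        fields[ASSIGNED_BY_COL] = assigner
    return "\t".join(fields)
```

Other IEA methods could be handled the same way by mapping a With/From prefix or a reference to the appropriate assigner, once those assigners have been agreed.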

Annotation Extension Announcements

  • The regulates relation and its children are now allowed in annotation extensions
  • The 'dependent_on', 'requires_substance' and 'localization_dependent_on' relations are now obsolete. Any GAF that still uses them will fail the Jenkins check, so please make sure your group modifies or removes these extensions promptly.
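
Groups that want to find affected annotations before submitting could run something like the following sketch over their GAF. It is not the Jenkins check itself: the function names are hypothetical, and it assumes the usual annotation extension syntax in column 16, i.e. `relation(ID)` expressions joined by commas (AND) or pipes (OR).

```python
import re

# Relations announced as obsolete in annotation extensions.
OBSOLETE_RELATIONS = {
    "dependent_on",
    "requires_substance",
    "localization_dependent_on",
}
EXTENSION_COL = 15  # column 16 of a GAF 2.x row (0-based index)


def obsolete_relations_in(extension_field):
    """Return the obsolete relations used in one column-16 value."""
    used = set()
    # Split the field into individual relation(ID) expressions.
    for expression in re.split(r"[,|]", extension_field):
        relation, paren, _ = expression.strip().partition("(")
        if paren:  # only count well-formed "relation(...)" expressions
            used.add(relation)
    return used & OBSOLETE_RELATIONS


def rows_with_obsolete_relations(gaf_lines):
    """Yield (line_number, relations) for rows using an obsolete relation."""
    for number, line in enumerate(gaf_lines, start=1):
        if line.startswith("!"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) > EXTENSION_COL:
            found = obsolete_relations_in(fields[EXTENSION_COL])
            if found:
                yield number, found
```

Matching on the whole relation name, rather than searching for substrings, keeps an allowed relation from being flagged just because its name contains an obsolete one.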

Annotation Consistency Exercise

Perhaps the perfect time to remind people of this: https://github.com/geneontology/go-ontology/issues/11709

From PomBase

PMID: 11493649 Fission yeast mfr1 activates APC and coordinates meiotic nuclear division with sporulation.

Related Github tickets (These are quite long so I'll attempt to summarize):

https://github.com/geneontology/go-ontology/issues/11977 The question here was whether we need a term for the APC complexed with each of its various inhibitors (researchers usually refer to these complexes as APC-(inhibitor/activator superscript)). This gets quite complicated, because there are different activators and inhibitors for different substrates at different cell cycle transitions, and for the mitotic and meiotic cell cycles. I believe the consensus from the GO meeting is that we don't need these terms, and that we use "x complex binding" to represent these specific inhibitor/activator-complex interactions (the logic being that only the functional complex is represented in GO)?

The second issue was that we were following a precedent and requesting these specific "APC-activator activity" terms: https://github.com/geneontology/go-ontology/issues/11757 The solution here was simple. The terms will all be merged back into the "function" terms (i.e. ubiquitin ligase activity, ubiquitin ligase inhibitor activity, and ubiquitin ligase activator activity). So, although these terms still exist, they will be merged very soon.

Gene | Qualifier | Term | ID | Number of annotations | Evidence code | With | Extensions | Comments
MOLECULAR FUNCTION
mfr1 | - | protein complex binding | GO:0032403 | 1/4 | IPI | cut20 | has_input anaphase-promoting complex GO:0005680 | -
mfr1 | - | anaphase-promoting complex binding | GO:0010997 | 1/4 | IPI | cut20 | - | -
cdc2 | - | cyclin-dependent protein kinase activity | GO:0097472 | 1/4 | IDA | - | happens_during meiosis II cell cycle | -
BIOLOGICAL PROCESS
mfr1 | - | sporulation resulting in formation of a cellular spore | GO:0030435 | 1/4 | IMP | - | - | couldn’t decide between these two P terms (2nd requires background knowledge)
mfr1 | - | negative regulation of ascospore formation | GO:0075297 | 1/4 | IMP | - | - | couldn’t decide between these two P terms (2nd requires background knowledge)
mfr1 | - | positive regulation of ascospore formation | GO:0075296 | 3/4 | IMP | - | - | -
mfr1 | - | positive regulation of protein catabolic process | GO:1903364 | 1/4 | IMP | - | has_input cdc13, happens_during meiosis II cell cycle phase | -
mfr1 | - | positive regulation of cyclin catabolic process | GO:2000600 | 1/4 | IMP | - | has_input cdc13, happens_during GO:0007138 (meiotic anaphase II) \| causally_upstream_of GO:1904030 (negative regulation of cyclin-dependent protein kinase activity) | -
mfr1 | - | anaphase-promoting complex-dependent proteasomal ubiquitin-dependent protein catabolic process | - | 1/4 | IMP | - | has_input cdc13 | -
mfr1 | - | negative regulation of cyclin-dependent protein kinase activity | GO:1904030 | 1/4 | IMP | - | regulates_o_has_agent cdc2, regulates_o_happens_during meiosis anaphase II, happens_during meiosis II anaphase II | -
mfr1 | - | NTR positive regulation of meiotic exit | - | 1/4 | IMP | - | - | -
cdc13 | - | NTR negative regulation of meiotic exit | - | 1/4 | IMP | - | - | -
cdc13 | - | negative regulation of ascospore formation | GO:0075297 | 4/4 | IDA/IMP | - | happens_during meiosis II cell cycle | -
CELLULAR COMPONENT
cut20 | - | nucleus | GO:0005634 | 1/4 | IDA | - | - | even though they say this, I am not convinced. I would look for other evidence.
mfr1 | - | nucleus | GO:0005634 | 1/4 | IDA | - | - | even though they say this, I am not convinced. I would look for other evidence.
mfr1 | colocalizes_with | anaphase-promoting complex | - | 1/4 | - | - | exists_during GO:0007138 (meiotic anaphase II) | (I am being quite uncharitable here as they never show more than an IP and some IF)
mfr1 | - | APC-Fzr1/Mfr1 complex | GO:1990857 | 2/4 | IPI | cut20 | happens_during GO:0007135 (meiosis II) | -

Minutes

In attendance

  • Berkeley: Chris, Moni, Suzi
  • EBI: Aleks, Melanie, Paola
  • FlyBase: Helen
  • IntAct: Birgit
  • MGI: David H., Karen
  • PomBase: Antonia, Midori, Val
  • RGD: Stan
  • SGD: Edith
  • TAIR: Tanya
  • USC: Paul T.
  • WB: Kimberly
  • Zfin: Sabrina

IntAct: Birgit (4.10 pm BST)

IDA vs IEP

  • Paper looking at isoform-specific mRNA localized to a specific region of the cell
  • Annotation using IEP for the protein's CC annotation
  • IEP is usually only used for BP, so this annotation is getting flagged as an error
  • Can change EC to IDA to avoid flagging, but is this correct?
  • Should the identifier in Col. 17 be the protein isoform or an RNA isoform?
  • Is there another evidence code that could be used?
  • The spirit of the experiment is more in line with IDA
  • Will leave evidence code as IDA, no change needed to IEP evidence rules

High-throughput Experiments and Evidence Codes

  • Will work with Marcos to generate evidence codes for high-throughput experiments
  • Then users can distinguish between evidence from an HTP experiment and evidence generated in a hypothesis-driven manner
  • We will need to define what we consider high-throughput experiments
  • Will then retrofit existing annotations

IEA Pipelines

  • Currently, in the Assigned_by field of the GAF, IEAs are attributed to the group who ran the algorithm/program
  • Propose a change to more accurately attribute the annotations
  • For example, if MGI runs InterProScan, InterPro would get credit for the Assigned_by field, but MGI would still be the source of the annotation
  • For groups doing this, would it be possible to change the Assigned_by field to reflect, and give credit to, the group that actually makes the manual mappings?

Annotation Extensions

  • Regulates and child relations are now allowed in annotation extensions
  • Obsolete relations: dependent_on, localization_dependent_on, and requires_substance
  • Jenkins will flag annotations using these relations and the GAF will fail the QC check

Annotation Consistency Exercise

  • ACTION ITEM: Check meiosis annotations to be consistent with definition: https://github.com/geneontology/go-ontology/issues/11709
  • Val - background slides
    • Anaphase Promoting Complex (APC) regulates cell cycle transitions
    • The APC has a lot of variations, although there is a core complex of 12 subunits, conserved across species
    • For example, there are different activators (Fizzy) and inhibitors
    • Authors designate different APC complexes as APC-cdc20, for example
    • Should we be creating different protein complex terms for each of these variations on the complex?
    • There would be a lot of new complex terms, then, and where would you stop?
    • For annotation, we want to be representing the active form of the complex
    • Could instead just annotate different activators and inhibitors with 'APC complex binding'?
  • Antonia - paper
    • Fission yeast background
    • Mfr1 has sequence similarity to an APC activator
    • Mfr1 expressed specifically during meiosis
    • Mfr1 null mutation results in delayed sporulation, and cdc13 cyclin is stabilized
    • Mfr1 co-localizes with and interacts with APC during anaphase II
  • Annotations
    • Four different groups submitted annotations
      • Molecular Functions
        • Different annotations for APC complex binding (specific term vs AE), but they are semantically equivalent
      • Biological Process
        • Generally the same annotations, but curators chose terms with different granularity
        • Various flavors of ascospore formation, catabolic process, meiotic exit, regulation of kinase activity
        • Antonia and Val discussed whether it was more accurate to annotate to ascospore formation or meiotic exit, but regulation of ascospore formation seemed appropriate in this case.
      • Cellular Component
        • For Mfr1 - nucleus, APC complex, or individual APC complexes, e.g. APC-Fzr1/Mfr1 complex
  • Original question - do we need these specific complex terms in GO?
    • The activator is required for the complex to have its activity
    • Does the complex function differently when complexed with its different activators or inhibitors? If not, then perhaps the specific complex terms are not needed.
    • The complexes have the same activity, but different targets; does that make them different?
    • Yes, it probably does, because the substrate specificity allows the APC complexes to function as different machines
    • Can look at recent papers and consult with experts and follow up
    • Can we come up with a way to assess different inferences based upon the alternative representations?
    • Could load models into Protege, run OWL reasoner, do DL queries, and see if the outcomes make sense
    • ACTION ITEM: Work out SOP for testing different model representations