LEGO May 9, 2016

From GO Wiki
Revision as of 12:34, 9 May 2016 by Vanaukenk (talk | contribs)

Bluejeans

https://bluejeans.com/969313231

Agenda

Software Updates

Evidence Cloning

  1. DavidH has been successfully using the evidence cloning feature.
  2. One nice enhancement to the evidence cloning feature would be to list a brief bibliography in the table next to the PMID, as it is unlikely that curators will be able to remember, just from the PMID, what each paper is. Can we use PubMed or Europe PMC web services for something like that? (Kimberly)
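As a rough illustration of the web-service idea in item 2, the sketch below builds a Europe PMC REST search URL for a PMID and formats a short citation from one result record. The response field names (`authorString`, `pubYear`, `title`) and the `resultList`/`result` nesting are assumptions about the Europe PMC JSON format and should be checked against the service documentation; the function names are hypothetical and not part of Noctua.

```python
import json
import urllib.parse
import urllib.request

EUROPE_PMC_SEARCH = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"


def build_query_url(pmid):
    """Build a Europe PMC search URL for a single PubMed ID."""
    params = {
        "query": f"ext_id:{pmid} src:med",
        "format": "json",
        "resultType": "lite",
    }
    return EUROPE_PMC_SEARCH + "?" + urllib.parse.urlencode(params)


def brief_citation(record):
    """Format one result record as 'First author [et al.] (Year) Title'."""
    # Field names below are assumed, not confirmed against the live service.
    authors = record.get("authorString", "").rstrip(".")
    first_author = authors.split(",")[0].strip() if authors else "Unknown"
    suffix = " et al." if "," in authors else ""
    year = record.get("pubYear", "n.d.")
    title = record.get("title", "").rstrip(".")
    return f"{first_author}{suffix} ({year}) {title}"


def fetch_brief_citation(pmid):
    """Fetch the record for a PMID and format it (requires network access)."""
    with urllib.request.urlopen(build_query_url(pmid), timeout=10) as response:
        payload = json.load(response)
    results = payload.get("resultList", {}).get("result", [])
    return brief_citation(results[0]) if results else None
```

Keeping the formatting step separate from the network call means the table display could be tested with stored records, and the lookup could be swapped for PubMed or for existing AmiGO code without touching the formatter.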

Form Interface for Curation

  • Revisit this? How high a priority is this for wider adoption of the tool?

GAF/GPAD

The Jenkins job is running again.

Still needed before we can import (into MGI):

Essential

  1. Stable purl from which to retrieve files. We will use GAF for now, but we have a ticket in to be able to use GPAD. (GH Noctua issue #200 - https://github.com/geneontology/noctua/issues/200)
  2. Folded annotations for individuals annotated by the regulates relation. (GH Noctua issue #189 - https://github.com/geneontology/noctua/issues/189)
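To make the "folded annotation" requirement in item 2 concrete, here is a minimal sketch of a post-processing step, assuming a simplified annotation record (a dict with hypothetical keys `gene`, `go_id`, `extensions`) and a lookup table from a process term to its "regulation of" term. It does not reflect how Minerva or the Jenkins job actually compute folded inferences, and the GO IDs in the table are given for illustration and should be verified against the ontology.

```python
import re

# Assumed lookup: process term -> corresponding "regulation of" term.
REGULATION_TERMS = {
    # Wnt signaling pathway -> regulation of Wnt signaling pathway
    "GO:0016055": "GO:0030111",
}

# Matches a single extension such as "regulates(GO:0016055)".
EXTENSION_PATTERN = re.compile(r"^(\w+)\((GO:\d{7})\)$")


def fold_regulates(annotation, regulation_terms=REGULATION_TERMS):
    """Return extra annotations implied by regulates(...) extensions.

    GAF-style extensions separate alternatives with '|' and joint
    conditions with ','; every regulates(...) entry is considered here.
    """
    folded = []
    for group in annotation.get("extensions", "").split("|"):
        for part in group.split(","):
            match = EXTENSION_PATTERN.match(part.strip())
            if not match:
                continue
            relation, target = match.groups()
            if relation == "regulates" and target in regulation_terms:
                # Copy the record, point it at the regulation term, and
                # drop the extensions, which the new term now expresses.
                folded.append(
                    {**annotation, "go_id": regulation_terms[target], "extensions": ""}
                )
    return folded
```

A table lookup is used only to keep the example self-contained; in practice the regulation term would presumably come from reasoning over the ontology rather than a hand-maintained mapping.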

Desirable

  1. Attribution to individual groups. For example 'MGI-Noctua'. (GH Noctua issue #84 - https://github.com/geneontology/noctua/issues/84)
  2. An automated QC check on annotations made in Noctua. For example, tagging annotations that don't have evidence. (https://github.com/geneontology/noctua/issues/255)
  3. Automated update of models to github so that GAFs/GPADs are generated on a regular basis.
  4. Attribution of curator comments and evidence sentences to the appropriate reference. (https://github.com/geneontology/noctua/issues/280)
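The automated QC check in item 2 could look something like the following sketch, which scans GAF 2.x lines and reports rows with an empty reference (column 6) or evidence code (column 7). It assumes the check runs on the exported GAF rather than on the Noctua model itself, which is only one possible design; the function name is hypothetical.

```python
def find_missing_evidence(gaf_lines):
    """Yield (line_number, reason) for annotation rows lacking evidence.

    Assumes tab-delimited GAF 2.x rows: column 6 is DB:Reference and
    column 7 is the evidence code (indices 5 and 6 when zero-based).
    """
    for line_number, line in enumerate(gaf_lines, start=1):
        # Skip blank lines and '!' header/comment lines.
        if not line.strip() or line.startswith("!"):
            continue
        columns = line.rstrip("\n").split("\t")
        if len(columns) < 15:
            yield line_number, "too few columns"
            continue
        if not columns[5].strip():
            yield line_number, "missing reference"
        if not columns[6].strip():
            yield line_number, "missing evidence code"
```

Because the function accepts any iterable of lines, it can be run directly over an open file handle, and the reported line numbers can be used to tag or filter rows before import.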

Determining the Extent of Upstream/Downstream to Capture in GAF/GPAD

  • Continuing discussion from last week with respect to capturing annotations downstream to reflect knowledge of the biological system.
  • For an illustration, see: http://noctua.berkeleybop.org/editor/graph/gomodel:5716c41300000082
    • Fatty acylation of Wnt is required for its secretion
    • Currently, worm, fly, fish, mouse, and human acyltransferases of the porcupine family are annotated to Wnt protein secretion or some variant (direct or regulation) of Wnt signaling pathway. This reflects the biological context of the activity of these acyltransferases.
    • Additionally, Wntless protein family members (transporters, transporter chaperones?) that bind Wnt and shuttle it through the secretory pathway are annotated to a number of different Wnt-related BPs:
      • positive regulation of Wnt signaling pathway
      • positive regulation of Wnt protein secretion
      • regulation of Wnt signaling pathway by Wnt protein secretion
    • The current GAF/GPAD output only includes, as an annotation extension, the immediate downstream process, which in the illustrative model is 'Golgi to plasma membrane transport'.
  • Since the evidence that a protein has an activity, e.g. lipid transferase, may be different from the evidence that it affects downstream signaling and development pathways, it is necessary to create edges between the upstream activity and the downstream process in order to retrieve the desired annotations in the GAF.
  • This raises the issue of the qualifier proposal again, though. Do we want to move forward with using the expanded list of qualifiers in the GAF?
  • If yes, what are the options?
    • Only add new BP qualifiers, e.g. causally upstream of, in LEGO-generated GAF/GPAD?
    • Add a default qualifier to all BP annotations which could then be updated as curators review annotations and create new LEGO models?
    • Using existing evidence codes and ontology, try to come up with the best approximation for BP qualifiers for as many annotations as possible and then revise as needed? For example, IDA BP annotations to metabolic process terms may be good candidates for part of qualifiers, while IMP annotations to development or behavioral terms could get a less granular qualifier, the equivalent of affects.
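The third option could be prototyped along the lines below. This is only a sketch of the heuristic as described in the example above: the qualifier strings `part_of` and `affects`, the branch labels, and the function name are placeholders, since no qualifier vocabulary has been agreed on, and annotations matching neither rule are returned as `None` to be left for curator review.

```python
def approximate_bp_qualifier(evidence_code, term_branches):
    """Guess a BP qualifier from the evidence code and ontology branches.

    evidence_code: GO evidence code of the annotation, e.g. "IDA".
    term_branches: set of labels of high-level ancestors of the annotated
                   term (assumed to be computed elsewhere from the ontology).
    """
    # Direct assay evidence for a metabolic process suggests the gene
    # product acts as part of the process.
    if evidence_code == "IDA" and "metabolic process" in term_branches:
        return "part_of"
    # Mutant phenotype evidence for development or behavior only supports
    # a less granular relation.
    if evidence_code == "IMP" and term_branches & {"developmental process", "behavior"}:
        return "affects"
    # No rule applies: leave for manual review.
    return None
```

For instance, `approximate_bp_qualifier("IMP", {"developmental process"})` returns `"affects"` under these placeholder rules, while an IMP annotation to a metabolic process term returns `None`.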

Documentation

  • DavidH and Kimberly are using this week as a documentation week. By the end of the week, we will have a training manual for Noctua.
  • Stacia will work at SGD to create videos.

Models Discussion

cdc2 - Continuing Discussion from 2016-04-25


Minutes

  • On call: Chris, Dan Keith, David OS, David H, Giulia, Helen, Kimberly, Seth, Stacia, Suzi

Software Updates

Evidence Cloning

  • Works well for cloning and also enables cut and paste
  • Feature request - listing additional bibliographic information for papers listed as evidence
    • Could possibly re-use code currently on AmiGO
    • AI: Kimberly will open a github ticket for this feature.

Form Interface

  • +1 from SGD curators
  • High priority, but not highest at this point.
  • AI: Make a github ticket for this feature.

Evidence Model

  • Will be testing this before Heiko leaves.
  • Publications will now be individuals, so will be able to add comments (evidence sentences, for example) to specific references.

GAF/GPAD

  • Essential
    • Need stable URLs for retrieving files
    • Need folded annotations in GAF/GPAD in order to get, for example, a separate annotation to a regulation term when there is a direct regulates relation between an activity and a process
      • These inferences should be happening based on the new patterns from David OS??
      • But currently, the annotation appears as an AE
      • What about an immediate post-processing step that would generate these annotations as part of the Jenkins job since doing this as part of the Minerva pipeline could be computationally intensive?
      • Heiko currently generates the folded inferences, but these are not being generated for LEGO models
  • Desirable
    • Attribution to specific groups, GO_Noctua -> GO_Noctua_MGI
    • Automated error checking
    • Automated update of models to github
      • Should happen every 24 hours
    • Associating evidence with specific references
      • How best to model this?
        • Sentences could be their own individuals
        • Sentences would be part_of a publication
        • Sentences would then support an assertion
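One way to picture the sentence-as-individual proposal is the small data-structure sketch below. The class and field names are hypothetical and simply mirror the three bullets (a sentence is its own individual, is part_of a publication, and supports an assertion); assertions are represented as plain strings for brevity, and none of this reflects the actual Noctua evidence model.

```python
from dataclasses import dataclass, field


@dataclass
class Publication:
    """A reference modeled as an individual, e.g. identified by a PMID."""
    reference: str


@dataclass
class EvidenceSentence:
    """A sentence that is part_of a publication and supports assertions."""
    text: str
    part_of: Publication
    supports: list = field(default_factory=list)


def sentences_supporting(sentences, assertion):
    """Return the sentences that support a given assertion."""
    return [s for s in sentences if assertion in s.supports]


def sentences_from(sentences, publication):
    """Return the sentences that are part_of a given publication."""
    return [s for s in sentences if s.part_of == publication]
```

With this shape, a comment or evidence sentence is always attributable to exactly one reference, and the same sentence can support more than one assertion.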

Capturing Extent of Causally Upstream Of in a GAF/GPAD

  • Example using Wnt pathway
  • Wnts need to be lipid modified to be secreted
  • Wnt acyltransferases are all currently annotated to some flavor of Wnt signaling or Wnt protein secretion
  • To capture this in a LEGO model we need to make a direct connection between the acyltransferase and the Wnt signaling pathway
  • Distinguish between a logically redundant assertion with specific evidence vs a stronger assertion with specific evidence
    • The mom-1 relation to Wnt signaling is currently logically redundant but with specific evidence
  • Also look at Wntless and its annotations, currently in GAFs as some form of regulation of Wnt secretion or Wnt protein signaling
  • Implications for current ontology development
    • If you characterize signaling by ligand type, do you run into problems?
  • Current GAF generation only generates AE to immediate downstream step.
  • Additional AEs to downstream steps can only come from direct links between an activity and a process.
  • This leads to the question of how to generate these annotations in the GAF and the issue of expanding the qualifier column to make the association between the activity and the process clearer.
  • Contrast the translation upstream vs the acylation upstream - the former is not known to be Wnt-specific, but the latter is Wnt-specific
  • Proposal is to add these qualifiers to the GAF so we can get the annotations curators would normally make in the GAF

Documentation Week

  • David H and Kimberly will be working on LEGO documentation and a training manual this week
  • Stacia and Kevin at SGD will be helping to create Noctua videos