LEGO August 15, 2016

Bluejeans URL - NOTE NEW MEETING TIME: 8:00am PST

https://bluejeans.com/969313231

Agenda

UK Training Session

Training Documentation

  • Some comments and edits made to the QuickStart Guide
  • In response to Melanie's comment, do we want to say anything about folding in this QuickStart Guide?
  • Added a section on making models public and annotation state, but need some more input on exact behavior for some states
  • Section on GAFs/GPADs will need updating as we make more progress on that front
  • Videos?

Software Updates

NEO Overview and GPI Files

  • Questions, issues still to be sorted out?
    • We have entries for:
      • Genes
      • Proteins
      • Transcripts
      • ncRNAs
      • Protein Complexes
    • Converging on specs for Noctua
      • MODs should include UniProtKB GCRP accession as the db_xref for their gene-level entries
      • Protein, transcript, ncRNAs can include UniProtKB isoform accessions, PRO accessions, ENSEMBL IDs, RNACentral IDs in db_xref field
      • Accessions and IDs in the db_xref field will be used for searching (in Noctua, and possibly also in AmiGO), but annotations in the models will be associated with the primary ID entered in each group's gpi file
      • Human gene products will use UniProtKB accessions as primary IDs
    • If groups don't have parent transcript or protein IDs, what ID should be used in Noctua and with what relation?
      • For example, if a curator needs to specify any mRNA transcript of a gene to add context to an MF annotation, should they use:
        • has_input(WB:WBGene00004804) OR has_input_some_product_of(WB:WBGene00004804) OR has_input_some_mRNA_transcript_of(WB:WBGene00004804)
    • How should protein complexes be represented?
  • Next steps - documentation of contents, communication of pipeline to other groups
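The proposed split between searchable db_xrefs and the primary ID used in models could be sketched roughly as follows. This is only an illustration: the record layout, the field names, and the `UniProtKB:EXAMPLE1` accession are made-up placeholders, not the actual gpi columns or a real mapping.

```python
from collections import defaultdict

# Hypothetical, simplified gpi-like records. Field names and the
# UniProtKB accession are placeholders for illustration only.
ENTRIES = [
    {
        "primary_id": "WB:WBGene00004804",
        "type": "gene",
        "db_xrefs": ["UniProtKB:EXAMPLE1"],
    },
]


def build_search_index(entries):
    """Map every searchable ID (primary ID or db_xref) to primary IDs."""
    index = defaultdict(set)
    for entry in entries:
        primary = entry["primary_id"]
        index[primary].add(primary)
        for xref in entry["db_xrefs"]:
            index[xref].add(primary)
    return index


def resolve(index, query):
    """Return the primary IDs an annotation would be attached to."""
    return sorted(index.get(query, set()))


index = build_search_index(ENTRIES)
print(resolve(index, "UniProtKB:EXAMPLE1"))
```

Searching by either the db_xref or the primary ID resolves to the same primary ID, which is the one an annotation would carry.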

MGI Meeting Follow Up

  • Review the list of software and annotation issues that were discussed at the MGI training session, June 15th-16th.
  • See the Google doc
  • Some specific follow-up:
    • GAF/GPAD output is probably highest priority
      • Remaining issues:
        • How to handle causal chains
        • Multiple evidence = multiple lines in the GAF
    • Using a limited set of relations in Noctua to make it easier for curators to find what they need (GitHub ticket 165)
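As a rough illustration of the "multiple evidence = multiple lines" point, the sketch below expands one annotation carrying several pieces of evidence into one flat row per piece of evidence. The record layout, the GO term, and the references are placeholders, and the rows contain only a few illustrative fields rather than the full set of GAF columns.

```python
# Hypothetical annotation record; the GO term and references are
# placeholders, not real curated data.
ANNOTATION = {
    "gene_product": "WB:WBGene00004804",
    "go_term": "GO:0000000",
    "evidence": [
        {"code": "IDA", "reference": "PMID:0000001"},
        {"code": "IMP", "reference": "PMID:0000002"},
    ],
}


def to_flat_rows(annotation):
    """Emit one row per piece of evidence attached to the annotation."""
    return [
        (
            annotation["gene_product"],
            annotation["go_term"],
            evidence["code"],
            evidence["reference"],
        )
        for evidence in annotation["evidence"]
    ]


for row in to_flat_rows(ANNOTATION):
    print("\t".join(row))
```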

LEGO Relations

  • Some specific issues have come up wrt LEGO relations that need clarification
  • The first concerns which relation to use between each of two activities, the acetylcholine transporter and the acetylcholinesterase, and the acetylcholine receptor in the Drosophila memory model
    • The transporter and acetylcholinesterase are proposed to regulate the receptor activity by removing acetylcholine from the synaptic cleft, so there is a clear hypothesis about their mechanism of action
    • Currently, we have 'directly inhibits' to express the relation, but curators don't feel this accurately reflects the biology as there isn't a direct physical interaction between the gene products
    • However, use of 'negatively regulates' doesn't quite say enough, as there is a proposed molecular mechanism for the regulation
    • The RO has the relation 'directly provides input for' but does it have one for 'removes input for' (or 'directly removes input for')? If so, would this be the correct relation to use?
  • The second issue concerns whether we should create additional relations under 'causally upstream of or within' and 'causally upstream of' to indicate directionality, i.e. positive or negative
    • If we are going to stipulate that some understanding of mechanism is required to select a 'regulates' relationship, then having the directionality in the less granular parent relations would allow curators to still capture the effect of a gene product's activity on a process, even if the mechanism is not yet known.
    • These types of relations will probably be used widely to describe gene product effects based on mutant phenotypes

Minutes

  • On call: Chris, Giulia, Jim, Kimberly, Midori, Pascale, Ruth, Sabrina, Seth, Stacia

Noctua-Minerva Mailing List

  • We realized that not everyone who regularly attends the call is on the noctua-minerva mailing list administered by LBL.
  • ACTION ITEM: Review members on the list against regular call attendees; add people not on the list.
  • ACTION ITEM: If you want to be added to the list, please email Seth.

UK Training Session

  • Reviewed meeting agenda
  • ACTION ITEM: Add a session at the beginning on the motivation behind LEGO. Kimberly - DONE
  • ACTION ITEM: Contact EBI to see if there is anything more they'd specifically like to discuss wrt database issues and pipelines at the meeting. Chris - DONE

Training Documentation

  • We reviewed the QuickStart Guide.
  • Clarified some issues about the tags on annotation state.
  • ACTION ITEM: Have an annotation and model download information page ready before the meeting.
  • Videos - relegated this to the important but not urgent list for UK training session since we will be doing a lot of hands-on training at that workshop.
    • Could get a site license for Camtasia, so we can generate videos readily.
    • Include both text and speaking for the videos since people may prefer one or the other, or both.
    • May not need brand new scripts; we can build on what was already done.

Software Updates

NEO Overview and GPI Files

  • Chris re-built NEO using MGI gpi files
  • Groups will be encouraged to provide a gpi file by the end of the year. The pipeline for building NEO will look for a group's gpi file first and, if none exists, use the entities annotated in that group's GAF.
  • Chris and Seth will look into updating NEO so that proteins can be added as individuals via their is_relation to chemical entities.
  • We discussed what db_xrefs should be included in the gpi files.
    • Minimally, the MODs should include the UniProtKB GCRP accession in their gene-level entries.
    • Other entries?
    • This could be a good project for the AGR to coordinate, since these types of mapping files are useful beyond just Noctua and GO and having a standard format would be good.
    • Will also work on setting up a more efficient pipeline for getting the gpi files, e.g. downloading directly from a MOD ftp site to the GPAD/GPI directory in GO's SVN.
  • Wrt the human gpi file, it should probably have both UniProtKB accessions and HGNC IDs so that proteins and genes can be referred to separately, when needed.
  • For generic transcript and protein IDs, there are two options:
    • Each MOD would include the UniProtKB parent accession (where it exists) or a PRO accession for the generic protein product (if it exists)
      • Issues: What about cases where a parent UniProtKB accession does not exist? Here we cannot tell whether the curator meant the parent entity or is using a GCRP to indicate a generic protein.
    • Another option is to use relations that allow curators to use a MOD gene ID, but the relation indicates what is meant by that, i.e. 'has_input_some_product_of' or 'has_input_some_mRNA_product_of'.
    • One option that will likely NOT be used is to have some downstream pipeline try to figure out the correct semantics.
  • Protein complexes in Noctua
    • It is possible to create a protein complex in Noctua, but how annotations will be propagated to each member for the GAF/GPAD outputs has not been worked out yet.
    • If any complexes exist in current gpi or GAFs, then those IDs (usually IntAct) should already be available for use in Noctua.
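The gpi-first, GAF-fallback rule for choosing NEO entities described above might look something like this minimal sketch. The function name, the in-memory inputs, and the `GroupX` group and example IDs are assumptions for illustration; the real pipeline reads files and is not shown in these notes.

```python
def entities_for_group(group, gpi_entities, gaf_entities):
    """Pick the entity source for one group.

    gpi_entities maps a group to the entity IDs in its gpi file, or to
    None when the group has not provided a gpi file. gaf_entities maps
    a group to the entity IDs annotated in its GAF.
    """
    from_gpi = gpi_entities.get(group)
    if from_gpi is not None:
        return "gpi", sorted(set(from_gpi))
    return "gaf", sorted(set(gaf_entities.get(group, [])))


# Placeholder data: IDs are made up, and "GroupX" is a hypothetical
# group that has no gpi file yet.
GPI = {"MGI": ["MGI:EXAMPLE1", "MGI:EXAMPLE2"], "GroupX": None}
GAF = {"MGI": ["MGI:EXAMPLE1"], "GroupX": ["GX:EXAMPLE9", "GX:EXAMPLE9"]}

for group in ("MGI", "GroupX"):
    print(group, entities_for_group(group, GPI, GAF))
```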

MGI Meeting Follow Up

  • Working on order of relations for the Noctua pop-up menu so that the GO relations will appear at the top of the list.
  • Still need to work on adding functionality for context-sensitive relations.
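One simple way to put a preferred set of relations at the top of a menu is a two-part sort key, sketched below. Which relations count as preferred is an assumption here; the set is chosen only for illustration from relation names mentioned in these notes, and this is not how the Noctua menu is actually implemented.

```python
def order_relations(relations, preferred):
    """Sort preferred relations first, then the rest, each alphabetically."""
    return sorted(relations, key=lambda name: (name not in preferred, name.lower()))


# Illustrative inputs only; the preferred set is hypothetical.
ALL_RELATIONS = [
    "causally upstream of",
    "negatively regulates",
    "directly provides input for",
    "directly inhibits",
]
PREFERRED = {"negatively regulates", "directly inhibits"}

for name in order_relations(ALL_RELATIONS, PREFERRED):
    print(name)
```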

LEGO Relations

  • We discussed the use of the 'directly inhibits' relation in the Drosophila memory model.
  • A good fit for the relation between each of the two activities (the acetylcholinesterase and the acetylcholine transporter) and the acetylcholine receptor appears to be 'removes input for'.
  • 'removes input for' is the converse relation of 'provides input for' and more explicitly describes the role of the two regulatory activities than just a 'negatively regulates' relation, but does not imply direct physical interaction as in 'directly inhibits'.
  • We briefly touched on adding directionality to the 'causally upstream of or within' and 'causally upstream of' relations. Will need to re-visit this on the next call.