GO-CAM Working Group Call 2018-08-28
From GO Wiki
- 1 Meeting URL
- 2 Agenda
- 2.1 Relations between MF and Input(s)
- 2.2 Modeling Transcription in GO-CAM
- 2.3 Direct vs Unknown Mechanism of Regulation
- 2.4 Relations between BP and input(s)
- 2.5 Relations between BP and MF of transcriptional target
- 2.6 Missing Property Chains
- 2.7 Root Node vs Existing Molecular Functions
- 3 Minutes
Relations between MF and Input(s)
- has_input vs has_direct_input
- Proposal: replace has_direct_input with has_input; obsolete has_direct_input
- Need to review has_input annotations and remove any extensions that are inconsistent with GO-CAM usage, i.e. extensions where the input acts indirectly or its proximity to the activity is unknown
- Seth retrieved, as of 2018-07-31, all MF annotations that use has_input in annotation extensions.
- Initial review:
- used to capture a regulatory effect, e.g. protein kinase activator activity, when it was not known whether the effect was direct or indirect (e.g. expression of protein or complex X increases the activity of Y)
- used to capture a regulatory subunit whose presence is necessary for the activity to occur (e.g. cyclin-dependent protein kinase)
- used to capture an enzymatic activity when it was not known if the effect on a substrate was direct or indirect (e.g. the effect was caspase-dependent, but it was not known whether the mutated caspase was the one acting on the substrate)
- used to capture an enzymatic substrate where there wasn't also a direct binding assay in the paper (e.g. testing possible chemical substrates for glucuronosyltransferase activity)
- used to capture metal ion-dependence of protein binding (e.g. Ca2+-dependent protein binding)
- used (correctly) to capture the physiologically relevant input in a binding reaction (i.e. a cross-species experiment where with/from captures the experimental binding partner and the AE captures the physiologically relevant binding partner)
- A Relations Ontology working group (broader than just GO) is also considering how to model participants in an MF and how to document has_input and its child relations
Modeling Transcription in GO-CAM
- Sabrina - PMID:28687631 'Clock1a affects mesoderm development and primitive hematopoiesis by regulating Nodal-Smad3 signaling in the zebrafish embryo.'
Relations between Transcription Factor MFs and Regulation of Transcription BPs
- Transcription factor activity is 'part_of' regulation of transcription
- This is consistent with the relations in the ontology and produces the correct annotations in the GPAD output file
- A consequence of this is that any regulation terms needed for annotation will have to be instantiated in the ontology
- This principle will be applied more broadly, i.e. if an entity plays a regulatory role in a process, its MF is 'part_of' some regulation of BP
Direct vs Unknown Mechanism of Regulation
Capturing Unknown Mechanism of Regulation
- If it is not known if the TF directly regulates the expression of a gene, then the input for the TF activity is left blank.
- In this case, however, it is okay to use evidence from another experiment that may have shown a different context (i.e. a different gene was regulated) as supporting evidence for the TF activity.
- The curator can model the unknown mechanism of regulation by stating that the TF activity is part_of a regulation of transcription that is causally_upstream_of_or_within the positive or negative regulation of transcription that ultimately controls the expression of the gene. The gene is then added as has_input to the most distal transcriptional regulatory process.
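The pattern above can be sketched as a small set of triples. This is only an illustration: the node names (tf_activity, upstream_regulation, distal_regulation, target_gene) and the helper inputs_of are hypothetical, and only the relation labels come from the minutes.

```python
# Hypothetical node names; relation labels follow the minutes.
model = [
    # The TF activity is part_of some regulation of transcription ...
    ("tf_activity", "part_of", "upstream_regulation"),
    # ... which is upstream of (or within) the regulation that controls the gene.
    ("upstream_regulation", "causally_upstream_of_or_within", "distal_regulation"),
    # The gene is the input of the most distal regulatory process only.
    ("distal_regulation", "has_input", "target_gene"),
]


def inputs_of(triples, node):
    """Return the has_input objects attached to `node`, in model order."""
    return [o for s, p, o in triples if s == node and p == "has_input"]


# The TF activity itself carries no input because the mechanism is unknown.
tf_inputs = inputs_of(model, "tf_activity")
distal_inputs = inputs_of(model, "distal_regulation")
```

Here tf_inputs is empty and distal_inputs holds the target gene, which mirrors leaving the MF input blank when direct regulation has not been shown.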
Relations between BP and input(s)
- Duplicating has_input for MF and BP results in multiple entries in the AE field of the BP annotation in the GPAD
Relations between BP and MF of transcriptional target
Missing Property Chains
- We still need relation chains that allow us to infer that a gene involved in process 1, where process 1 is upstream of process 2, acts upstream of process 2. For example:
- In model (activity-centric):
- part of o causally upstream of -> causally upstream of
- In annotation file (gene-centric):
- involved in o acts upstream of -> acts upstream of
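A minimal sketch of how such a chain could be applied to a set of triples. The function apply_chain and all node names are hypothetical, and the relation labels are written with underscores for convenience. This is a literal reading of the two chains listed above, not how any GO reasoner is actually implemented.

```python
from typing import Iterable, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)


def apply_chain(triples: Iterable[Triple], first: str, second: str, result: str) -> Set[Triple]:
    """Return the triples plus everything implied by `first o second -> result`."""
    inferred: Set[Triple] = set(triples)
    changed = True
    while changed:  # repeat until no new triple appears
        changed = False
        # Index the objects reachable through `second`, keyed by subject.
        via_second = {}
        for s, p, o in inferred:
            if p == second:
                via_second.setdefault(s, set()).add(o)
        # Join every `first` edge with the `second` edges leaving its object.
        for s, p, o in list(inferred):
            if p != first:
                continue
            for target in via_second.get(o, ()):
                new = (s, result, target)
                if new not in inferred:
                    inferred.add(new)
                    changed = True
    return inferred


# Activity-centric (model) example with hypothetical nodes.
model = [
    ("activity_A", "part_of", "process_1"),
    ("process_1", "causally_upstream_of", "process_2"),
]
# Gene-centric (annotation file) example with hypothetical nodes.
annotations = [
    ("gene_A", "involved_in", "process_1"),
    ("process_1", "acts_upstream_of", "process_2"),
]

model_closure = apply_chain(model, "part_of", "causally_upstream_of", "causally_upstream_of")
annotation_closure = apply_chain(annotations, "involved_in", "acts_upstream_of", "acts_upstream_of")
```

The loop runs to a fixed point because the result relation is the same as the second relation in both chains, so a newly inferred triple can feed a further inference.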
Root Node vs Existing Molecular Functions
- Curators should always try to construct models using the known MF of a gene product, even if that MF was not specifically demonstrated in the paper they are annotating.
- Associated evidence for that MF will always point back to the paper in which the MF was interrogated.
- Creating models in this way will allow us to build on existing knowledge to create the most comprehensive and up-to-date model for a given BP.
- Proposal: if a gene product has more than one MF, curators should use either: 1) experimental data that supports the selection of one function vs another, 2) the common parent of the two functions, or 3) the biological context of the annotated process to select the most appropriate function(s) for that gene product.
- Examples: beta-catenin and PDIA6
Minutes
- On call: Kimberly, Tanya, Chris G, Chris M, Dave F, Dmitry, Dustin, Edith, Giulia, Harold, Helen, Jim, Jennifer, Karen, Kevin M, Laurent-Philippe, Li, Liz, Marie-Claire, Nathan, Pascale, Petra, Rob, Sabrina, Seth, Shur-Jen, Stacia, Suzi L, Suzi A, Jae
has_input vs has_direct_input
- has_input has been used in MF annotation extensions in different ways, and we need to be consistent both in how this relation is used and in how it relates to its child relation has_direct_input