GO-CAM Working Group Call 2018-08-21
GO Annotation Meetings
- Tuesdays at 8am PDT
- Meeting schedule:
- 1st Tuesday: Alliance Biological Function
- 2nd Tuesday: GO Consortium
- 3rd Tuesday: Alliance Biological Function/GO-CAM Working Group
- 4th Tuesday: GO-CAM Working Group
- 5th Tuesday: ad hoc, as needed
- One Zoom URL for all - https://stanford.zoom.us/j/976175422
Modeling Transcription in GO-CAM
- Sabrina - PMID:28687631 'Clock1a affects mesoderm development and primitive hematopoiesis by regulating Nodal-Smad3 signaling in the zebrafish embryo.'
Relations between Transcription Factor MFs and Regulation of Transcription BPs
- Transcription factor activity is 'part_of' regulation of transcription
- This is consistent with the relations in the ontology and produces the correct annotations in the GPAD output file
- A consequence of this is that any regulation terms needed for annotation will have to be instantiated in the ontology
- This principle will be applied more broadly, i.e. if an entity plays a regulatory role in a process, its MF is 'part_of' some regulation of BP
Relations between MF and Input(s)
- has_input vs has_direct_input
- Is there a meaningful distinction between these two relations for MF? What are we really trying to capture with MF inputs?
- Proposal: review MF annotations using has_input
- Seth retrieved, as of 2018-07-31, all MF annotations that use has_input in annotation extensions.
- Initial review:
- used to capture a regulatory effect, e.g. protein kinase activator activity, when it was not known whether the effect was direct or indirect (e.g. expression of protein or complex X increases the activity of Y)
- used to capture a regulatory subunit whose presence is necessary for the activity to occur (e.g. cyclin-dependent protein kinase)
- used to capture an enzymatic activity when it was not known if the effect on a substrate was direct or indirect (e.g. caspase-dependent but not known if it was the caspase mutated)
- used to capture an enzymatic substrate where there wasn't also a direct binding assay in the paper (e.g. testing possible chemical substrates for glucuronosyltransferase activity)
- used to capture metal ion-dependence of protein binding (e.g. Ca2+-dependent protein binding)
- used (correctly) to capture the physiologically relevant input in a binding reaction (i.e. cross-species experiment where with/from captures experimental binding partner and AE the relevant binding partner)
- A Relations Ontology working group (broader than just GO) is also considering how to model participants in an MF and how to document has_input and its child relations
Direct vs Unknown Mechanism of Regulation
Capturing Unknown Mechanism of Regulation
- If it is not known if the TF directly regulates the expression of a gene, then the input for the TF activity is left blank.
- In this case, however, it is okay to use evidence from another experiment that might have shown a different context (i.e. a different gene was regulated) as supporting evidence for the TF activity.
- The curator can model the unknown mechanism of regulation by asserting that the TF activity is part_of a regulation of transcription that is causally_upstream_of_or_within the positive or negative regulation of transcription that ultimately controls the expression of the gene. The gene is then added as has_input to the most distal transcriptional regulatory process.
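The pattern above can be sketched as a small set of triples. This is only an illustration of the modeling convention, not output from any GO-CAM tool: the function name, the process labels, and the example identifiers ("TF_X activity", "gene_Y") are all hypothetical.

```python
# Minimal sketch of the "unknown mechanism" pattern as (subject, relation, object)
# triples. All names here are hypothetical placeholders, not real model identifiers.

def unknown_mechanism_model(tf_activity, target_gene, direction="positive"):
    """Build triples for a TF whose effect on target_gene may be indirect."""
    if direction not in ("positive", "negative"):
        raise ValueError("direction must be 'positive' or 'negative'")
    upstream = "regulation of transcription (TF process)"
    distal = f"{direction} regulation of transcription (distal process)"
    return [
        # The TF activity is part_of its own regulatory process ...
        (tf_activity, "part_of", upstream),
        # ... which is upstream of (or within) the process that controls the gene.
        (upstream, "causally_upstream_of_or_within", distal),
        # The gene is the input of the most distal process, not of the TF activity.
        (distal, "has_input", target_gene),
    ]

triples = unknown_mechanism_model("TF_X activity", "gene_Y", direction="negative")
for triple in triples:
    print(triple)
```

Note that the TF activity itself carries no has_input, matching the rule that the input is left blank when direct regulation has not been shown.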
Relations between BP and input(s)
- Duplicating has_input for MF and BP results in multiple entries in the AE field of the BP annotation in the GPAD
Relations between BP and MF of transcriptional target
Missing Property Chains
- We still need relation chains that allow us to capture that a gene involved in process 1, where process 1 is upstream of process 2, is itself upstream of process 2. For example:
- In model (activity-centric):
- part of o causally upstream of -> causally upstream of
- In annotation file (gene-centric):
- involved in o acts upstream of -> acts upstream of
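As a rough illustration of what these chains would infer, the sketch below applies the two chains listed above to a toy set of triples. It is a plain-Python stand-in for a reasoner, not how the GO tooling implements property chains; the relation spellings with underscores and the names activity_A, gene_X, process_1 and process_2 are assumptions made for the example.

```python
# Property chains as written in the minutes: (first, second) -> inferred relation.
PROPERTY_CHAINS = {
    ("part_of", "causally_upstream_of"): "causally_upstream_of",  # model, activity-centric
    ("involved_in", "acts_upstream_of"): "acts_upstream_of",      # annotation file, gene-centric
}

def apply_chains(triples):
    """Return the input triples plus all triples inferable via PROPERTY_CHAINS."""
    inferred = set(triples)
    changed = True
    while changed:  # repeat until no new triple is produced
        changed = False
        for s, r1, mid in list(inferred):
            for mid2, r2, o in list(inferred):
                if mid != mid2:
                    continue
                r3 = PROPERTY_CHAINS.get((r1, r2))
                if r3 is not None and (s, r3, o) not in inferred:
                    inferred.add((s, r3, o))
                    changed = True
    return inferred

# Hypothetical toy data: one activity-centric and one gene-centric statement pair.
asserted = {
    ("activity_A", "part_of", "process_1"),
    ("process_1", "causally_upstream_of", "process_2"),
    ("gene_X", "involved_in", "process_1"),
    ("process_1", "acts_upstream_of", "process_2"),
}
result = apply_chains(asserted)
new_triples = result - asserted
print(sorted(new_triples))
```

With the toy data, the only new statements are that activity_A is causally_upstream_of process_2 and that gene_X acts_upstream_of process_2.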
Root Node vs Existing Molecular Functions
- Curators should always try to construct models using the known MF of a gene product, even if that MF was not specifically demonstrated in the paper they are annotating.
- Associated evidence for that MF will always point back to the paper in which the MF was interrogated.
- Creating models in this way will allow us to build on existing knowledge to create the most comprehensive and up-to-date model for a given BP.
- Proposal: if a gene product has more than one MF, curators should use either: 1) experimental data that supports the selection of one function vs another, 2) the common parent of the two functions, or 3) the biological context of the annotated process to select the most appropriate function(s) for that gene product.
- Examples: beta-catenin and PDIA6
- On call: Kimberly, Tanya, Chris G, Chris M, Dave F, Dmitry, Dustin, Edith, Giulia, Harold, Helen, Jim, Jennifer, Karen, Kevin M, Laurent-Philippe, Li, Liz, Marie-Claire, Nathan, Pascale, Petra, Rob, Sabrina, Seth, Shur-Jen, Stacia, Suzi L, Suzi A, Jae
has_input vs has_direct_input
- has_input has been used in MF annotation extensions in different ways, and we need to be consistent both in how this relation is used and in how it relates to its child relation has_direct_input