Annotation Call November 11, 2014
GAF checks on Jenkins Dashboard
1) For term pairs such as transport/transporter activity and transcription/transcription factor activity, do we have guidelines on whether column 16 data should be added to both aspects?
2) chromatin, new relationship 'RO_0002008: Coincident with' (no underscore)
PMID:22426534 uses ChIP to show that pombe Pfh1 localizes to regions within chromatin where the DNA sequence contains replication fork barriers (e.g. SO:0001914) and at some specific highly transcribed genes. I can easily annotate to nuclear chromatin in CC, but there's important specificity to put in extensions if possible.
Application of the new RO_0002008: Coincident with relationship:
Definition: "A relation that holds between two linear structures that are adjacent or overlapping and are approximately parallel to each other for their entire length."
Comment: "Example: if we define region of chromosome as any subdivision of a chromosome along its long axis, then we can define a region of chromosome that contains only gene x as 'chromosome region' that coincident_with some 'gene x', where the term gene X corresponds to a genomic sequence."
Resulting annotation: pombe Pfh1, GO:0000790 nuclear chromatin, IDA; Annotation extension: RO_0002008: Coincident with SO:0001914 replication fork barriers
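As a rough illustration only, the annotation above could be laid out as a single GAF 2.0 row with the extension in column 16. This is a minimal sketch, not the format the group agreed on: the helper names `format_extension` and `build_gaf_row` are hypothetical, the relation label spelling `coincident_with` is an assumption, the database object ID is a placeholder because the notes do not give it, and the date and assigned-by columns are left empty here even though a real file would fill them.

```python
# GAF 2.0 column names, in file order (column 16 holds annotation extensions).
GAF_COLUMNS = [
    "db", "db_object_id", "db_object_symbol", "qualifier", "go_id",
    "db_reference", "evidence_code", "with_from", "aspect",
    "db_object_name", "db_object_synonym", "db_object_type", "taxon",
    "date", "assigned_by", "annotation_extension", "gene_product_form_id",
]


def format_extension(relation, target_id):
    """Render one extension in the relation(target) form used in column 16."""
    return f"{relation}({target_id})"


def build_gaf_row(fields):
    """Join the given fields into one tab-separated row; missing columns stay empty."""
    unknown = set(fields) - set(GAF_COLUMNS)
    if unknown:
        raise KeyError(f"unknown GAF columns: {sorted(unknown)}")
    return "\t".join(fields.get(name, "") for name in GAF_COLUMNS)


row = build_gaf_row({
    "db": "PomBase",
    "db_object_id": "PFH1_ID_PLACEHOLDER",  # hypothetical: the notes give no ID
    "db_object_symbol": "pfh1",
    "go_id": "GO:0000790",                  # nuclear chromatin
    "db_reference": "PMID:22426534",
    "evidence_code": "IDA",
    "aspect": "C",                          # Cellular Component
    "db_object_type": "protein",            # assumption
    "taxon": "taxon:4896",                  # Schizosaccharomyces pombe
    # Relation label spelling is an assumption; the target is the SO term above.
    "annotation_extension": format_extension("coincident_with", "SO:0001914"),
})

# Show only the column 16 value.
print(row.split("\t")[15])
```

Keeping the extension formatting in its own helper means the same row builder would work whichever relation name is finally chosen.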
Annotation of ChIP data
Discussed [wiki page] and [source forge]. In summary: a ChIP experiment does not indicate whether an interaction is direct, since chromatin complexes are all cross-linked with DNA; it only shows colocalization.
General agreement that it is acceptable to annotate to the Cellular Component term 'chromatin' (or child terms) with SO and Gene IDs in the annotation extension field using the 'Coincident with' relationship (as above). Use SO terms to specify the DNA element (e.g. UAS, FRE, TCS).
Need clarification on annotation to a Molecular Function term: chromatin binding (or child terms)
Is it acceptable to annotate to chromatin binding (MF term, or child terms) based on ChIP data, with occurs_at SO and/or has_direct_input Gene IDs in the annotation extension field?
If SO and Gene IDs are included, need to consider whether the authors have used appropriate controls.
Is it acceptable to annotate to DNA binding (MF term, or child terms) based on ChIP data if the authors are confident that the protein is a DNA binding protein? (e.g. http://www.pombase.org/spombe/result/SPBC32H8.11)
Next Call (Preview)
small conjugating enzyme ontology development (DavidH and Val)
1) The ontology edits are complete for all of the ubiquitin and other small conjugating enzymes. There are three parents: activating enzyme activity (E1), conjugating enzyme activity (E2) and ligase activity (E3).
2) Now all of the annotations for these need to be checked. We need a strategy for this re-annotation.
Checkpoint terms guidelines
In attendance: Rama, Moni, Suzi, Edith, Kimberly, Midori, Donghui, Stacia, Pascale, Rachael, Prudence, Judy, Susan, Jane, Paola, Alex, David, Tanya, Li
- Jenkins dashboard
- several different files
- GAF validation report - tells which are the offending rows and why
- we want feedback if the reports are not informative enough
- will report both hard and soft checks (e.g., do not manually annotate)
- GAF summary - how many rows were flagged, what ontologies were used, etc.
- gene_association - inferences made using inter-ontology links
- Please look at the validation reports; if a report doesn’t make sense, give feedback on how to make it better.
- Blue ball - all good
- Yellow - a few offending rows
- Red - lots of offending rows
- Rachael - not all GOA files have links to error reports
- Rama - submitted a github ticket for this, but will need to follow up with Heiko
- Web site editing - use Drupal system
- Can get an account from Seth if needed
- Column 16
- transport, transporter activity
- transcription, transcription factor activity
- Are targets captured in both MF and BP?
- Groups do different things - some in both, others just in MF.
- Kimberly - add to function for enzyme/substrate
- need to check about transcription - may add to both
- Ruth - add to both
- David - may not be consistent with other groups
- will need to review annotations once ontology group works out what relations to use
- have put extensions in both MF and BP, but these may be chained (e.g., regulates relations)
- ontology group has been going through the annotation relations, some are being removed, others better defined
- ontology group will come up with guidelines
- Are there two separate issues: 1) whether to put relations on both function and process (can we use inheritance here?) and 2) what relations to use?
- Annotation extension relations will be included in the ontology file - this will allow correct folding/unfolding
- Action: ontology group will have sub-meetings and report back to annotators
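
One way a group might review how consistently column 16 targets are captured across MF and BP is to scan a GAF file and list, per gene product, the extension targets that appear on Function rows but not on Process rows. The sketch below assumes GAF 2.0 column positions (object ID in column 2, aspect in column 9, extension in column 16); the function names, gene IDs, and `EX:` target IDs are hypothetical, and it compares targets only, ignoring which relation was used.

```python
import re
from collections import defaultdict

# Matches one relation(target) unit inside a column 16 string.
EXT_PATTERN = re.compile(r"(\w+)\(([^()]+)\)")


def parse_extensions(column16):
    """Return (relation, target) pairs found in a column 16 string."""
    return EXT_PATTERN.findall(column16)


def targets_by_aspect(gaf_lines):
    """Map gene product ID -> aspect letter -> set of extension targets."""
    targets = defaultdict(lambda: defaultdict(set))
    for line in gaf_lines:
        if not line.strip() or line.startswith("!"):
            continue  # skip blank lines and GAF header/comment lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 16:
            continue  # no column 16 on this row
        gene, aspect, extension = cols[1], cols[8], cols[15]
        for _relation, target in parse_extensions(extension):
            targets[gene][aspect].add(target)
    return targets


def function_only_targets(gaf_lines):
    """Return targets captured on F rows but absent from P rows, per gene product."""
    result = {}
    for gene, per_aspect in targets_by_aspect(gaf_lines).items():
        missing = per_aspect.get("F", set()) - per_aspect.get("P", set())
        if missing:
            result[gene] = missing
    return result


def _row(gene, aspect, extension):
    """Build a mostly empty 17-column row for the example data (hypothetical)."""
    cols = [""] * 17
    cols[1], cols[8], cols[15] = gene, aspect, extension
    return "\t".join(cols)


sample = [
    "!gaf-version: 2.0",
    _row("GENE_A", "F", "has_direct_input(EX:1)"),
    _row("GENE_A", "P", "has_direct_input(EX:1)"),
    _row("GENE_B", "F", "has_direct_input(EX:2)|has_direct_input(EX:3)"),
]
print(function_only_targets(sample))
```

In the sample data, GENE_A carries the same target in both aspects and is not reported, while GENE_B has targets only on its Function row. Once the ontology group settles which relations to use, the comparison could be extended to check relations as well as targets.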
- Annotations to ChIP experiments
- Midori - want to annotate a gene product that localizes to chromatin based on ChIP experiments
- general consensus seemed to be that ChIP wasn't evidence for DNA or protein binding, but what about chromatin binding?
- If not MF annotation, then just a CC annotation to chromatin
- New twist - ChIP data showing proteins localizing to specific parts of chromatin based upon the DNA sequence folded into the chromatin
- This looked like a good use of a CC extension (nuclear chromatin) using a SO term and/or a gene ID
- What relation to use?
- David OS suggested coincident_with
- Is there a more intuitive name for this relation?
- Colocalizes_with? Is that an option?
- Colocalizes_with has been used as a qualifier on the whole annotation, can it also be used in annotation extensions?
- ChIP experiments seem to be annotated in different ways - using CC or MF
- Chromatin is a broad term - includes a lot of entities (DNA, protein)
- What does the community think about these experiments?
- What is the goal of ChIP experiments?
- Li - to determine binding, not necessarily localization
- Where would users think to go to get this information - MF or CC?
- Annotate to both or just one?
- Midori - users are looking for association with chromatin and with which parts of the chromatin, rather than whether it is really, truly binding
- Donghui - sounds like users mostly care about binding
- If we agree on using the function term, then we can use occurs_at relation
- Pascale - if we have information for both aspects, we should annotate to both.
- Ruth - seems odd to just say chromatin binding, but not also chromatin
- Definition of chromatin in GO: The ordered and organized complex of DNA, protein, and sometimes RNA, that forms the chromosome.
- Should we check with chromatin experts?
- Do we annotate histones the same way as we would annotate transcription factors, with respect to chromatin?
- Pascale - analogous situation with ribosome and ribosome binding?
- One proposal - annotate to both, but would need to help curators to implement this consistently
- Do not use ChIP data to annotate to a specific DNA binding term - ChIP data doesn't typically have that level of resolution
- Action - ontology and annotation groups will continue to discuss