Annotation Conf. Call, June 24, 2014
EBI: Prudence, Becky, Susan, Aleks, Fredico, Paola, Rachael, David OS
Berkeley: Chris, Suzi, Moni
Pombe: Midori, Val
MGI: Li, Judy, David
SGD: Rama, Edith, Diane
Relationships that we are encouraging curators not to use in annotation extensions
During the publication of the annotation extension paper, a few points were raised about some relations that are not suitable for use in annotation extensions. We have subsequently removed these (see below) from the set of relations available in Protein2GO and visible in the visualization graph of relations. We have also made most of the chained relations invisible, e.g. regulates_o__results_in_morphogenesis_of, as these are not well defined and their usage is not clear to most curators. Note: with the exception of has_indirect_input, these have not been made obsolete, just invisible to users of Protein2GO; existing annotations using these relations are therefore not incorrect. We continue to revise the list of visible relations, so if one that you want to use is missing, let Rachael know.
1. not_dependent_on has been renamed to independent_of: annotations should be updated accordingly. Usage: not_dependent_on: SGD (10)
2. requires_regulator, requires_direct_regulator, requires_regulation_by, requires_localization, requires_target_sequence_feature: at the Bar Harbor GOC meeting it was decided not to use these relations because they capture the reverse annotation. We should only capture the annotation for the regulator, not for the entity being regulated. Is it agreed that we can obsolete these?
Usage: requires_direct_regulator: SGD (2), requires_localization: UniProt (5), requires_regulation_by: UniProt (1) SGD (1), requires_regulator: UniProt (4) SGD (3), requires_target_sequence_feature: (0)
3. has_participant: this has never been acceptable for use in annotation extensions, although some people have used it. The fix for these annotations depends on the evidence: it could be has_input or has_direct_input, which are children of has_participant, and it may be that we can't capture some of these.
4. has_indirect_input: the annotations can be better captured using has_input or has_regulation_target.
Usage: has_indirect_input: UniProt (17) SGD (6)
5. not_during, not_exists_during, not_happens_during, not_occurs_at, not_occurs_in: these haven't been used very much and it's not clear whether they are useful. We can probably capture these in a better way, so we have made them invisible to curators until we have a better solution.
Usage: not_occurs_in: UniProt (4), not_occurs_at: (0), not_during: UniProt (9), not_exists_during: UniProt (1), not_happens_during: SGD (1)
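As a rough illustration of how the list above could be applied to existing annotations, here is a minimal Python sketch. It assumes extensions are written as in GAF column 16, i.e. relation(ID) items joined by commas within a group and by pipes between groups. The function names and the example identifiers are hypothetical, and has_participant is included in the discouraged set only because item 3 says it is not acceptable.

```python
import re

# Rename stated in item 1 of the list above.
RENAMED = {"not_dependent_on": "independent_of"}

# Relations that items 2-5 above ask curators not to use.
DISCOURAGED = {
    "requires_regulator", "requires_direct_regulator", "requires_regulation_by",
    "requires_localization", "requires_target_sequence_feature",
    "has_participant", "has_indirect_input",
    "not_during", "not_exists_during", "not_happens_during",
    "not_occurs_at", "not_occurs_in",
}

# One extension item, e.g. has_input(UniProtKB:P12345).
ITEM_RE = re.compile(r"^([A-Za-z0-9_]+)\((.+)\)$")


def parse_extension(column):
    """Split 'rel(ID),rel(ID)|rel(ID)' into groups of (relation, target) pairs."""
    groups = []
    if not column.strip():
        return groups
    for group in column.split("|"):
        pairs = []
        for item in group.split(","):
            match = ITEM_RE.match(item.strip())
            if match is None:
                raise ValueError(f"cannot parse extension item: {item!r}")
            pairs.append((match.group(1), match.group(2)))
        groups.append(pairs)
    return groups


def rename_relations(column):
    """Return the extension string with renamed relations substituted."""
    return "|".join(
        ",".join(f"{RENAMED.get(rel, rel)}({target})" for rel, target in pairs)
        for pairs in parse_extension(column)
    )


def discouraged_in(column):
    """Return the sorted discouraged relations used in an extension string."""
    found = {
        rel
        for pairs in parse_extension(column)
        for rel, _ in pairs
        if rel in DISCOURAGED
    }
    return sorted(found)
```

Only the rename is rewritten automatically here; the other relations are just reported, since the fix for them depends on the evidence and needs a curator's judgement.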
Summary of proposed action items from annotation extension relation meeting
1. EDITORS: Change children of ‘response to x’ from is_a to part_of, throughout GO.
2. DAVID OS. Create new more specific has_input relations for: has_substrate, has_transport_target (transports), has_binding_target (binds).
3. Change the definition of ‘has_input’ to allow for its use with ‘cellular response’ terms? Currently it says ‘bound, transported, modified, consumed or destroyed’….
4. VAL AND EDITORS/DAVID HILL: look at the transcription terms. Val would like a term ‘DNA binding involved in negative regulation of transcription from RNA pol II promoter’, etc. This term would be is_a DNA binding.
5. DAVID OS: Write out the transcription factor suggestions in OWL, to check they make sense.
6. RACHAEL: Better define the rules for OCCURS_AT and OCCURS_IN.
OCCURS_IN: All the parts of the process are contained within (CL, UBERON, GO-CC).
OCCURS_AT: Adjacent to or in the vicinity of (SO or GO-CC).
7. RACHAEL: Edit the annotation extension file to make rule that has_output can (for the moment) only be used for GO-BP annotations. Enforce this rule in Protein2GO, and add to rule file for curators not using Protein2GO.
8. PASCALE AND VAL: Look to see if we can add a restriction that has_output can be used with ‘x production’ terms in the primary annotation only, for now. We can broaden/change if necessary. E.g. cytokine production + cell adhesion molecule production.
9. RACHAEL: Add a restriction that you can’t use a part_of relation between a GO process and a ‘cell cycle phase ; GO:0022403’ term in C16. This requires a ‘happens_during’ relationship.
10. RACHAEL/RUTH: Alter local_range for HAPPENS_DURING and EXISTS_DURING to remove GO-MF information. Update wiki with this information.
11. DAVID OS: Remove ‘during’ from relationships, because it can't be properly defined. Its children 'exists_during' and 'happens_during' will remain.
12. VAL AND RACHAEL: Look at exists_during relation uses to see if they make sense.
13. RUTH: In the part_of example on the wiki page, make it clear in a footnote that Wnt-activated R activity is already part_of Wnt signaling pathway. Here we’re making a more specific statement that the Wnt-activated R activity is part_of a CANONICAL Wnt signaling pathway.
14. DAVID OS: make a new relation: ‘adjacent_to’ to describe extracellular regions that are next to a cell.
15. CHRIS/TONY: Think more about how the on-the-fly human-readable display of folded annotations would work in practice; Chris to collaborate with Tony to see if it can be implemented in Protein2GO.
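The restrictions proposed in items 6 and 7 could be expressed as a small rule table. The Python sketch below is one possible shape under stated assumptions, not the Protein2GO implementation: the aspect codes P, F and C are the GAF aspect values, the function name is hypothetical, and the range check only looks at the ID prefix, so it cannot tell GO-CC from other GO aspects without an ontology lookup.

```python
# Aspects (GAF column 9: P, F or C) in which a relation may be used.
# Item 7: has_output is, for the moment, limited to GO-BP annotations.
ALLOWED_ASPECTS = {
    "has_output": {"P"},
}

# Target ID prefixes allowed per relation, from the item 6 draft rules.
# Assumption: "GO" stands in for GO-CC here; confirming that a GO ID is a
# cellular component term would need an ontology lookup, which is not done.
ALLOWED_TARGET_PREFIXES = {
    "occurs_in": {"CL", "UBERON", "GO"},
    "occurs_at": {"SO", "GO"},
}


def check_extension(aspect, relation, target_id):
    """Return a list of rule violations for one annotation extension item."""
    problems = []

    allowed_aspects = ALLOWED_ASPECTS.get(relation)
    if allowed_aspects is not None and aspect not in allowed_aspects:
        problems.append(f"{relation} is not allowed with aspect {aspect}")

    allowed_prefixes = ALLOWED_TARGET_PREFIXES.get(relation)
    prefix = target_id.split(":", 1)[0]
    if allowed_prefixes is not None and prefix not in allowed_prefixes:
        problems.append(f"{relation} cannot point to a {prefix} identifier")

    return problems
```

Keeping the rules in plain dictionaries means a restriction can be broadened later, as item 8 anticipates, by editing data rather than logic.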
Curation consistency exercise
Enzyme Regulator activity vs positive regulation of X terms
In this paper authors show (text around Fig 3) that yeast POL30 (PCNA) stimulates the activity of yeast Apn2 (exonuclease activity).
In this paper the authors show that Aha1 stimulates the ATPase activity of Hsp90 (text around Figure 7). The name description of Aha1 is ‘Activator of Hsp90 ATPase’.
Both papers show similar data for enzyme activation. In the first case the authors don't call PCNA an enzyme activator, while Aha1 is named as an activator of Hsp90. How would you choose between the MF enzyme regulator term and an equivalent BP regulation term?
Expression paper (Judy/David)
- not_dependent_on has been renamed to independent_of
- How are these stored, blacklisted? These terms should be obsoleted formally, otherwise it will lead to inconsistencies and we won't be able to QC for them. Can we implement a QC check in Jenkins for these obsoleted relationships? The main relationship file is in OWL format, so it should be easy to implement this in Jenkins.
- There is no bulk way to update these relationships in Protein2GO. Have to do them one at a time!
- #7, #1, #2, #8 have to be vetted more
- not_happens_during- there is one SGD annotation with this relationship. Rama will check into how this can be reannotated.
- Diane- why not annotate to both MF and BP terms?
- Midori- If you annotate to MF, BP term is inferred automatically.
- Midori- How far do we know the mechanism of how the binding affects regulation? We need to know the answer to this question to be able to confidently annotate to the function term.
- Pascale- Regulators are well conserved evolutionarily, so be conservative in annotating to regulator terms.
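The QC check raised in the notes above might look something like the following Python sketch, which a Jenkins job could run over an annotation file. It assumes a tab-separated GAF file with the annotation extension in column 16 and comment lines starting with "!". The blacklist is passed in as a plain set, because reading it from the OWL relationship file is not shown here, and both function names are hypothetical.

```python
import re

# Matches the relation name in front of each "(" in an extension string.
RELATION_RE = re.compile(r"([A-Za-z0-9_]+)\(")

# Zero-based index of GAF column 16 (annotation extension).
EXTENSION_COLUMN = 15


def find_blacklisted(gaf_lines, blacklist):
    """Return (line_number, relation) for every use of a blacklisted relation."""
    hits = []
    for line_no, line in enumerate(gaf_lines, start=1):
        # Skip GAF header/comment lines and blank lines.
        if line.startswith("!") or not line.strip():
            continue
        columns = line.rstrip("\n").split("\t")
        if len(columns) <= EXTENSION_COLUMN:
            continue
        for relation in RELATION_RE.findall(columns[EXTENSION_COLUMN]):
            if relation in blacklist:
                hits.append((line_no, relation))
    return hits


def qc_exit_code(gaf_lines, blacklist):
    """Print each hit and return 1 if any were found, else 0."""
    hits = find_blacklisted(gaf_lines, blacklist)
    for line_no, relation in hits:
        print(f"line {line_no}: blacklisted relation {relation}")
    return 1 if hits else 0
```

A wrapper script could open the annotation file, pass its lines to qc_exit_code, and hand the result to sys.exit so that the build fails whenever a blacklisted relation is still in use.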