Annotation Conf. Call November 26, 2013: Difference between revisions

[[Category:Annotation Working Group]]

===GO_Ref for phenotype2GO mapping annotations (Kimberley)===
Here is a draft of a GO_REF for BP annotations inferred from Phenotype Ontology annotations. It documents the curation pipeline and the entities included in the With/From column, e.g. Variation IDs, RNAi experiment IDs, Phenotype Ontology terms, Paper IDs. These annotations will have IEA with this Reference. Please note that IEAs require an ID in the With column, and IEA annotations older than one year will be removed.

GO_REF:0000058

Gene Ontology annotation through association of Phenotype Ontology terms with GO Biological Process terms

WB

Transitive assignment of GO terms based on mappings between Phenotype Ontology terms and GO Biological Process terms. GO terms are obtained from a file, generated manually by WormBase curators, containing mappings to Worm Phenotype Ontology (WPO) terms. Annotations made using this reference may contain one or more of the following in the With/From column (Column 8): 1) A variation or RNAi identifier used by WB to indicate the perturbation; 2) The WPO term; 3) A reference identifier for the publication that reported the phenotype. Mapping files will be made available here: http://www.geneontology.org/external2go/wbphenotype2go.
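The two IEA rules noted in the GO_REF draft (an ID is required in the With/From column, and IEA annotations older than one year are removed) could be checked with something like the sketch below. This is only an illustration: the function name, the 365-day reading of "one year", and the sample identifiers are assumptions, not part of any GO pipeline.

```python
from datetime import date
from typing import List


def iea_problems(with_from: str, annotated_on: date, today: date,
                 max_age_days: int = 365) -> List[str]:
    """Return the reasons an IEA annotation would be flagged (empty list if none).

    with_from    -- contents of the With/From column (Column 8)
    annotated_on -- date recorded for the annotation
    max_age_days -- assumed interpretation of the "one year" cutoff
    """
    problems = []
    if not with_from.strip():
        problems.append("missing With/From ID")
    if (today - annotated_on).days > max_age_days:
        problems.append("older than one year")
    return problems


# Hypothetical With/From value: placeholder IDs joined with '|'.
example_with = "WB:WBVar00000001|WBPhenotype:0000001"
fresh = iea_problems(example_with, date(2013, 6, 1), date(2013, 11, 26))
stale = iea_problems("", date(2012, 1, 1), date(2013, 11, 26))
```

Here `fresh` is empty because both rules are satisfied, while `stale` lists both problems.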
Latest revision as of 16:46, 9 April 2014

Agenda

Present:

SGD: Rama, Edith, Diane

MGI: Judy, Li

Zfin: Doug

WB: Kimberley

EBI: Jane, Rachael, Alex

Pombase: Midori

Panther: Paul

Berkeley: Moni, Chris

RGD: Stan

Dicty: Petra, Pascale

TAIR:

UCL: Ruth

Jenkins email alerts

Any questions on Jenkins email alerts about your GAF files?


Curation question

  • We noticed that some histone kinase terms include the name of the phosphorylated residue in the term name itself (http://amigo.geneontology.org/cgi-bin/amigo/term_details?term=GO:0035174#lineage). Although histones are very well conserved, there are still some differences in sequence, and the serine is not always in the same position in all organisms, but often the context is the same (i.e., it is within one or two residues).

  • Should we request a new term for each specific residue?
  • Should there be a taxon constraint (because, for example, position 14 in H2B may not have a serine in all organisms)?
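
The positional drift described above (the same serine sitting one or two residues away in another organism) can be illustrated with a small lookup. This is a minimal sketch: the function name is an assumption, and the toy sequences are made-up strings, not real histone sequences.

```python
from typing import List


def residues_near(seq: str, position: int, residue: str = "S",
                  window: int = 2) -> List[int]:
    """Return 1-based positions within +/- `window` of `position`
    where `seq` carries `residue`."""
    start = max(1, position - window)
    end = min(len(seq), position + window)
    return [i for i in range(start, end + 1) if seq[i - 1] == residue]


# Toy sequences (hypothetical): a serine at position 14 in one, at 15 in the other.
toy_a = "A" * 13 + "S" + "A" * 4
toy_b = "A" * 14 + "S" + "A" * 3
hits_a = residues_near(toy_a, 14)
hits_b = residues_near(toy_b, 14)
```

A term named after an exact position would fit only the first toy sequence, even though the second has the serine in the same local context.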

Column-16 curation questions

{| class="wikitable" border="1"
!DB (Col 2)
!GO ID (Col 5)
!Evidence code (Col 7)
!Reference (Col 6)
!Extension (Col 16)
|-
|CNB1
|GO:0008597 (calcium-dependent protein S/T phosphatase regulator activity)
|IDA
|PMID:1321337
|has_regulation_target: CNA1, CMP2
|-
|CNA1
|GO:0004723 (calcium-dependent protein S/T phosphatase)
|IDA
|PMID:1321337
|requires_regulator: CNB1
|-
|CMP2
|GO:0004723 (calcium-dependent protein S/T phosphatase)
|IDA
|PMID:1321337
|requires_regulator: CNB1
|}
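
The table gives the extensions in shorthand. In a GAF file, column 16 is written as `relation(ID)` entries, with commas joining entries that apply together and pipes separating independent alternatives. The following sketch splits such a value into relation/target pairs. The parser name and the `SGD:CNA1`-style identifiers are placeholders for illustration, not real database accessions.

```python
import re
from typing import List, Tuple

# One extension entry: relation name, then a single identifier in parentheses.
_ENTRY = re.compile(r"^\s*([A-Za-z_]+)\(([^()]+)\)\s*$")


def parse_extension(col16: str) -> List[List[Tuple[str, str]]]:
    """Split a column-16 value into groups of (relation, target) pairs.

    Pipes separate alternative groups; commas separate entries within a group.
    """
    groups = []
    for alternative in col16.split("|"):
        if not alternative.strip():
            continue
        group = []
        for entry in alternative.split(","):
            match = _ENTRY.match(entry)
            if match is None:
                raise ValueError(f"malformed extension entry: {entry!r}")
            group.append((match.group(1), match.group(2).strip()))
        groups.append(group)
    return groups


# Placeholder identifiers standing in for the CNB1 row above.
cnb1_ext = parse_extension(
    "has_regulation_target(SGD:CNA1),has_regulation_target(SGD:CMP2)"
)
```

Whether the two CNB1 targets belong in one comma-joined group or in separate pipe-delimited groups is a curation decision; the sketch only shows how each form would be read.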


  • RAD51 has increased strand exchange activity in the presence of the RPA complex, i.e. RPA promotes this activity. For the RPA "promoted" part, we can't use the "activated_by" relationship in col 16, because this relationship is only appropriate for chemicals. We could capture this as a forward annotation for RPA: create a new term for positive regulation of strand invasion and annotate RPA to this term with target rad51. The problem is that the subunits would have to be annotated to this term, not the complex as a whole. Can the activated_by relationship be relaxed for use in these situations, or do we have to wait until we can annotate to complexes as objects?

{| class="wikitable" border="1"
!DB (Col 2)
!GO ID (Col 5)
!Evidence code (Col 7)
!Reference (Col 6)
!Extension (Col 16)
|-
|RAD51
|GO:recombinase activity
|IDA
|PMID:8066464
|? (Replication protein A complex serves as an accessory factor or activated_by, but RPA is a complex)
|}

  • Would you add the substrate information in col-16 for a phosphorylation BP annotation if the substrate has already been captured for the kinase activity MF annotation?

Discussion

  • Jenkins: Looks like groups are receiving the Jenkins reports via email and are fixing their annotations. We will bring this up at Annotation calls for the next 4 months just to make sure everybody is on the same page.
  • Histone kinase issue:
    • Creating amino-acid-position-specific terms in this case can lead to incorrect annotations. These terms should be evaluated and removed by the ontology group, leaving the higher-level term.
    • Curators should request PRO IDs for the modified/specific phosphorylated form of the histone and include that ID in col-16 with has_output relation.
  • Calcineurin case: The col-16 annotations shown in the table are okay to have. Make an additional col-16 annotation to indicate that the three subunits are part of the complex, using occurs_in (or part_of) GO:0005955.
    • Chris: one should be able to visualize this data in LEGO to see how these fit.
    • Ruth: For CNB1, the GO term already includes the detail about it being a regulator. In that case, why do we have to say has_regulation_target in column 16? Why can't we just say has_input? Isn't that redundant? When is it appropriate to use has_input? We need more curation examples to sort out when to use has_input.
  • RPA example: Can't be done with the current model. We need to be able to annotate to complexes directly to capture these details.
    • Can the limitation on the relationship 'activated_by' be lifted so it can be used to capture this experiment?
    • Complexes working group is collecting the type of identifiers that groups would use to annotate. This project should be moved up in priority.
  • col-16 for Phosphorylation in BP: if you have captured the substrate in col-16 for a Kinase activity term in MF, do we have to capture it again for phosphorylation in BP?
    • The Jenkins inferencing script can't handle col-16 yet, so yes, capture it in BP as well.
  • TermGenie: Soon there will be a new feature in TermGenie to add Notes. These notes will become public comments, so they can't be crude. A note should be something like "an example of this term can be found in S. cerevisiae for gene X, as shown in PMID with blah evidence." The ontology group will give a demo of this feature when it is ready.