Binding discussion emails

From GO Wiki

Links

2010_GO_camp_binding_documentation_issues

2010_GO_camp_binding_documentation_issues#Agenda_for_discussion_26-05-2010


Using column 16 in conjunction with a catalytic activity annotation

Would it be possible to include a further discussion on column 16 in the agenda for this call next week?

As we discussed at the previous call, the Annotation Extension field (column 16) is intended to capture information to qualify or enhance an annotation. Some of the proposed contents have been outlined by Chris here: [1]

- One suggested value was an identifier indicating the intended in vivo target of an annotation. This would allow curators to further specify a GO term where the term's definition cannot precisely indicate the exact chemical or gene product substrate/product. (We have agreed to advise curators not to make annotations to terms that specifically describe the interaction with substrates/products where the definition of the molecular function term already identifies the exact participants. See: [2].)

Therefore, if a curator would like to capture the information that a protein kinase phosphorylates a specific target:

  • For instance, in PMID:17182001 GSK3beta phosphorylates Sufu to regulate Hedgehog signalling. As Sufu is a substrate of the GSK3beta kinase, under the guidelines we previously agreed we would prefer that curators did not both annotate to the catalytic activity term AND describe the binding to its substrate:
    • GSK3beta GO:0005515 protein binding IPI with=Sufu
    • GSK3beta GO:0004672 protein kinase activity IDA
  • It would seem more informative if the curator could specify, in the molecular function annotation for protein kinase activity, that the substrate of this interaction is Sufu:
    • GSK3beta GO:0004672 protein kinase activity IDA col16=has_input[Sufu]

This would mean not only that the target of GSK3beta's activity could be captured, but also that the binding annotation could be implied or automatically generated (if so desired).
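As a rough illustration of how a binding annotation might be implied from the column 16 value, here is a minimal Python sketch. The `Annotation` class, its field names, and the `implied_binding` helper are hypothetical and not part of any GO tool; the sketch also leaves open which evidence code a generated binding annotation should carry, since that has not been decided in this discussion.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

PROTEIN_BINDING = "GO:0005515"  # protein binding, as used in the example above


@dataclass(frozen=True)
class Annotation:
    """Hypothetical, simplified annotation record (not a full GAF line)."""
    gene_product: str
    go_id: str
    evidence: str
    with_field: str = ""                          # column 8
    extension: Optional[Tuple[str, str]] = None   # column 16 as (relation, target)


def implied_binding(annotation: Annotation) -> Optional[Tuple[str, str, str]]:
    """Return (gene product, binding term, partner) implied by a has_input target.

    Returns None when column 16 is empty or uses a different relation.
    """
    if annotation.extension is None:
        return None
    relation, target = annotation.extension
    if relation != "has_input":
        return None
    return (annotation.gene_product, PROTEIN_BINDING, target)


# The GSK3beta example from the text, with Sufu captured in column 16.
kinase = Annotation("GSK3beta", "GO:0004672", "IDA", extension=("has_input", "Sufu"))
print(implied_binding(kinase))
```

Only the `has_input` relation is treated as implying binding here; whether other relations should do the same is an open question for the group.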

Current annotations such as

    • GSK3beta GO:0004672 protein kinase activity IPI with='Sufu'

...are less informative, as the annotation can only state that the activity of GSK3beta has been inferred from the fact that it binds to Sufu. With this annotation format the curator cannot explicitly identify Sufu as the substrate of the interaction.

- An additional advantage of using column 16 to capture 'in vivo' targets would be that we could provide users with a consistent location to identify all annotation targets. This is in contrast with the multiple uses of the 'with' field (column 8), where users need to reinterpret the value in column 8 depending on the evidence code used. If we want our users to make full use of our data, we should place it somewhere where its meaning is consistent across the whole annotation set.

The proposal from the last call was that, where annotation groups are able to use column 16 in their annotation system, a valid use would be to identify 'in vivo' targets of an interaction. Curators would therefore capture targets of molecular function annotations here.

As the participants for molecular function terms that are children of binding (GO:0005488) are regularly captured in the 'with' field (column 8), it may be possible to add some of these existing interactant identifiers to column 16 directly.

This would mean that the 'with' field (column 8) would display the participant directly used by the investigators in the presented paper, and would continue to support the generation of the IPI-evidenced annotations, whereas column 16 would represent the expected 'in vivo' participant. In many cases the participant identifiers displayed in columns 8 and 16 may be the same, but this may not always be the case, e.g. where an experiment has used proteins from different species with the intent of indicating the in vivo binding interaction (i.e. a demonstration of mouse protein A binding human protein B would be intended to indicate the in vivo situation, in which mouse protein A is expected to bind mouse protein B and human protein A is expected to bind human protein B).
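To make the column 8 versus column 16 distinction concrete, here is a small Python sketch that reports identifiers appearing in only one of the two columns, which is where a cross-species inference is likely to have been made. The function names are hypothetical, the identifiers `human_B` and `mouse_B` are placeholders rather than real accessions, and the use of a pipe to separate multiple values is an assumption about the file format.

```python
from typing import Set, Tuple


def split_ids(field: str) -> Set[str]:
    """Split a column value into identifiers (assumes '|' separates values)."""
    return {part.strip() for part in field.split("|") if part.strip()}


def differing_participants(with_field: str, in_vivo_field: str) -> Tuple[Set[str], Set[str]]:
    """Compare column 8 (participant used in the experiment) with column 16
    (expected in vivo participant).

    Returns (only in column 8, only in column 16); both sets are empty when
    the two columns name the same participants.
    """
    observed = split_ids(with_field)
    expected = split_ids(in_vivo_field)
    return observed - expected, expected - observed


# Mouse protein A shown to bind human protein B; mouse protein B is the
# expected in vivo partner (placeholder identifiers).
only_col8, only_col16 = differing_participants("human_B", "mouse_B")
print(only_col8, only_col16)
```

A non-empty result does not mean the annotation is wrong; it only marks cases where the in vivo participant differs from the one used experimentally.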

Does this sound like something we could discuss now, and then take the decision to the GO camp?

NOTE

Of the 30,000 manual annotations to the GO term GO:0003824 catalytic activity (or its child terms), only 162 annotations use the IPI evidence code.

Use of column 16 to capture in vivo participant

The Annotation Extension field (column 16) is intended to capture information to qualify or enhance an annotation. Some of the proposed contents are outlined here: http://wiki.geneontology.org/index.php/Annotation_Cross_Products

- One suggested value was an identifier indicating the intended in vivo target of an annotation. This would allow curators to further specify a GO term where the term's definition is unable to precisely indicate the exact chemical/gene product target. (We have agreed not to annotate to targets where a molecular function term does identify the exact participants. See: http://wiki.geneontology.org/index.php/Binding_Guidelines)

- This 'in vivo' target in column 16 would provide users with a consistent location for annotation targets. This is in contrast with the multiple uses of the 'with' field (column 8), where users need to reinterpret the value in column 8 depending on the evidence code used. If we want our users to make full use of our data, we need to place it somewhere where its meaning is consistent.

Proposal: where groups can use column 16, a valid use would be to identify 'in vivo' targets of an interaction. Curators would therefore capture targets of molecular function annotations here, in the format:

Examples (shall we find examples for these?):

    • has_participant(protein 1)
    • has_input(protein 2)
    • has_output(protein 3)
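To show how values in this relation(identifier) format could be read by software, here is a minimal Python sketch. The `parse_extension` function is hypothetical, and accepting both round brackets (as in the examples here) and square brackets (as in `has_input[Sufu]` in the earlier email) is an assumption, since the exact syntax has not been fixed in this discussion.

```python
import re
from typing import Tuple

# relation name, then a target in matching round or square brackets
EXTENSION = re.compile(r"^(\w+)(?:\(([^()]+)\)|\[([^\[\]]+)\])$")


def parse_extension(value: str) -> Tuple[str, str]:
    """Split a column 16 value such as 'has_input(protein 2)' into
    (relation, target). Raises ValueError for anything else."""
    match = EXTENSION.match(value.strip())
    if match is None:
        raise ValueError(f"unrecognised column 16 value: {value!r}")
    relation, round_target, square_target = match.groups()
    # Exactly one of the two target groups is set, depending on bracket style.
    return relation, (round_target or square_target).strip()


for example in ("has_participant(protein 1)", "has_input(protein 2)", "has_input[Sufu]"):
    print(parse_extension(example))
```

Mismatched brackets such as `has_input[Sufu)` are rejected rather than guessed at, so malformed entries surface early.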

As the participants for molecular function terms that are children of binding (GO:0005488) are regularly captured in the 'with' field (column 8), it may be possible to add these existing interactants to column 16 directly.

Example:

    • column 8 would display the participant used in the presented results, and act to support the generation of the IPI-evidenced annotations
    • column 16 would represent the 'in vivo' participant

In many cases the participant identifiers displayed in columns 8 and 16 may be the same, but this may not always be the case, e.g. where an experiment has used proteins from different species with the intent of indicating the in vivo binding interaction (i.e. a demonstration of mouse protein A binding human protein B, used to indicate the in vivo situation in which mouse protein A is expected to bind mouse protein B and human protein A is expected to bind human protein B).

If this proposal is agreed by the binding group, then it could be presented to the GO list, and at the camp in June as a valid use of column 16.

A concern with this proposal is that in cross species experiments the 'in vivo' target annotation is inferred by the curator or author based on orthology and does not have direct experimental support.