Binding and column 16

From GO Wiki

Covering the use of column 16 here seems appropriate, although it was not raised in the binding discussion at the GOC. So the last paragraph could include something like: column 16 can be used to increase the detail of a GO annotation. For example, the specific substrate of an enzyme annotated to 'GO:0032451 demethylase activity' can be included in column 16.


Comment from Emily

Perhaps the documentation should make clear that annotating to a term such as peptidase activity and using column 16 to record its specific target is different from making a second annotation to 'protein binding'. Once a format for targets in column 16 comes online, recording the target there seems worthwhile wherever the GO term does not indicate a precise target in its definition. It is not beneficial for users, however, to additionally generate a second annotation for this protein using the term 'protein binding' that again provides the target as the interactor in the 'with' field (column 8). This second annotation doesn't appear to add much information to the first, and it would contravene the agreement to avoid annotating substrates/products to binding terms. That said, perhaps we should avoid as much as possible detailing the precise contents of column 16 on this page, as column 16's format for targets is currently under discussion?

Regarding annotation examples, there are a number of existing annotations from groups that specify in the 'with' field (column 8) the identity of the protein being bound, for instance in annotations to 'protein transporter activity'. These annotations are completely valid, as the 'with' field is providing additional information to support the use of the IPI evidence code. Again, once targets can be included in column 16, I expect these annotations will be updated to include the target protein in both columns 8 and 16. Such annotation examples demonstrate again that it is unnecessary to additionally provide a second annotation using descendants of the GO term 'binding' (GO:0005488).

e.g. PMID:10231363, as annotated by FlyBase for Fbp1:

FB FBgn0000639 Fbp1 GO:0008565 protein transporter activity FB:FBrf0108645|PMID:10231363 IPI FB:FBgn0002562 F gene taxon:7227 20060803 FlyBase

'The uptake of the LSPs requires a specific device that binds to these hexamerins and helps to transport them across the fat body membrane. Such a hexamerin receptor or binding protein has been discovered in the fleshfly, S. peregrina [34], and the blowfly, C. vicina [7, 26]. These proteins display strong sequence similarities to the D. melanogaster FBP-1 [18]. It was tempting to assume that FBP-1 has a similar function. Here we show that indeed FBP-1 is able to bind to LSP-1 in vitro (Fig. 4).'
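
To make the difference between column 8 and column 16 concrete, here is a minimal Python sketch that splits an annotation line into named columns. It assumes the 17-column, tab-separated GAF 2.x layout. The tab-separated Fbp1 line is reconstructed from the FlyBase example above (which also displays the term name, not a GAF column), so the empty qualifier, name and synonym fields are assumptions. `parse_gaf_line` and `split_pipe` are illustrative helper names, not part of any GO tool.

```python
# Column names for the 17-column GAF 2.x layout (assumed here).
# Index 7 is the 'with' field (column 8); index 15 is column 16.
GAF_COLUMNS = [
    "db", "db_object_id", "db_object_symbol", "qualifier", "go_id",
    "reference", "evidence", "with_from", "aspect", "db_object_name",
    "db_object_synonym", "db_object_type", "taxon", "date", "assigned_by",
    "annotation_extension", "gene_product_form_id",
]


def parse_gaf_line(line):
    """Split one tab-separated annotation line into a dict keyed by column name."""
    fields = line.rstrip("\n").split("\t")
    # Pad short lines so that missing trailing columns read as empty strings.
    fields += [""] * (len(GAF_COLUMNS) - len(fields))
    return dict(zip(GAF_COLUMNS, fields))


def split_pipe(value):
    """Split a pipe-separated field into its non-empty entries."""
    return [entry for entry in value.split("|") if entry]


# Reconstruction of the Fbp1 annotation; empty strings are assumed-empty columns.
fbp1_line = "\t".join([
    "FB", "FBgn0000639", "Fbp1", "", "GO:0008565",
    "FB:FBrf0108645|PMID:10231363", "IPI", "FB:FBgn0002562", "F",
    "", "", "gene", "taxon:7227", "20060803", "FlyBase", "", "",
])

record = parse_gaf_line(fbp1_line)
with_ids = split_pipe(record["with_from"])       # interactor supporting IPI
extension = record["annotation_extension"]       # column 16, empty in this line
print(record["go_id"], with_ids, repr(extension))
```

In this line the bound protein appears only in column 8, as support for the IPI evidence code; column 16 is empty because no target format had been agreed.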

As for 'with' contents for enzymatic activities, again I think there are good reasons for groups to want to indicate in column 16 the target of terms such as peptidase activity or kinase/phosphatase activity. It also seems beneficial to be able to link a growth factor receptor with the growth factor it binds, but only in column 16 of the annotation line describing the receptor activity.

e.g. PMID:10830168, as annotated by MGI for FGFR2:

UniProtKB P21802 FGFR2 GO:0005007 fibroblast growth factor receptor activity PMID:10830168 IPI UniProtKB:P09038 taxon:9606 20091005 MGI

If we intend to generate a stable annotation resource describing when to annotate to terms in the 'binding' node, it might be worth starting with some (hopefully!) uncontroversial paragraphs and examples that demonstrate the currently agreed usage of the 'binding' (and descendant) terms. For instance:

  • state that binding annotations, especially those where the GO term does not specify a particular binding partner, should, where possible, indicate the interacting partner in the 'with' column, and describe the usage of pipes etc. in the 'with' column.
  • ask that protein binding annotations be reciprocal: if protein A is annotated as binding protein B, the reverse annotation should also be provided
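
If a reciprocity rule like the one in the second bullet were adopted, it could be checked mechanically. The following Python sketch is only an illustration: the identifiers `UniProtKB:A` to `UniProtKB:D` are placeholders, the helper names are hypothetical, and it assumes that each pipe-separated entry in the 'with' column names an independent binding partner.

```python
PROTEIN_BINDING = "GO:0005515"  # GO term 'protein binding'


def pairs_from_annotations(annotations):
    """Collect (subject, partner) pairs from 'protein binding' annotations.

    Each annotation is a (subject_id, go_id, with_field) tuple. Every
    pipe-separated entry in with_field is treated as one partner (assumption).
    """
    pairs = []
    for subject, go_id, with_field in annotations:
        if go_id != PROTEIN_BINDING:
            continue
        for partner in with_field.split("|"):
            if partner:
                pairs.append((subject, partner))
    return pairs


def missing_reciprocals(binding_pairs):
    """Return the reverse pairs that are absent from the annotation set."""
    seen = set(binding_pairs)
    return sorted({(b, a) for (a, b) in seen if a != b and (b, a) not in seen})


# Placeholder annotations: A and B are reciprocal; C and D are not annotated back.
annotations = [
    ("UniProtKB:A", "GO:0005515", "UniProtKB:B"),
    ("UniProtKB:B", "GO:0005515", "UniProtKB:A"),
    ("UniProtKB:A", "GO:0005515", "UniProtKB:C|UniProtKB:D"),
]

todo = missing_reciprocals(pairs_from_annotations(annotations))
print(todo)
```

Here `todo` lists the annotations that would still have to be made on C and D, each with A in the 'with' column.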

Comment from Ben

The scaling here is incredibly poor. Many kinases and peptidases act on dozens or hundreds of targets. I suppose it's fine as a stop-gap, emergency place to put information, but there needs to be a scalable plan for this type of data if it is to be incorporated in the GO database. Currently there is not even a plan to store column 16 in the database. In fact, as column 16 is currently constructed, there are no rules for what the various ontology IDs it references refer to. That has to be guessed from the ontology type.

e.g. PMID:10830168, as annotated by MGI for FGFR2:

UniProtKB P21802 FGFR2 GO:0005007 fibroblast growth factor receptor activity PMID:10830168 IPI UniProtKB:P09038 taxon:9606 20091005 MGI

This annotation is bizarre. FGFR2 has growth factor receptor activity "Inferred from Physical Interaction with P09038 (FGF2)"? Wouldn't that be IDA? (I read the abstract; it's a crystal structure of the complex. This is one of those weird cases where "everyone knows" they are a receptor-ligand pair; the other 4 annotations to FGFR2 are all "NAS".)

And furthermore, this term is a "Gene product term" and should probably be changed because it has "fibroblast growth factor" in the term name. Seems like the "receptor" graph is just as messed up as the peptidase graph.

It should probably just be (for molecular function) "growth factor receptor". For example, FGF2 has the MF GO:0008083 "growth factor activity", not "fibroblast growth factor activity" (which would be specific to a related class of gene products).

The "FGF"-ness of the protein/receptor complex should be in the biological process tree.

So, I wouldn't use this as an example of anything.

If we intend to generate a stable annotation resource describing when to annotate to terms in the 'binding' node, it might be worth starting with some (hopefully!) uncontroversial paragraphs and examples that demonstrate the currently agreed usage of the 'binding' (and descendant) terms. For instance:

  • state that binding annotations, especially those where the GO term does not specify a particular binding partner, should, where possible, indicate the interacting partner in the 'with' column, and describe the usage of pipes etc. in the 'with' column.
  • ask that protein binding annotations be reciprocal: if protein A is annotated as binding protein B, the reverse annotation should also be provided

Sadly, you will never get consensus on this for protein binding. That is why such guidelines were not included in the last round.

SGD and MGI, two of the largest contributors and co-PIs, absolutely and fundamentally disagree on whether these are valid annotations. So any change is going to have to come by diktat.