Binding Terms Survey Comments

From GO Wiki
Revision as of 01:16, 8 June 2009 by Ruth.lovering (talk | contribs) (Question 6)

Question 1

The GO consortium should remove all annotations which describe catalytic domain substrate binding, associated with catalytic activities or transport proteins, and no longer annotate these interactions.

Comments:

  1. Terms should be annotated to binding terms only if there was an experiment that demonstrated binding. There should not be coannotation for substrate binding if an experiment demonstrates catalytic activity. Instead the catalytic annotation should be made.
  2. I disagree, but one should only annotate to binding if binding is directly demonstrated.
  3. I disagree with the proposal. If a paper has an experiment for a kinase that demonstrates ATP binding, then we should annotate it. There must be an experiment since the annotation would be an IDA. What if an unknown protein is shown to bind DNA; then 5 years later, it is shown to have helicase activity. It would be counterproductive to then have to remove the first annotation.
  4. Sympathetic to the proposal but still not completely decided. On the wiki page the question was asked: "Are there cases where a user would want to search for gene products that bind a particular substrate but have different molecular functions?" One answer to this springs to mind, the ubiquitin-conjugating E2 enzymes. Many of these proteins have lost the catalytic function yet nevertheless bind ubiquitin. For instance, hMMS2/UBE2V2 (catalytically inactive) forms a heterodimer with UBC13 (catalytically active). hMMS2/UBE2V2 binds the donor ubiquitin and orients the ubiquitin lys-63 toward the acceptor site in UBC13. In this case we would have to annotate one member of the heterodimer as ubiquitin-binding and the other one as not doing so, as this subunit is also catalytically active. While I think you can justify this, many users would probably find this confusing. Another example is the argonaute proteins, EIF2C1/AGO1, EIF2C2/AGO2, EIF2C3/AGO3 and EIF2C4/AGO4. All 4 of these proteins bind to an siRNA or miRNA, which serves as a guide for complementary mRNAs. In the case of AGO1, 3 and 4, the bound complementary mRNA is subject to translational repression. Therefore for these 3 proteins you would annotate to mRNA binding as the mRNA is not really a substrate and they lack any kind of catalytic activity. AGO2 on the other hand does have a catalytic activity but also acts like AGO1/3/4. For AGO2 the mRNA is either subject to translational repression (when the miRNA exhibits imperfect complementarity), or subject to endonucleolytic cleavage by AGO2 (generally when the complementarity is perfect - see GO:0070551). So AGO2 could be annotated to mRNA binding but only for papers that demonstrated translational repression, not cleavage. Again I think users would probably find this confusing.
  5. I agree that if there is evidence supporting a catalytic activity then related binding annotations are not useful. However, there are many cases where for example a suspected GTPase has been demonstrated to bind GTP, but has not been shown to catalyze the hydrolysis, in which case it becomes GTP binding or nothing. So I would say my vote is a qualified yes, it's a good idea but what would we do in cases like this?
  6. I like Jim's suggestion of retaining high level binding terms. Realistically, we are not going to have time to re-evaluate every binding annotation so by moving things to a generic binding term at least we don't lose the functional info and the retrofit is feasible. However, given that we are not set up for col 16 due to database constraints and also our curators are likely to resist the additional burden of having to figure out the appropriate ChEBI IDs, I suspect there will be a loss of info in terms of what the interacting molecules are for new annotations.
  7. This entire survey is too binary - in principle, I agree we need to limit the scope of the "binding" terms but keep in mind that there are biological functional "binding" functions.
  8. If I have evidence for GTPase activity, I would annotate it to just that term. Dual annotation doesn't make sense. But if I have evidence, say based on some domain comparison, that the gp has an ATP binding domain, then I would annotate to just ATP binding with ISS.
  9. Annotate at the level of what function is shown in the experiment. If the experiment shows GTPase activity, then annotate to GTPase activity. If it shows GTP binding, then annotate to GTP binding.

Question 2

Change the definition of 'x binding' terms to explicitly exclude catalytic domain substrate binding.

Comments:

  1. For all the questions involving x binding redefinition, I am opposed to precomposed terms of the type "x binding"
  2. Do not change the definition, change annotation practice.
  3. When I read a paper and they show that it binds GTP, then I would like to be able to curate that fact, especially when I have no other molecular function information to give. The author is unlikely to make any unequivocal statement stating that it binds GTP but definitely doesn't hydrolyze it. And, if I have to go read other papers that do separate experiments which show that it is "NOT" hydrolyzed, can I really use the original reference as my sole source for that annotation? It seems that I have to draw an inference from multiple experiments to know that it does A (binds GTP), but definitely doesn't do B (use it as a substrate). I like to try to limit my annotations to what is directly shown by a particular experiment when possible.
  4. Again, binding should only be annotated if it is directly demonstrated.
  5. It would be an ENORMOUS amount of work to go back through annotations to decide whether the term was used 'correctly' or not
  6. I am not sure removing all of the "incorrect" annotations is necessary. I view it as "redundant information", which should be eliminated because it doesn't add information.
  7. Annotation of such binding should only be made when the experiment specifically shows such binding, when the point of the experiment is to prove such binding, not inferred by the annotator because of the presence or requirement of a cofactor or reagent in the reaction mix.
  8. In many older biochemistry papers, binding was used as an indicator that the protein might have catalytic activity on the binding partner. Ex: one step in the purification of a helicase might be a DNA column, but other proteins that bind to the column might not have helicase activity. It would still be useful to capture the DNA binding activity of these proteins even if they are subsequently shown to have catalytic activity on DNA. Also, it's useful to have a sense of history in GO, i.e. what was known when, so that we can see how partial information (DNA binding) might develop into a clearer picture (topoisomerase).
  9. I agree with all of the cons; I reject all of the pros.
  10. I don't see why this applies: "In future curators would not be able to annotate proteins for which the only information presented is that they bind 'x' (with no indication of the context of this binding - is it a substrate/cofactor/something else?)." If they bind 'x' then they could be annotated as such. In the future if 'x' is discovered to be a substrate, the previous annotation would have to be removed.
  11. I don't think it's possible to do annotation if one annotation (x binding) depends on another annotation (here, catalytic activity).
  12. This seems to me to be more in line with the intentions of GO and I am not sure that losing these types of annotation is really a bad thing in most cases. However there are perhaps specific cases where this could be very confusing for users, as described in the comments on point 1. On the other hand, if the only information available for a protein is that it binds zinc, or a lipid etc, then that is not much information at all and the significance of a paper that publishes only these kinds of results is perhaps questionable. If you have 2 papers and paper 1 shows cholesterol binding and paper 2 demonstrates a role for the protein as a cholesterol transport protein, I would annotate paper 2 and leave paper 1 aside. See also point 3 below.
  13. On the condition that there were terms for non-substrates (ie cofactors, allosteric regulators, etc).
  14. See above. Same concerns if this is all that has been experimentally determined.
  15. I don't think this will work. While we all know that we should read definitions before annotating, if you leave the terms in without making this change explicit in the name, curators will keep using them incorrectly. Also, what do you do if it isn't clear whether it is a substrate or not? If you don't annotate you lose the 'binding function' info - if you do annotate it may turn out later to be wrong. GO annotation is only one aspect of the data that is captured from a paper - curators simply don't have time to revise old data in the light of new info during routine curation. Once an annotation is made it is likely to stay there for a long time.
  16. It is often not possible to make the distinction, e.g. because there is insufficient information. It would be paradoxical to annotate "GTP-binding" only when GTP has a regulatory role. Plus, it would be very difficult to implement.
  17. Given your definition, it doesn't seem like the con "curators would not be able to annotate proteins with limited information" is a valid point, as it would only apply to proteins that contain catalytic domains (which wouldn't fall into the "all we know is that it binds protein" category).
  18. I don't think I understand the con part! If the only evidence you have is that gpX has an ATP binding domain, then that is all the annotation you can make. A gp can bind to ATP for so many reasons, not always for hydrolysis.
  19. I think that if curators annotate a catalytic substrate to binding that is OK, but I think the practice should be discouraged. I want to slow down the proliferation of binding terms for every possible substrate in the biological realm.
  20. What if 'catalytic domain substrate' binding is what the assay demonstrates? We'd want to capture that.

Question 3

Change the definition of the 'x binding' terms to explicitly exclude catalytic domain substrate binding AND make grouping terms for the activity molecular function terms to indicate the type of substrate being chemically changed (e.g. new GO term: 'catalytic activity; ATP hydrolysing').

Comments:

  1. How is this different from ATPase activity?
  2. I'm sorry that I'm a bit confused by this. Is this indicating that ATP is being used to drive another reaction? Am I supposed to annotate to this term only if they prove that ATP has been hydrolyzed? If so, it seems like I still would not be able to capture a lot of "incomplete" information when this has not been tested.
  3. You say "in addition to the benefits associated with proposal 1..." - there were no benefits listed for proposal 1.
  4. I agree with the cons
  5. I agree with this suggestion, assuming it is possible
  6. I am in general in favor of trying to keep track of ATP utilizing enzymes, but am unsure of the best way to do it, nor am I sure that GO is the best place for this.
  7. The argument proposed by the "cons" seems to me to be applicable to the actual functions (such as enzymatic activities) themselves, irrespective of whether or not you specify the cofactor. When defining enzymatic activities you obviously have to specify (at least in general terms) what the substrate is; this in spite of the clear caveat that all possible substrates have not been exhaustively tested. The same logic could be applied to cofactors: an enzyme could employ either ATP, GTP, both, or indeed other NTPs. Therefore, the use of a hierarchical classification of enzymatic activities or molecular functions should naturally allow the annotator to apply his or her own judgement as to what the appropriate level of specificity should be for the definition of both substrate and cofactor.
  8. I think this would make the GO terms expand too much... what if the assay was performed with ATP-gamma-S? Would there be a GO term for every substrate tested? This would get out of hand, would it not?
  9. Not sure how this solves the problem mentioned in comments to question one. That is, one could not annotate to 'catalytic activity; ATP hydrolyzing' if the protein binds ATP but has not been demonstrated to hydrolyze it. How will this help?
  10. difficult to implement.
  11. Annotate what is shown experimentally only. I agree with the 'con' statement.

Question 4

Annotate to 'x binding' terms only when a gene product is found either to bind 'x' and not alter it (e.g. as a cofactor) AND when the only information available for a gene product is that they bind 'x'.

Comments:

  1. Annotate based on the experiment, not what happens to the bound substance.
  2. Do we have to invalidate the old curation once we do find that it hydrolyzes ATP? This was mentioned in the working group document. In a sense, if someone did a search for ATP-binding, what they would really be getting are the "ATP-binding, not further tested or curated + ATP-non-hydrolyzing" group and that doesn't seem ideal.
  3. Annotate to 'x binding' terms only when a gene product is found to bind 'x'. Period. Inclusion of 'either' with 'AND' in the question is confusing. I think 'OR' is intended.
  4. As before, annotation of such binding should only be made when the experiment specifically shows such binding, when the point of the experiment is to prove such binding, not inferred by the annotator because of the presence or requirement of a cofactor or reagent in the reaction mix.
  5. This makes sense: define binding as not including known catalytic activity, and define catalytic activity as including binding. To clean up the old annotations, just search for papers that annotate the same protein to both terms.
  6. If you are going down the road of removing substrate binding, then I wonder if perhaps it might not be better to remove cofactors too. Cofactor binding can't be thought of as a molecular function, and in many cases the cofactor requirements are investigated with perhaps less rigour than the substrate specificity. If you did this then you would be reliant on an external classification that stores such information, such as the EC classification.
  7. Again, if we exclude substrate binding, GO terms need to be available for the cofactor or allosteric regulator, but it is not necessary (in my opinion) to further define the exact chemical. Info about the specific type of cofactor or regulator could be put into another column (i.e. isn't column 16 for something like this?).
  8. Seems like this would solve my concerns. If a GTPase has been shown to hydrolyze GTP, then we don't need to annotate to GTP binding; however, we would if all that is known is that it binds GTP. We would just need to educate our users that GTP binding is implicit in annotation to GTPase activity. As long as the tree is accurate, then all gene products that bind GTP can be identified.
  9. difficult to implement: often we do not have enough information (now), but a few months later, everything is different: this would be impossible to implement, and confusing for users.
  10. I think that the only time you should annotate "x binding" terms is when the only function is "x binding" or when no other information is available. I don't think it is worthwhile to make an exception for non-catalytic binding.
  11. Too restrictive.
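
The clean-up idea in comment 5 (find papers that annotate the same protein to both a catalytic term and the matching binding term) could be sketched roughly as follows. This is only an illustration: the annotation tuples, the function name, and the `IMPLIED_BINDING` mapping are assumptions made for the example, not an existing GO tool or file format.

```python
from collections import defaultdict

# Hypothetical mapping from a catalytic term to the binding term it would make
# redundant. Only one illustrative pair is listed here.
IMPLIED_BINDING = {
    "GO:0003924": "GO:0005525",  # GTPase activity -> GTP binding
}


def redundant_binding_annotations(annotations):
    """Return binding annotations co-annotated with their catalytic term.

    `annotations` is an iterable of (gene_product, go_id, reference) tuples.
    A binding annotation is reported only when the same gene product is
    annotated to the catalytic term from the same reference.
    """
    terms_by_key = defaultdict(set)
    for gene_product, go_id, reference in annotations:
        terms_by_key[(gene_product, reference)].add(go_id)

    redundant = []
    for (gene_product, reference), terms in sorted(terms_by_key.items()):
        for catalytic, binding in IMPLIED_BINDING.items():
            if catalytic in terms and binding in terms:
                redundant.append((gene_product, binding, reference))
    return redundant


# Made-up example data: geneA has both terms from one paper, geneB only binding.
example = [
    ("geneA", "GO:0003924", "PMID:1"),
    ("geneA", "GO:0005525", "PMID:1"),
    ("geneB", "GO:0005525", "PMID:2"),
]
print(redundant_binding_annotations(example))
```

Note that a binding annotation with no catalytic counterpart (geneB above) is left alone, which matches the concern raised in several comments that "binds GTP" may be all that is known.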

Question 5

Create two 'x' binding terms: those describing substrate binding interactions and those describing cofactor binding interactions.

Comments:

  1. I am very naive and I apologize for that. Let's say a transporter binds and hydrolyzes ATP to pump a proton. Although, technically, ATP is certainly a substrate of that transporter, is that what most users would think of as a "substrate" of a transporter? In principle, I like this distinction, but again, I worry that most authors, when showing that protein X binds Y will not tell us whether Y is acting as cofactor or a substrate. In some cases, it might be easy to infer, but, not in all cases. And how would this affect IEA annotations? Do these state whether the bound molecule acts as a cofactor or a substrate?
  2. I don't think this is necessary.
  3. Presumably there would be a parent term, e.g. ATP binding, in order for this to work. This would be my second choice if something HAS to change.
  4. Who needs to be told that GTPase binds GTP?
  5. Just annotate to what the experiment shows.
  6. I only agree with this solution if there is a common 'binding' parent, e.g. 'ATP binding' (for when you don't know what sort of binding is taking place), with children 'ATP binding, ATP as substrate' and 'ATP binding, ATP as cofactor'. I don't see expansion of terms as a problem.
  7. I think this is too complex. Agree with the "cons" that GO is probably not the best place for such information.
  8. Quite like this idea but not the idea of retrofitting our annotations - ok if it can all migrate to a common parent for now.
  9. The same molecule can sometimes be a co-substrate, and at other times a cofactor that is not changed during the reaction. Plus, there are all those enzymes where the mechanism is not known. This distinction would be difficult to implement.
  10. Again, what I would like to avoid is having a specific GO term for every molecule that a gene product binds as a substrate or cofactor. I think there should be one new term for regulated by a small molecule allosteric effector and then put the CHEBI code for the molecule in column 16.
  11. Can the specific substrate/cofactor not be captured in column 16 with a more generic term (any of the direct children of 'binding') as the GO id?

Question 6

Create a relationship in the ontology such that if an annotation is made to a catalyst term then we also know that the gene product is annotated to a binding term for the substrates e.g. add a new relationship: 'GTP binding' involved in 'GTPase activity'.

Comments:

  1. Sorry - I'm confused.
  2. I prefer not to make implied annotations.
  3. Better to do it by definitions. See comments to #4 above.
  4. This is complicated, and I am not sure I understand the argument of the cons here. ATP-binding would have a mixed bag of associations - presumably many activities would "point" to ATP-binding - but why is that a problem as it reflects biological reality?
  5. This choice is made in the opinion that we go all the way and try to solve the 'binding' issue however best we can or else leave things the way they are.
  6. The usage notes in AmiGO, GONUTS or OboEdit could take care of this... and make it clearer to any annotator that this is implicit.
  7. Seems like this should be implicit in the catalysis term 'GTPase activity'. As such, as long as the tree is structured correctly, seems obvious that a GTPase has to bind GTP to hydrolyze it.
  8. In the case of implied annotation, a new evidence code should be used to indicate the relationship/dependency (something similar to IC (inferred by curator) that could be called IR 'inferred by relationship'). Conversely, an explicit annotation would have a 'better' evidence code.
  9. Yes, this sounds like a good idea, except: how to deal with big complexes that have catalytic, regulatory and structural components, e.g. F0/F1-ATPase. One should not add the term "ATP-binding" to all subunits, but only to those that really bind ATP. This is what is done in UniProt, with regards to the Keyword "ATP-binding".
  10. What about GTP binding that is NOT involved in GTPase activity? This would create a true path violation.
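
To make the proposal and comment 8 concrete, here is a minimal sketch of how implied binding annotations might be derived from catalytic ones. Everything in it is hypothetical: the `IMPLIES_BINDING` table stands in for the proposed relationship, "IR" is the evidence code suggested in comment 8 (not an existing GO code), and the tuples are made-up data.

```python
# Hypothetical relationship table: catalytic term -> binding term it implies.
IMPLIES_BINDING = {
    "GO:0003924": "GO:0005525",  # GTPase activity -> GTP binding
}


def add_implied_binding(annotations):
    """Return the annotations plus implied binding annotations.

    `annotations` is a list of (gene_product, go_id, evidence_code) tuples.
    An implied annotation gets the hypothetical evidence code "IR" and is
    added only if no annotation to that binding term exists already, so an
    explicit annotation keeps its own evidence code.
    """
    seen = {(gene_product, go_id) for gene_product, go_id, _ in annotations}
    result = list(annotations)
    for gene_product, go_id, _ in annotations:
        binding = IMPLIES_BINDING.get(go_id)
        if binding is not None and (gene_product, binding) not in seen:
            result.append((gene_product, binding, "IR"))
            seen.add((gene_product, binding))
    return result


# Made-up example: geneA is only annotated to GTPase activity, while geneB
# already has an explicit GTP binding annotation.
example = [
    ("geneA", "GO:0003924", "IDA"),
    ("geneB", "GO:0003924", "IDA"),
    ("geneB", "GO:0005525", "IDA"),
]
for row in add_implied_binding(example):
    print(row)
```

The inference runs one way only (catalysis implies binding, never the reverse), so GTP binding that is not involved in GTPase activity is untouched. It does not address comment 9: applied per gene product, it would still mark every subunit annotated to the catalytic term, whether or not that subunit is the one that binds the substrate.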

Question 7

Do nothing. Allow curators to use the existing terms that describe 'x binding' and accept that the resulting annotations will not indicate to users whether the gene product binds the molecule as a substrate or cofactor.

Comments:

  1. I am working on extended thoughts, which will be at http://wiki.geneontology.org/index.php/User:JimHu/Binding_terms
  2. Please be aware that I am answering this survey based on gut reactions with very few examples at hand and a very limited knowledge of substrate binding matters.
  3. It is not always straightforward to decide if the molecule is a substrate or cofactor, the author may not give all this information - we would lose a lot of information if we did not annotate these at all. Has the user community been asked about this - I can't see that they would be too perturbed to see that an ATPase had been annotated to ATP binding.
  4. If GTP binding is documented, it should be annotated. That might lead someone to test whether GTP is a substrate or cofactor. And anyone interested in "in vitro kinetics" or testing or assays would certainly like that information, even if it is only IEA.
  5. Annotators should annotate the specific data in a paper and resist the temptation to annotate binding events which are not specifically shown in the paper.
  6. not sure, I could be persuaded to accept this solution
  7. Reading through this it seems like there's a danger the solution might be worse than the problem!