Improving protein binding annotations using InterPro domains: Difference between revisions

From GO Wiki

Revision as of 06:01, 26 June 2012

Current Situation

  • Curators are faced with multiple choices of GO terms under 'GO:0005515; protein binding', which has over 900 child terms that describe different aspects of the interaction.

These include descriptions of:

  1. the protein class/family of the interactor: TBP-class protein binding
  2. the role/activity of the interactor: kinase binding
  3. the dependencies of the interaction: copper-dependent protein binding
  4. the state of the interacting protein: phosphoprotein binding
  5. the domain being bound in the interactor: MADS box domain binding
  6. the function the interaction contributes towards: sterol regulatory element binding protein import into nucleus involved in sterol depletion response
  • Having so many ways to describe an interaction means the same interaction can be annotated in many different ways, which makes it less likely that curators will annotate consistently and comprehensively. However, different curators feel strongly about the usefulness of different, diverse terms.

Moving forward

  • Many curators would like to keep more descriptive terms under protein binding that describe roles/activities (e.g. London 2011 GOC meeting)
  • Ideally, curators would be able to annotate to a protein binding term that indicates its functional relevance, e.g. 'protein binding involved in heterotypic cell-cell adhesion'
  • Perhaps the second-best option would be to indicate the type of protein being bound, which provides more information to users. This might also help curators search for the pieces of information that would enable them to annotate to 'protein binding involved in BP X', and help users infer this possibility when the evidence is not strong enough for it to be included directly in the annotation.


Example:


<<include!>>>

UniProt-GOA student project to improve protein binding: Marijn Berg.

Question:

Can the information we have on the identity of interactors be used to help curators decide which GO term under 'GO:0005515; protein binding' to use, and could it be used to improve current annotation web displays?

Work Carried out:

Using InterPro family groupings and the GO annotations attached to the protein interactors to supply curators with more specific GO term suggestions.
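As a rough illustration of the idea, the sketch below looks up more specific binding terms from the InterPro families of an interactor. It is not the project's actual method: the function name, the mapping and the InterPro identifiers are all hypothetical placeholders, and only the term names are taken from this page.

```python
# Hypothetical mapping from an InterPro family ID to a child term of
# 'protein binding'. The IDs are placeholders, not real InterPro entries.
FAMILY_TO_BINDING_TERM = {
    "IPR_EXAMPLE_KINASE": "kinase binding",
    "IPR_EXAMPLE_PROTEASE": "protease binding",
}


def suggest_binding_terms(interactor_families, mapping=FAMILY_TO_BINDING_TERM):
    """Return sorted, de-duplicated term suggestions for one interactor.

    Families without a mapping are ignored, so an empty list means the
    curator falls back to plain 'protein binding'.
    """
    return sorted({mapping[f] for f in interactor_families if f in mapping})


if __name__ == "__main__":
    # An interactor belonging to one mapped and one unmapped family.
    print(suggest_binding_terms(["IPR_EXAMPLE_KINASE", "IPR_UNMAPPED"]))
```

Returning a list rather than a single term reflects that an interactor may belong to several families and so receive more than one suggestion.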


!! number of mappings

Data to become available

Shortly to become available as a two-column file from QuickGO
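Since the file format has not been published yet, the following is only a sketch of how such a two-column file might be read. It assumes tab-separated columns, with an identifier (e.g. an InterPro entry) in the first and a suggested GO term in the second, and it assumes that lines starting with "!" are comments; none of this is confirmed for the QuickGO file. The function name and sample rows are hypothetical.

```python
import csv
import io


def read_two_column_mapping(handle):
    """Read an assumed tab-separated two-column file into {key: [terms]}."""
    mapping = {}
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and (assumed) '!' comment lines.
        if not row or row[0].startswith("!"):
            continue
        if len(row) != 2:
            raise ValueError(f"expected 2 columns, got {len(row)}: {row!r}")
        key, term = row[0].strip(), row[1].strip()
        # One key may map to several terms, so collect them in a list.
        mapping.setdefault(key, []).append(term)
    return mapping


if __name__ == "__main__":
    # Hypothetical sample content; not an excerpt of the real file.
    sample = io.StringIO(
        "! hypothetical example\n"
        "IPR_EXAMPLE_KINASE\tkinase binding\n"
        "IPR_EXAMPLE_PROTEASE\tprotease binding\n"
    )
    print(read_two_column_mapping(sample))
```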

<!! show excerpt? >

Use by UniProt-GOA

- To be included initially as a curator suggestion in Protein2GO.

- The decisions that curators make (whether to use the suggestion, improve upon it or reject it) will be captured and assessed.



- After 6 months, or once sufficient data has been captured, an analysis of the data will determine whether the file can be used to automatically improve the GO term attached to existing protein binding annotations (e.g. from IntAct, where there are ~18,000 high-quality interactions which only apply 'protein binding' or 'protein self binding' GO terms).

- Possibly the first type of positive annotation suggestion for curators to be included in Protein2GO; the suggestions offered should be high-quality, but need not yet reach the correctness of a production IEA method.
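One way the captured curator decisions could be assessed is sketched below: for each suggested term, compute the fraction of suggestions that curators used unchanged. The record layout, the three decision labels and the function name are assumptions made for illustration; how Protein2GO will actually store these decisions is not described here.

```python
from collections import defaultdict

# Assumed labels for the three curator outcomes described above.
VALID_DECISIONS = {"used", "improved", "rejected"}


def acceptance_by_term(records):
    """Return {suggested term: fraction of suggestions used unchanged}.

    `records` is an iterable of (suggested_term, decision) pairs.
    """
    totals = defaultdict(int)
    used = defaultdict(int)
    for term, decision in records:
        if decision not in VALID_DECISIONS:
            raise ValueError(f"unknown decision: {decision!r}")
        totals[term] += 1
        if decision == "used":
            used[term] += 1
    return {term: used[term] / totals[term] for term in totals}


if __name__ == "__main__":
    # Hypothetical decisions, for illustration only.
    log = [
        ("kinase binding", "used"),
        ("kinase binding", "rejected"),
        ("protease binding", "improved"),
    ]
    print(acceptance_by_term(log))
```

A per-term rate, rather than one overall figure, would make it possible to automate only those mappings that curators rarely change or reject.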

Identified issues

  • There are some very descriptive binding terms in some areas, whereas in others little information is available, e.g. there is a 'protease binding' term but no equivalent for oxidoreductase activity.
  • There will be cases where more than one term is suggested. In some cases it will be reasonable to capture both in annotations; in others, this could lead to a discussion as to the desirability of certain terms, e.g. glycoprotein binding.



Going Forward

1. Use these suggestions

2. Improvements to terms

3. GO to GO?

4. InterPro domain binding - add into c.16?




Other ways to improve the display of Protein binding annotations

  • There has been discussion on improving the web display of GO annotations to include the protein name of the interactor, rather than the gene/gene product identifier. However, a large number of proteins have an obscure name, and users would not be able to use the hierarchy structure of GO.
  • LEGO might be able to give us more of a network view, so that users can easily move from viewing the activities of one protein to those of interacting ones. However, it will take time for LEGO to be formalized and for a curation tool and a display to be made available. GO annotations will continue to be displayed in a list format for individual proteins on many database web pages for years to come. Marijn's project is looking to easily improve the information content for users _now_.