InterPro2GO Session October 4th 2011

Revision as of 08:29, 23 September 2011

A face-to-face meeting at the EBI between GOA curators, GO editors and the InterPro curation team, to go through the InterPro2GO mapping process, problematic mappings, and the relationships between GO terms and InterPro domains, among other topics.


Agenda

  1. Jane to give an overview of the multi-organism process node (GO:0051704) in GO, and how to use the terms for annotation
  2. An overview of the Relation Ontology
  3. Problematic areas of InterPro to cover:
    1. When to use particular membrane-associated component mappings (see the recent PAINT paper)
    2. Use of the protein-binding term: 'protein binding ; GO:0005515' and 'binding ; GO:0005488' should only be used for annotation when an identifier is present in the 'with' column (cases where the identifier is absent are stripped out of the GOA files). Are there more specific terms InterPro can use instead?
    3. How to GO map proteins that form complexes (the Relation Ontology might help here)
    4. GO mapping proteins that have different functions according to the component they're present in.
    5. Component mappings in general (i.e., should we map terms only for proteins that are *only* found in a particular location, or also for proteins that have been observed in that location at some stage?)
    6. Clarification on how GOA use the 'NOT' qualifier - are there implications that we need to be aware of in InterPro?
    7. The idea of using blacklists to prevent erroneous mappings to sequences based on InterPro matches
      1. Relating to blacklists, revisit the protein kinase catalytic domain entry (IPR000719), which maps the terms GO:0006468 protein phosphorylation, GO:0004672 protein kinase activity and GO:0005524 ATP binding to ~100K sequences in UniProt. However, among these are members of the tribbles family, which are pseudokinases. Are there sensible ways to handle this kind of situation without sacrificing large numbers of true positive mappings?
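
To make the blacklist idea in the last agenda item concrete, here is a minimal Python sketch of how it might work. It is an illustration for discussion only, not a description of how InterPro or GOA actually propagate terms. The IPR000719 mapping is the one quoted above. The protein accessions (`KINASE_EXAMPLE`, `TRIB_EXAMPLE`), the `propagate` function and the choice of blacklisting (protein, InterPro entry) pairs are all assumptions.

```python
from typing import Dict, Iterable, List, Set, Tuple

# InterPro entry -> GO terms, as quoted in the agenda item above.
INTERPRO2GO: Dict[str, List[str]] = {
    "IPR000719": ["GO:0006468", "GO:0004672", "GO:0005524"],
}

# Hypothetical blacklist of (protein accession, InterPro entry) pairs.
# "TRIB_EXAMPLE" is a placeholder standing in for a tribbles pseudokinase,
# not a real UniProt accession.
BLACKLIST: Set[Tuple[str, str]] = {("TRIB_EXAMPLE", "IPR000719")}


def propagate(
    matches: Iterable[Tuple[str, str]],
    interpro2go: Dict[str, List[str]],
    blacklist: Set[Tuple[str, str]],
) -> List[Tuple[str, str]]:
    """Return (protein, GO term) pairs for every non-blacklisted match."""
    annotations: List[Tuple[str, str]] = []
    for accession, entry in matches:
        # Skip only the specific protein/entry pair, so every other
        # sequence matching the same entry keeps its mappings.
        if (accession, entry) in blacklist:
            continue
        for go_id in interpro2go.get(entry, []):
            annotations.append((accession, go_id))
    return annotations


# Two placeholder sequences that both match the kinase domain entry.
example_matches = [
    ("KINASE_EXAMPLE", "IPR000719"),
    ("TRIB_EXAMPLE", "IPR000719"),
]
example_annotations = propagate(example_matches, INTERPRO2GO, BLACKLIST)
```

Keying the blacklist on the protein/entry pair, rather than removing the GO terms from the entry itself, is what would preserve the large number of true positive mappings. Whether the list would be maintained per protein, per family or per taxon is left open here.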

Problematic InterPro Mappings

  • IPR000402. ATPase and ATP metabolism terms.


Minutes

Present:

  • Jane Lomax (GO)
  • Rebecca Foulger (GO)
  • Emily Dimmer (GOA)
  • Alex Mitchell (InterPro curation co-ordinator)


Useful Links