InterPro2GO Session October 4th 2011
Revision as of 12:23, 3 October 2011
A face-to-face meeting at the EBI between GOA curators, GO editors and the InterPro curation team, to go through the InterPro2GO mapping process, problematic mappings, relationships between GO terms and InterPro domains, and related topics.
Agenda
- InterPro to give an overview of the InterPro2GO mapping procedure.
- Jane to give an overview of the multi-organism process node (GO:0051704) in GO, and how to use the terms for annotation.
- Jane to give an overview of the relations that are being developed for GO annotations, and how they'll be used, including the membrane terms.
- Problematic areas of InterPro to cover:
  - When to use particular membrane-associated component mappings (see the recent PAINT paper).
  - Use of the protein-binding term. 'protein binding ; GO:0005515' and 'binding ; GO:0005488' should only be used for annotation when an identifier is present in the 'with' column (annotations where the identifier is absent are stripped out of the GOA files). Are there more specific terms InterPro can use instead?
  - How to GO map proteins that form complexes (the relationship ontology might help here).
  - GO mapping proteins that have different functions according to the component they're present in.
  - Component mappings in general (i.e., should we be mapping terms based on proteins that are *only* found in a particular location, or do we also map proteins that have been observed in that location at some stage?).
  - Clarification on how GOA use the 'NOT' qualifier: are there implications that we need to be aware of in InterPro?
  - The idea of using blacklists to prevent erroneous mappings to sequences based on InterPro matches.
  - Relating to blacklists, revisit the protein kinase catalytic domain entry (IPR000719), which maps the terms GO:0006468 protein phosphorylation, GO:0004672 protein kinase activity and GO:0005524 ATP binding to ~100K sequences in UniProt. Among these, however, are members of the tribbles family, which are pseudokinases. Are there sensible ways to handle this kind of situation without sacrificing large numbers of true positive mappings?
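As a talking point for the blacklist item, here is a minimal Python sketch of how a blacklist could suppress mappings for individual sequences while leaving the entry-level mapping intact. It is only an illustration under stated assumptions: the function name, the data layout and the two protein accessions are hypothetical and do not describe how InterPro or GOA actually implement this. Only the IPR000719 entry and its three GO terms come from the agenda above.

```python
# Entry-level mapping taken from the agenda item (IPR000719 and its GO terms).
INTERPRO2GO = {
    "IPR000719": ["GO:0006468", "GO:0004672", "GO:0005524"],
}

# Hypothetical blacklist of (protein accession, InterPro entry) pairs.
# "TRIB_EXAMPLE" is a made-up placeholder, not a real accession.
BLACKLIST = {("TRIB_EXAMPLE", "IPR000719")}


def propagate(matches, interpro2go, blacklist):
    """Return sorted (accession, GO id) pairs, skipping blacklisted matches."""
    annotations = set()
    for accession, entry in matches:
        if (accession, entry) in blacklist:
            continue  # suppress the mapping for this sequence only
        for go_id in interpro2go.get(entry, []):
            annotations.add((accession, go_id))
    return sorted(annotations)


# Two hypothetical proteins that both match the kinase domain entry.
matches = [("KINASE_EXAMPLE", "IPR000719"), ("TRIB_EXAMPLE", "IPR000719")]
result = propagate(matches, INTERPRO2GO, BLACKLIST)
```

In this sketch the true kinase keeps all three terms and the blacklisted pseudokinase receives none, so the large number of true positive mappings is unaffected.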
Problematic InterPro Mappings
- IPR000402. ATPase and ATP metabolism terms. [https://sourceforge.net/tracker/?func=detail&atid=605890&aid=3413254&group_id=36855]
- IPR000342. Signal transducer activity. [http://sourceforge.net/tracker/?func=detail&aid=3410404&group_id=36855&atid=605890]
- IPR024738
  - It represents the Ada1/Tada1 subunit of the SAGA-like complex.
  - The SAGA complex is a transcriptional coactivator (involved in regulation of transcription by RNA polymerase II). Should we map it to 'contributes_to' transcription coactivator activity (MF) GO:0003713, following the GO complex annotation guidelines?
- IPR018767
  - Nucleus export protein Brr6. It is mapped to GO:0016021 integral to membrane. Should it instead be mapped to GO:0005635 nuclear envelope?
- Ribosomal Proteins
  - In the database we have many entries for ribosomal proteins, but perhaps we are not mapping them correctly. A better understanding of the relationships between MF and BP could help.
  - Example: IPR000439 Ribosomal protein L15e. At the moment it is mapped as:
    - Process GO:0006412 translation
    - Function GO:0003735 structural constituent of ribosome
    - Component GO:0005840 ribosome
- NOT qualifier
  - I think the pseudokinase TRIBBLES should be a candidate for the NOT qualifier. It matches IPR000719 and IPR017442, both protein kinase domains. The presence of a kinase domain is characteristic of TRIBBLES; they have simply lost their catalytic activity, which is why they are called pseudokinases.
  - A different case is IPR000014, which integrates 3 signatures. One of them, the SMART signature SM00091, is the one causing problems because it hits false positives (e.g. Q9C1W9). It is a signature for a PAS domain (mapped to signal transduction), whereas Q9C1W9 is a DNA ligase. SM00091 hits 44433 proteins in total.
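For the IPR000014 case, one option to discuss is restricting propagation to matches supported by a signature that has not been flagged as unreliable. The Python sketch below illustrates that idea only and is not an existing InterPro mechanism. The helper name and the signature identifiers "SIG_A" and "SIG_B" are placeholders (the notes name only SM00091), and "OTHER_EXAMPLE" is a made-up accession.

```python
# Signatures integrated into the entry. Only SM00091 is named in the notes;
# "SIG_A" and "SIG_B" are hypothetical placeholders for the other two.
ENTRY_SIGNATURES = {"IPR000014": {"SM00091", "SIG_A", "SIG_B"}}

# Signatures flagged as prone to false positives (an assumption for this sketch).
UNRELIABLE = {"SM00091"}


def supported_entries(signature_hits, entry_signatures, unreliable):
    """Return entries backed by at least one hit from a non-flagged signature."""
    supported = set()
    for entry, signatures in entry_signatures.items():
        reliable_hits = (signature_hits & signatures) - unreliable
        if reliable_hits:
            supported.add(entry)
    return supported


# Q9C1W9 is hit only by the flagged SMART signature, so nothing is propagated.
ligase_entries = supported_entries({"SM00091"}, ENTRY_SIGNATURES, UNRELIABLE)

# A hypothetical protein also hit by another signature keeps the entry.
other_entries = supported_entries({"SM00091", "SIG_A"}, ENTRY_SIGNATURES, UNRELIABLE)
```

Unlike a per-sequence blacklist or a NOT annotation, this filter acts at the signature level, so it would also drop any true positives that are supported only by SM00091.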
Minutes
Present:
- Jane Lomax (GO)
- Rebecca Foulger (GO)
- Emily Dimmer (GOA)
- Alex Mitchell (InterPro curation co-ordinator)