Protein complexes: Difference between revisions

From GO Wiki

Revision as of 16:49, 11 March 2015

  • NOTE: This is a work in progress. It needs to be wrapped up, and revised by editors, Becky and Birgit. Also, we need to add examples - what works and what doesn't.

- How do we view protein complexes in GO? The complex should be stable; if it is not stable, it's just protein binding. I'm sure we had something written down for this - Birgit? Do GO and IntAct guidelines agree on this?

[Birgit] Yes, we complex curators concentrate on stable complexes. They should have experimental evidence with regard to their subunit composition AND function in vivo (see desirable MF links above). NB: Our new editor now allows for adding PMID and ECO codes for the GO xrefs, so in future we can export these more easily :)

The Complex Portal could also hold transient complexes, e.g. signaling complexes that form for only split seconds but have some experimental evidence that they exist. We haven't done any of these but they are possible.

We can also curate complexes that lack full experimental evidence but are commonly accepted as real, e.g. complexes submitted by ChEMBL for which we only have pharmacological evidence. These complexes are tagged with ECO:0000306 - inferred from background scientific knowledge by manual assertion.

- GO should host species-agnostic complexes, ideally conserved across taxa. Where this isn't known, still make the def generic, and add 'For example, in human this complex contains...' as a def gloss or def comment. Species-specific complexes don't belong in GO, but rather in IntAct and/or PRO (or just IntAct?).

[Birgit] Yes, our complexes are all species-specific but we make the GO terms agnostic (see above).

- Ideally, add capable_of functions link. If not possible, see if capable_of_part_of process links can be made. If none is applicable, we do host complexes based on their subunits only.

[Birgit] Yes, see summary above.

- Indicate cellular location as specifically as possible, unless parent already has one

[Birgit] Yes! And the CC is for the complex as a whole. We discussed this in the context of transmembrane complexes with members that are only located on one side of the membrane or have no membrane attachment at all. As gene products have the part_of relationship with the complexes, this is apparently fine (and the only way of reflecting the CC for the complex).

- How to request protein complexes in GO based on the above (TG template, TG freeform)

- Emily started documentation here, in case it's helpful, but this hasn't been worked on since 2011: http://wiki.geneontology.org/index.php/Protein_Complex_ids_as_GO_annotation_objects

[Birgit] Inheritance of annotations: I agree with the wiki: you cannot inherit MF from a complex to a subunit, and even a CC is problematic (see the transmembrane example above). This needs more thought. I don't know what you are doing right now...

Orthologies: We infer within taxon groups, e.g. human to mouse to rat or any other mammal, depending on where the experimental evidence comes from. We systematically infer human-mouse. We have a few pombe complexes inferred from yeast (Sc!) but we don't do it systematically.

Paralogues: We make inferences between related complexes in the same species when the gene products are very similar, e.g. hemoglobin chains for adult and developmental complexes.

'Large' complexes: We have tackled the 'mediator' and we can now link to RNACentral for RNAs so time permitting we'll tackle the 'biggies' soon!

Pro: We have a list of Pro complexes that we consult for refs.

- What IntAct is doing - a summary:

[Birgit] We didn't draw up an official set of rules but in summary this is what we do (and it pretty much matches what Paola says below and the wiki she cites):
A complex should be taxon agnostic but may be restricted to certain taxonomic groups, such as pro- vs eukaryotes.
... should contain subunits in the def
... should have an 'as precise as possible' part_of relationship to the CC (may have to create new terms here as well of course!), which can be a complex (in cases of subcomplexes) or a location
... should have, if possible, capable_of and capable_of_part_of annotation extensions.
... should have an is_a relationship to an appropriate child term of 'protein complex'. This could be a term based on its composition or function but NOT based on the PB. If no appropriate term exists, we create one based on either of the two classes. There is now a TG template for creating complex-by-MF, which makes curators' lives much easier :) If there is no appropriate CC or complex-by-MF parent, the new complex will be a direct child of 'protein complex'.
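
As a possible starting point for the examples this page still needs, the summary rules above could be sketched as a small pre-submission check. This is only an illustration, not an existing GO, TermGenie or IntAct tool: the `ComplexTerm` record, its field names and the `check_complex_term` function are all hypothetical, and the subunit check is a naive substring match against the def.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ComplexTerm:
    """Hypothetical, simplified record for a proposed GO complex term."""
    name: str
    definition: str
    subunits: List[str]
    is_a: List[str] = field(default_factory=list)
    part_of: List[str] = field(default_factory=list)
    capable_of: List[str] = field(default_factory=list)
    capable_of_part_of: List[str] = field(default_factory=list)
    species_specific: bool = False


def check_complex_term(term: ComplexTerm) -> List[Tuple[str, str]]:
    """Return (level, message) findings for the summary rules.

    'error' marks a rule stated as 'should'; 'note' marks an
    'if possible' rule that is not met.
    """
    findings: List[Tuple[str, str]] = []
    if term.species_specific:
        findings.append(("error", "term should be taxon agnostic"))
    # Naive check: each subunit name must appear somewhere in the def.
    definition = term.definition.lower()
    missing = [s for s in term.subunits if s.lower() not in definition]
    if not term.subunits:
        findings.append(("error", "no subunits listed"))
    elif missing:
        findings.append(("error", "subunits missing from def: " + ", ".join(missing)))
    if not term.part_of:
        findings.append(("error", "no part_of relationship to a CC term"))
    if not term.is_a:
        findings.append(("error", "no is_a parent; fall back to 'protein complex'"))
    if not (term.capable_of or term.capable_of_part_of):
        findings.append(("note", "no capable_of or capable_of_part_of link"))
    return findings
```

A draft with placeholder names, such as `ComplexTerm(name="example complex", definition="Contains subunit A.", subunits=["subunit A", "subunit B"])`, would be flagged for the subunit missing from the def, the missing part_of and is_a relationships, and the absent function link. Keeping the 'if possible' rule as a note rather than an error reflects that complexes based on their subunits only are still hosted.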