Protein Complex ids as GO annotation objects

From GO Wiki
Revision as of 06:51, 15 August 2011 by Edimmer
  • Creating an association between a GO term and a protein complex as the annotation object is appropriate and possible for terms from each of the three GO ontologies.
  • Annotations to molecular function terms should not automatically be transferred from a protein complex id to its subunit components. In addition, the question of how curators apply the 'contributes_to' qualifier to MF terms will need to be resolved, to ensure that groups represent this data in the same manner.

This can be summarized as:

[1] GP part_of some MC, MC localizes_to some CC ==> GP localizes_to some CC

[2] GP part_of some MC, MC has_function_in some BP ==> GP has_function_in some BP

But not:

[3] GP part_of some MC, MC capable_of some MF ==> GP capable_of some MF
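
As a rough illustration of rules [1] to [3], the following minimal Python sketch transfers a complex's BP and CC annotations to its subunits and skips MF annotations. The `Annotation` class, the `propagate_to_subunits` function and the identifiers in the usage note are hypothetical names introduced only for this example; they are not part of any GO tool or file format.

```python
from dataclasses import dataclass

# Aspects that rules [1] and [2] allow to be transferred from a complex to its
# subunits: "C" (cellular component) and "P" (biological process).
# "F" (molecular function) is left out on purpose, per rule [3].
TRANSFERABLE_ASPECTS = {"C", "P"}


@dataclass(frozen=True)
class Annotation:
    """Hypothetical, simplified annotation record used only in this sketch."""
    object_id: str         # annotated entity: a complex id or a gene product id
    go_id: str             # GO term identifier
    aspect: str            # "P", "F" or "C"
    via_complex: str = ""  # complex the annotation was inferred from, if any


def propagate_to_subunits(complex_annotations, subunits_by_complex):
    """Return subunit annotations inferred from complex annotations.

    complex_annotations: iterable of Annotation objects whose object_id is a complex id.
    subunits_by_complex: dict mapping a complex id to a list of subunit ids.
    """
    inferred = []
    for ann in complex_annotations:
        if ann.aspect not in TRANSFERABLE_ASPECTS:
            continue  # rule [3]: never transfer MF annotations automatically
        for subunit in subunits_by_complex.get(ann.object_id, []):
            # Record the source complex so the subunit's involvement in the
            # complex is not lost.
            inferred.append(
                Annotation(subunit, ann.go_id, ann.aspect, via_complex=ann.object_id)
            )
    return inferred
```

For instance, calling `propagate_to_subunits([Annotation("COMPLEX:X", "GO:0016311", "P")], {"COMPLEX:X": ["GP:A", "GP:B"]})` with these placeholder ids would yield one inferred annotation per subunit, each carrying `via_complex="COMPLEX:X"`, whereas an annotation with aspect `"F"` would yield none.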


Allowing [2] but excluding [3] doesn't seem to have any principled basis. I agree that it's a good rule of thumb, but it can't be a hard and fast rule. Some BPs are just a chain of two MFs; why allow them but not the single MF? It seems there are probably complexes with multiple units, where some of the units are just hangers-on for smaller BPs. (Chris)

Seems more like a probabilistic rule, with p increasing for larger processes and smaller complexes. (Chris)


  • When creating the subunit BP or CC annotation, the involvement of the subunit in the complex needs to be captured somewhere in the annotation line.
  • Possible fields for capturing the protein complex for a subunit's annotation include column 8 (With), column 16 (Annotation Extension) or column 17 (Gene Product Form ID). However, the most appropriate appears to be column 17 (Gene Product Form ID), as column 17 is intended to provide further specification of the annotation object in column 2. It currently contains protein accessions and isoform identifiers. If protein complex identifiers were included, the description of this column would need to be widened; however, it would still be correct to describe the aim of this field as providing further details on the object (column 2) of an annotation. As the accepted contents of this field are currently so well defined, there might be resistance from groups to widening its remit. If protein complexes were to be described in this field, perhaps relationship types should appear to clarify how the id in column 17 relates to the id in column 2. Perhaps a relationship such as 'belongs_to', 'member_of' or 'component_of' should be supplied alongside the protein complex identifier.
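
To make the column 17 proposal concrete, here is a minimal Python sketch that assembles a 17-column, tab-separated GAF 2.0 line for a subunit, with the complex recorded in column 17 alongside a relationship. The `relation(complex_id)` syntax is an assumption borrowed from the style of column 16 annotation extensions, not an agreed format; `subunit_gaf_line` and all identifiers in the usage note are hypothetical.

```python
GAF_COLUMN_COUNT = 17  # GAF 2.0 lines have 17 tab-separated columns


def subunit_gaf_line(db, subunit_id, symbol, go_id, reference, evidence,
                     aspect, taxon, date, assigned_by, complex_id,
                     relation="component_of"):
    """Build one GAF 2.0 line for a subunit, naming its complex in column 17.

    The "relation(complex_id)" value written to column 17 is an assumed,
    illustrative syntax; the discussion above only proposes that a
    relationship accompany the complex identifier.
    """
    columns = [""] * GAF_COLUMN_COUNT
    columns[0] = db            # column 1: DB
    columns[1] = subunit_id    # column 2: DB Object ID (the annotation object)
    columns[2] = symbol        # column 3: DB Object Symbol
    columns[4] = go_id         # column 5: GO ID
    columns[5] = reference     # column 6: DB:Reference
    columns[6] = evidence      # column 7: Evidence Code
    columns[8] = aspect        # column 9: Aspect (P, F or C)
    columns[11] = "protein"    # column 12: DB Object Type
    columns[12] = taxon        # column 13: Taxon
    columns[13] = date         # column 14: Date (YYYYMMDD)
    columns[14] = assigned_by  # column 15: Assigned By
    columns[16] = f"{relation}({complex_id})"  # column 17: Gene Product Form ID
    return "\t".join(columns)
```

With placeholder values such as `subunit_id="SUBUNIT_ID"` and `complex_id="COMPLEX:X"`, the last field of the returned line would read `component_of(COMPLEX:X)`, which keeps the subunit as the annotation object in column 2 while stating how the complex id relates to it.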

The clearest way to document this is primarily via the GPAD+GPI formats (Chris)

  • If we decide that the implicit/default gene_product to GO term relationship is very conservative, such as [participates_in] for biological process terms or [occurs_in/located_in] for cellular component terms, then no further qualifier for subunit annotations should be needed, especially as enough information to interpret the annotation should already be in column 16/17.

Annotation Example

PMID:19001025 - Calcineurin Complex.

Investigators use a constitutively active mouse calcineurin complex (deltaCnA and CnB) to induce dephosphorylation of ARC.

Page 2270


[mouse calcineurin complex] GO:0016311 dephosphorylation IMP PMID:19001025 [input_id]ARC