Column 16: Function x Process x Component extensions

From GO Wiki

Process terms can be further specified by subcellular location.

For example: plastid translational elongation

At the time of writing this term is not declared in GO. Again we use the occurs_in relation:

Col 5: GO:0006414 
Col 16: occurs_in(GO:0009536)

Why, you might ask, can we not instead make two annotations to:

   * GO:0032544 ! plastid translation
   * GO:0006414 ! translational elongation 

The answer is that co-annotation carries less information. Computationally we have no way of knowing these two processes are linked.

Note that the majority of the time, BP x CC cross-products should be pre-composed in the ontology. If the above scenario comes up, consider requesting a new term plastid translational elongation rather than using col 16.

Also note that when using a GO ID in col 16, a redundant annotation should sometimes be added.

Functions carried out as part of a process

We use the part_of relation to link function and process. This relation is already used for the inter-ontology links.

For example, if a gene product is observed to have GTPase activity as a part of the nerve growth factor receptor signaling pathway, you would annotate:

 col5: GO:0003924
 col16: part_of(GO:0048011)

Note you should also include a separate annotation in which GO:0048011 is in col5, so that people who are not using col 16 will not be worse off than they are now. See guidelines.
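
The redundant annotation described above can be generated mechanically. The following is a minimal sketch, assuming an annotation is held as a small record; the Annotation class and the with_redundant_lines function are hypothetical names chosen for illustration and are not part of any GO tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """Hypothetical record holding the columns discussed on this page."""
    gene: str
    term: str                # col 5
    reference: str           # col 6
    extensions: tuple = ()   # col 16, as (relation, GO ID) pairs


def with_redundant_lines(annotation):
    """Return the annotation plus one plain annotation per col 16 GO ID."""
    lines = [annotation]
    for _relation, go_id in annotation.extensions:
        # The redundant line puts the col 16 ID in col 5 and leaves col 16 empty.
        lines.append(Annotation(annotation.gene, go_id, annotation.reference))
    return lines


extended = Annotation(
    gene="Gene1234",
    term="GO:0003924",                        # GTPase activity
    reference="PMID:nnnn",
    extensions=(("part_of", "GO:0048011"),),  # NGF receptor signaling pathway
)
for line in with_redundant_lines(extended):
    print(line.term, line.extensions)
```

Users who ignore col 16 still see the process annotation, because the second line carries GO:0048011 in col 5 with no extension.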

Note that you would not say something like this:

 col5: GO:0016301
 col16: part_of(GO:0016310)
  • GO:0016301 - kinase activity
  • GO:0016310 - phosphorylation

This is harmless but pointless, because we already know from gene_ontology_ext that kinase activity is part_of phosphorylation.
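
A check for this kind of pointless extension could look like the sketch below. It assumes the function-to-process links are available as a set of triples; KNOWN_LINKS and is_pointless are hypothetical names, and the single entry is just the kinase example from this page. It only compares against directly listed links and does no reasoning over the term hierarchy.

```python
# Hypothetical stand-in for the function-to-process links already asserted
# in the ontology; a real check would read them from the ontology file.
KNOWN_LINKS = {
    ("GO:0016301", "part_of", "GO:0016310"),  # kinase activity part_of phosphorylation
}


def is_pointless(term, relation, target, known_links=KNOWN_LINKS):
    """True when the extension only restates a link the ontology already has."""
    return (term, relation, target) in known_links


print(is_pointless("GO:0016301", "part_of", "GO:0016310"))  # kinase example
print(is_pointless("GO:0003924", "part_of", "GO:0048011"))  # GTPase example
```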


Function-Process-Component threesomes

col5: GO Function ID
col16: part_of(GO PROCESS ID),occurs_in(GO CC ID)

Also include two redundant annotation lines, one with the process ID in col 5 and one with the component ID in col 5.
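
To make the col 16 syntax concrete, here is a minimal parsing sketch. It assumes that col 16 holds comma-separated entries of the form relation(GO:nnnnnnn), as in the examples on this page; the function name parse_col16 and the regular expression are illustrative assumptions, not a reference implementation. The two IDs in the usage line are reused from earlier examples purely to show the syntax, not because the combination is biologically meaningful.

```python
import re

# Matches one extension such as "part_of(GO:0048011)".
# The pattern is an assumption based on the examples on this page.
EXTENSION_RE = re.compile(r"^\s*(\w+)\((GO:\d{7})\)\s*$")


def parse_col16(value):
    """Split a col 16 string into a list of (relation, GO ID) pairs."""
    if not value.strip():
        return []  # an empty col 16 carries no extensions
    pairs = []
    for part in value.split(","):
        match = EXTENSION_RE.match(part)
        if match is None:
            raise ValueError(f"unrecognised extension: {part!r}")
        pairs.append((match.group(1), match.group(2)))
    return pairs


print(parse_col16("part_of(GO:0048011),occurs_in(GO:0009536)"))
```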



Why allow GO IDs in col 16? Can I just co-annotate instead?

Co-annotation is not sufficient. Important information is lost. For example, if a gene has 4 annotations to

  • mitochondrion
  • nucleus
  • translation
  • transport

we have no way of knowing whether the gene is involved in

  • nuclear translation vs mt translation (or both)
  • transport within, to or from the mitochondrion or the nucleus
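
The ambiguity can be spelled out by enumerating the pairings that the four separate annotations leave open. This is only an illustrative sketch; the variable names are hypothetical.

```python
from itertools import product

processes = ["translation", "transport"]
components = ["mitochondrion", "nucleus"]

# Co-annotation stores the four terms separately, so every pairing below
# remains possible and none can be confirmed or ruled out.
candidate_pairings = list(product(processes, components))
for process, component in candidate_pairings:
    print(f"{process} occurs_in {component}?")
```

With a GO ID in col 16, each annotation line states its own pairing explicitly, so no such enumeration is needed.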


What happens when new specific GO terms corresponding to the annotation XPs are added?

Let's say annotator A wishes to annotate to "plastid translational elongation", but there is no such term in GO, because it is (for example) deemed to be not sufficiently different from generic translational elongation.

They should then annotate to "translational elongation" and also put "occurs_in(plastid)" in col 16.

Then let's say we later discover that "plastid translational elongation" does belong in GO after all (because policy changes or we learn something new about the biology), so the term gets added.

Crucially, the annotator need do nothing. Their annotation can be mapped forward automatically once an entry for "plastid translational elongation" is added to XP:biological_process_xp_cellular_component.
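
The forward mapping could work roughly as in the sketch below. It assumes the cross-product entries can be represented as a lookup table from (term, relation, target) to a pre-composed term; XP_TABLE, PLACEHOLDER_ID and map_forward are hypothetical names, and the placeholder stands in for an ID that does not exist at the time of writing.

```python
# Placeholder for the ID a future "plastid translational elongation" term
# would receive; it is deliberately not a real GO ID.
PLACEHOLDER_ID = "GO:XXXXXXX"

# Hypothetical in-memory view of the cross-product definitions.
XP_TABLE = {
    ("GO:0006414", "occurs_in", "GO:0009536"): PLACEHOLDER_ID,
}


def map_forward(term, extensions, xp_table):
    """Replace (term, extension) with a pre-composed term when one is defined.

    Returns the possibly updated col 5 term and the remaining col 16 pairs.
    """
    for i, (relation, target) in enumerate(extensions):
        new_term = xp_table.get((term, relation, target))
        if new_term is not None:
            remaining = list(extensions[:i]) + list(extensions[i + 1:])
            return new_term, remaining
    return term, list(extensions)


print(map_forward("GO:0006414", [("occurs_in", "GO:0009536")], XP_TABLE))
```

Annotations with no matching entry are returned unchanged, so the mapping can be run over a whole annotation file each time new entries are added.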

When do I request a new term and when do I use the annotation xp column?

Request a new term if it seems like a sensible new term to have in GO. Combinatorial terms in GO are fine if they correspond to a commonly used scientific term and the combination is not completely arbitrary or accidental.

For more on this important issue, and a discussion of when to pre-compose and when to compose at annotation time, see this thread on the GO list from March 2009: http://fafner.stanford.edu/pipermail/go/2009-March/016501.html

Will this replace existing combinatorial GO terms like "B cell differentiation"?

No! It is important to keep terms like this pre-coordinated in the GO.

BP-MF Example

Here is gene 1234, which executes GTPase activity as part of an intracellular signaling cascade:

Gene (col 2/3) | Term (col 5) | Ref (col 6) | Ext (col 16)
Gene1234 | GO:0003924 ! GTPase activity | PMID:nnnn | part_of(GO:0007242) ! intracellular signaling cascade
Gene1234 | GO:0007242 ! intracellular signaling cascade | PMID:nnnn | (empty)

BP x CC

  • Gene1234 has a gene product that is involved in plastid translational elongation

At the time of writing this term is not declared in GO. Here we use the occurs_in relation:

Gene (col 2/3) | Term (col 5) | Ref (col 6) | Ext (col 16)
Gene1234 | GO:0006414 ! translational elongation | PMID:nnnn | occurs_in(GO:0009536) ! plastid
Gene1234 | GO:0009536 ! plastid | PMID:nnnn | (empty)