AmiGO:Inter-ontology links

From GO Public

Currently in GO there is no explicitly asserted link between:

 CC - GO:0000118 histone deacetylase complex
 MF - GO:0004407 histone deacetylase activity
 BP - GO:0016575 histone deacetylation

Clearly the function, process and component denoted by these terms are inter-related: the CC executes the MF, and the MF catalyses the BP.

The parts of a whole do not necessarily inherit the function of the whole; the whole does not inherit the function of the parts; and the sibling parts of a whole do not necessarily share the same function. These kinds of rules can be stated formally so that there is less room for confusion (just like the true path rule).

I suspect that one reason annotators may be tempted to make the erroneous transitive inference and transfer the function of the whole (complex) to the part (gene product) is that there is a perceived loss of information in *not* doing so.

For example, if correct curation protocol is followed, then SIF2 should not be annotated to HD Activity (MF), only to HD complex (CC). Searches for the MF "HD Activity" will exclude SIF2. This is correct behavior. However, it may be useful to have some intuitive way of navigating from a search on "HD activity" to SIF2, by means of the complex, so long as it is obvious that SIF2 does not inherit the function of the complex.

Using the latest results from Obol, we can now link terms across GO ontologies. For links between CC and MF, the relation would be labeled something like 'executes' or simply 'has function'. In a tree-type display we might show:


  [i] GO:0019213 deacetylase activity
    [i] GO:0033558 protein deacetylase activity
     [i] GO:0004407 histone deacetylase activity    [RPD3]
      [X] GO:0000118 histone deacetylase complex    [SIF2, SPCC1235.09]
       [i] GO:0000508 Rpd3L complex                 [RPD3]
       [i] GO:0000509 Rpd3S complex
       [i] GO:0032221 Clr6 histone deacetylase complex

Image:HD-amigo.jpg

This display correctly represents the biology, but the danger here is that over the years we have built up an expectation in our users that the relation label can be ignored and gene products can be propagated up the DAG, willy-nilly. The correct way to read the DAG above is:

 SIF2 is localized_to HD complex,
 HD complex has_function HD activity

And we can infer

 SIF2 is localized_to some complex that has_function deacetylase activity
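The two-step reading above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionaries are toy data copied from the tree display, the relation names are the working labels used on this page, and the function name `qualified_inferences` is hypothetical. The point is that the only statement produced is the qualified one; no direct MF annotation for the gene product is ever generated.

```python
# Toy data taken from the tree display above (assumption: not a real GO load).
LOCALIZED_TO = {
    "SIF2": ["GO:0000118"],         # histone deacetylase complex
    "SPCC1235.09": ["GO:0000118"],
}
HAS_FUNCTION = {
    "GO:0000118": ["GO:0004407"],   # histone deacetylase activity
}


def qualified_inferences(gene_product):
    """Chain localized_to with has_function, keeping both relation labels.

    Each result reads: gene_product localized_to <complex> that
    has_function <activity>. It says nothing about the activity of the
    gene product itself.
    """
    statements = []
    for complex_id in LOCALIZED_TO.get(gene_product, []):
        for function_id in HAS_FUNCTION.get(complex_id, []):
            statements.append(
                (gene_product, "localized_to", complex_id,
                 "has_function", function_id)
            )
    return statements


for statement in qualified_inferences("SIF2"):
    print(" ".join(statement))
```

Keeping the complex and both relation labels in the result is deliberate: a display built on it can show how SIF2 is reached from "HD activity" without implying that SIF2 has that activity.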

But we *cannot* infer anything about the activity of SIF2 without further evidence. We would not propagate SIF2 up in slimmers, term enrichment, gene product count summaries or any other graph-based operation (a curator *may* apply their expertise and decide to make contributes_to annotations based on these CC to MF links, but this would not be automatic).
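As a rough sketch of what "not propagating" could mean for a graph-based operation, the Python below propagates gene products upward only along an allowed set of relations and leaves has_function links out of that set. The edge list and annotations are toy data from the tree display; reading the [i] icon as is_a is an assumption, and the names `EDGES`, `ANNOTATIONS`, `PROPAGATING` and `propagate` are hypothetical, not part of AmiGO or any GO software.

```python
from collections import defaultdict

# (child, relation, parent) triples from the tree display above.
# Assumption: [i] is read as is_a; the CC to MF link is has_function.
EDGES = [
    ("GO:0033558", "is_a", "GO:0019213"),
    ("GO:0004407", "is_a", "GO:0033558"),
    ("GO:0000118", "has_function", "GO:0004407"),
    ("GO:0000508", "is_a", "GO:0000118"),
    ("GO:0000509", "is_a", "GO:0000118"),
    ("GO:0032221", "is_a", "GO:0000118"),
]

# Direct annotations as shown in square brackets in the display.
ANNOTATIONS = {
    "GO:0004407": {"RPD3"},
    "GO:0000118": {"SIF2", "SPCC1235.09"},
    "GO:0000508": {"RPD3"},
}

# Relations over which gene products may be propagated (assumption for
# this sketch). has_function is deliberately absent.
PROPAGATING = {"is_a"}


def propagate(edges, annotations, allowed=PROPAGATING):
    """Return term -> gene products, following only allowed relations."""
    parents = defaultdict(set)
    for child, relation, parent in edges:
        if relation in allowed:
            parents[child].add(parent)

    result = defaultdict(set)
    for term, gene_products in annotations.items():
        stack, seen = [term], set()
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            result[current] |= gene_products
            stack.extend(parents.get(current, ()))
    return dict(result)


counts = propagate(EDGES, ANNOTATIONS)
for term in sorted(counts):
    print(term, sorted(counts[term]))
```

With this toy data, RPD3 reaches the complex term from its Rpd3L annotation and reaches the deacetylase terms from its own MF annotation, while SIF2 and SPCC1235.09 stop at GO:0000118 because the has_function link is never followed.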

This means we have to be careful about how we release these (valuable) cross-ontology links to the public, and ensure they are not abused. From a software perspective we are almost ready to load these kinds of links and start showing them in AmiGO, but we should proceed carefully to make sure these kinds of relations are better understood both within GO and outside.

This seems to be related to the contributes_to issue. Is this worth discussing in the same slot at the GO meeting?

The (unvetted) CC to MF links are in cvs:

