Relation composition

From GO Wiki

Revision as of 17:47, 3 July 2008

This page describes the composition rules for the relations used in GO. See the OBO Edit Reasoner paper on Google Docs for background.

See also the Relation Ontology and its accompanying paper.

Simple composition rules

rules for is_a and part_of

TODO: fill in examples

Basic transitivity compositions:

  • is_a . is_a → is_a
  • part_of . part_of → part_of

For example:

mitosis is_a cell cycle phase is_a cell cycle process, THEREFORE mitosis is_a cell cycle process

The following rules arise from the definitions given in the OBO Relation Ontology:

  • is_a . part_of → part_of
  • part_of . is_a → part_of

For example, starting with:

mitosis part_of M phase of mitotic cell cycle is_a M phase is_a cell cycle phase is_a cell cycle process part_of cell cycle

We can iteratively reduce this by repeated application of composition rules:

  1. mitosis part_of M phase is_a cell cycle phase is_a cell cycle process part_of cell cycle
  2. mitosis part_of M phase is_a cell cycle process part_of cell cycle
  3. mitosis part_of cell cycle process part_of cell cycle
  4. mitosis part_of cell cycle

Rules can be applied in any order (e.g. in the second reduction we reduced M phase is_a cell cycle phase is_a cell cycle process to M phase is_a cell cycle process).

We can also infer the same link from the following asserted links:

  1. mitosis part_of M phase of mitotic cell cycle part_of mitotic cell cycle is_a cell cycle
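The reductions in this section can be sketched in Python. This is only a minimal illustration, not the OBO-Edit reasoner: the `COMPOSE` table and the `reduce_chain` helper are names invented here, and the table is assumed to hold just the four is_a/part_of compositions.

```python
# Composition table for is_a and part_of only (illustrative; names are hypothetical).
COMPOSE = {
    ("is_a", "is_a"): "is_a",
    ("part_of", "part_of"): "part_of",
    ("is_a", "part_of"): "part_of",
    ("part_of", "is_a"): "part_of",
}


def reduce_chain(terms, relations):
    """Collapse the chain terms[0] r0 terms[1] r1 ... into one inferred link.

    Adjacent links are merged left to right.
    Returns a (subject, relation, object) tuple.
    """
    if len(terms) != len(relations) + 1:
        raise ValueError("a chain of n links needs n + 1 terms")
    relation = relations[0]
    for nxt in relations[1:]:
        relation = COMPOSE[(relation, nxt)]
    return terms[0], relation, terms[-1]


# First chain from the text.
chain_1 = reduce_chain(
    ["mitosis", "M phase of mitotic cell cycle", "M phase",
     "cell cycle phase", "cell cycle process", "cell cycle"],
    ["part_of", "is_a", "is_a", "is_a", "part_of"],
)

# Alternative chain of asserted links from the text.
chain_2 = reduce_chain(
    ["mitosis", "M phase of mitotic cell cycle", "mitotic cell cycle", "cell cycle"],
    ["part_of", "part_of", "is_a"],
)

print(chain_1)
print(chain_2)
```

Both chains reduce to the same link, mitosis part_of cell cycle. The sketch always merges from the left; for this particular table the grouping makes no difference, which is why the rules can be applied in any order.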

rules for regulates

With the addition of the regulates relations in GO, the composition rules expand.

First the standard interaction with is_a:

  • is_a . R → R transitivity under is_a
  • R . is_a → R transitivity over is_a

In the above R stands for any of: regulates, negatively_regulates, positively_regulates

Note that regulates is not itself transitive, but we may wish to include a weaker transitive relation (see below)

Note that positively and negatively regulates are sub-relations of regulates; i.e.

  • IF: X negatively_regulates Y
  • THEN: X regulates Y

The regulates relations are transitive over part_of; i.e.

  • R . part_of → R

Note that this rule is not hard-coded - it is declared in the gene_ontology .obo file, in the stanza for regulates (see the transitive_over tag)
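As a rough sketch, the regulates rules in this section could be encoded as follows. The names `REGULATES`, `SUPER_RELATION`, `compose` and `entailed` are hypothetical, and the rules are hard-coded here purely for illustration (in GO the part_of rule is declared in the .obo file, as noted above).

```python
REGULATES = {"regulates", "negatively_regulates", "positively_regulates"}

# Sub-relation hierarchy: each signed relation implies plain regulates.
SUPER_RELATION = {
    "negatively_regulates": "regulates",
    "positively_regulates": "regulates",
}


def compose(r1, r2):
    """Return the relation implied by r1 followed by r2, or None if no rule applies.

    Only the regulates rules from this section are covered.
    """
    if r1 == "is_a" and r2 in REGULATES:
        return r2  # is_a . R -> R
    if r1 in REGULATES and r2 in {"is_a", "part_of"}:
        return r1  # R . is_a -> R and R . part_of -> R
    return None  # e.g. regulates . regulates: regulates is not transitive


def entailed(relation):
    """Return the relation followed by every super-relation it implies."""
    result = [relation]
    while result[-1] in SUPER_RELATION:
        result.append(SUPER_RELATION[result[-1]])
    return result


print(compose("negatively_regulates", "part_of"))
print(compose("regulates", "regulates"))
print(entailed("negatively_regulates"))
```

Here `compose("regulates", "regulates")` yields `None` because no rule in this section names that composition.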

rules involving gene products

THIS SECTION REQUIRES MORE BACKGROUND INFORMATION


Most of the time we talk of the relation between gene products and GO terms informally as one of "annotated_to". As we expand the relations used in GO, we need to be more precise.

Additional relations required for formalization:

  • has_function_in - between a protein and an MF or BP (as specified in an annotation). Potentially also between a CC and an MF.
  • localized_to - between a protein and a CC.

Here's how it works. If you have two links (annotations or ontology links):

a R1 b, b R2 c

and you want to know the relation (if any) between a and c, look up the composition R1.R2 in the table. Row first, then column (this seems most intuitive; it could be transposed if required).

For example, if you have

a positively_regulates b, b part_of c

Look up (R+,P) in the table - the cell value is R+ (i.e. the regulates relations are transitive_over part_of)

Composition is recursive, e.g.: a R1 b, b R2 c, c R3 d => a ((R1.R2).R3) d

Which means you look up R1.R2 first, take the result, then plug that in as the row and look under the R3 column.
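The recursive lookup can be sketched as a left fold over the path. The `TABLE` below is a hypothetical, partial stand-in for the real table: it contains only cells stated on this page (R+.P, F.P, L.P and the is_a rule for R+), and `compose_path` is a name invented for this example.

```python
from functools import reduce

# Keys are (row, column). I = is_a, P = part_of, R+ = positively_regulates,
# F = has_function_in, L = localized_to. Partial and illustrative only.
TABLE = {
    ("R+", "P"): "R+",
    ("R+", "I"): "R+",
    ("F", "P"): "F",
    ("L", "P"): "L",
}


def compose_path(relations):
    """Compute ((R1.R2).R3)... by repeated row/column lookup.

    Returns None as soon as a cell is absent from this partial table.
    """
    def step(row, column):
        if row is None:
            return None
        return TABLE.get((row, column))

    return reduce(step, relations)


# a positively_regulates b, b part_of c, c is_a d
print(compose_path(["R+", "P", "I"]))
# A cell that is not in this partial table
print(compose_path(["R+", "R+"]))
```

In this sketch `None` only means that the cell is missing from the partial table; it does not distinguish between the different kinds of cell the full table can contain.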

If you get a red X, you know something is wrong (remember we have defined regulates as holding between processes; we can generalize so that we can say a gene product is regulated, though it may be better to introduce a different but similar relation).

If you get a -/? then you have a legal relation, just one we have so far declined to name. There is nothing to stop us naming, for example, "indirectly_regulates" (remember we have declared regulates as not transitive).

It's important to name the links between gene products and what is denoted by GO terms; this allows us to give consistent, coherent explanations of why we propagate certain things up the DAG by default. For example, we don't propagate over part_of just because it feels warm and fuzzy. It's because L.P=>L and F.P=>F.

Say we have a gene product p directly annotated to a. a is in BP, so the implicit relation is has_function_in (F). The user queries for e (an MF).

If the ontology has:

a is_a b part_of c regulates d is_a e

(this is post BP->MF links)

The full path is:

p has_function_in a is_a b part_of c regulates d is_a e

Should the tool return p? (Here 'tool' can be generalized to amigo queries, map2slim, enrichment calculation etc.)

According to the table there is no name for the relation that holds between p and e. The tool should not include p in the results since there is nothing we can say about how p relates to the query. This is in accord with what we have been saying about how tools should work with the regulates relation. However, there may be circumstances where we want to allow this propagation to occur, but not in an ad-hoc fashion.

If we like, we can name the composition of P.R e.g. "part_of_regulation_of", PR for short. We can also name the composition F.PR - say "functions_as_part_of_regulation_of" or FPR for short (our table starts getting a bit more complex but that's OK). The composition F.I.P.R.I is reduced to FPR.

This means the tool has a concrete basis for offering the user options for how the gene product is propagated. For example, it could say "no gene products are annotated as *having the function* e. Do you want to extend your search to include products that *function as part of the regulation of* e?"
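A minimal sketch of how a tool might use the named compositions follows. Everything here is hypothetical: the two tables, the helper names, the `PR.I → PR` and `F.I → F` cells (assumed by analogy with the other is_a rules), and the choice to reduce the term-term path first and apply the annotation relation last.

```python
# Term-term compositions. I = is_a, P = part_of, R = regulates,
# PR = part_of_regulation_of (the name proposed above for P.R).
TERM_TABLE = {
    ("I", "I"): "I", ("I", "P"): "P", ("P", "I"): "P", ("P", "P"): "P",
    ("I", "R"): "R", ("R", "I"): "R", ("R", "P"): "R",
    ("P", "R"): "PR",
    ("PR", "I"): "PR",  # assumption: PR propagates over is_a
}

# Gene product to term compositions. F = has_function_in,
# FPR = functions_as_part_of_regulation_of.
ANNOTATION_TABLE = {
    ("F", "I"): "F",  # assumption: annotations propagate over is_a
    ("F", "P"): "F",
    ("F", "PR"): "FPR",
}


def relation_to_query(annotation_relation, ontology_path):
    """Reduce the term-term path, then compose the annotation relation with it.

    Returns None when some composition has no name in the tables.
    """
    if not ontology_path:
        return annotation_relation
    term_relation = ontology_path[0]
    for nxt in ontology_path[1:]:
        term_relation = TERM_TABLE.get((term_relation, nxt))
        if term_relation is None:
            return None
    return ANNOTATION_TABLE.get((annotation_relation, term_relation))


def include_in_results(relation, extend_to=frozenset()):
    """By default return only products related by F; the user may opt in to more."""
    return relation == "F" or relation in extend_to


# p has_function_in a is_a b part_of c regulates d is_a e
relation = relation_to_query("F", ["I", "P", "R", "I"])
print(relation)
print(include_in_results(relation))
print(include_in_results(relation, extend_to={"FPR"}))
```

Note that the tables give P.R a different result (PR) from R.P (R), which is one way of capturing the point that the order of relations along a path matters.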

Of course, tools could also just have a checkbox of relations to propagate over, but this doesn't take into account the fact that certain orderings have different semantics.

If we name the relations then this makes it easier for people using the table of implied relations in GO: Transitive_closure#Calculating_the_transitive_closure:_the_new_way

(Of course, we won't precompute every gene product to every term, just every meaningful term-term relation. The final composition is done without the table.)

David and Tanya proposed the following extension to the table:

  • A (F) B (R) C = A is a regulator of C
  • A (F) B (R+) C = A is a positive regulator of C
  • A (F) B (R-) C = A is a negative regulator of C
  • A (P) B (F) C = A contributes_to C
  • A (P) B (R) C = A (R) C (this assumes that the other parts of B will occur)
  • A (P) B (R+) C = A (R+) C (this assumes that the other parts of B will occur)
  • A (P) B (R-) C = A (R-) C (this assumes that the other parts of B will occur)
  • A (R) B (R) C = A indirectly_regulates C
  • A (R) B (R+) C = A indirectly_regulates C
  • A (R) B (R-) C = A indirectly_regulates C
  • A (R+) B (R) C = A indirectly_regulates C
  • A (R+) B (R+) C = A indirectly_positively_regulates C
  • A (R+) B (R-) C = A indirectly_negatively_regulates C
  • A (R-) B (R) C = A indirectly_regulates C
  • A (R-) B (R+) C = A indirectly_negatively_regulates C
  • A (R-) B (R-) C = A indirectly_positively_regulates C
  • A (L) B (F) C = A may contribute_to C
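The nine regulates-by-regulates entries in this proposal follow a sign rule: treat R+ as +1, R- as -1 and plain R as "unknown sign" (0), then multiply. The sketch below encodes that observation; the `SIGN` and `NAME` mappings and the `indirect` function are names made up for this example, not part of the proposal.

```python
# Sign of each regulates relation; 0 means the direction is unspecified.
SIGN = {"R": 0, "R+": 1, "R-": -1}

NAME = {
    0: "indirectly_regulates",
    1: "indirectly_positively_regulates",
    -1: "indirectly_negatively_regulates",
}


def indirect(r1, r2):
    """Name the composition of two regulates relations by multiplying their signs."""
    return NAME[SIGN[r1] * SIGN[r2]]


for first in ("R", "R+", "R-"):
    for second in ("R", "R+", "R-"):
        print(f"A ({first}) B ({second}) C = A {indirect(first, second)} C")
```

Two negative steps compose to an indirect positive one, and any step of unspecified sign makes the result unsigned, which matches the list above.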

Has_part

See has_part page

Example of relation composition

This example assumes that amongst our annotations we have:

  • MGI Bcl2 - (direct/asserted) annotation to positive regulation of anti-apoptosis
  • RGD Apoe - (direct/asserted) annotation to anti-apoptosis

For the sake of the example, we assume that these are the only annotations that were created for these genes. We ignore evidence codes here (assuming they are trusted annotations)

This page uses OBO-Edit (OE) to illustrate the relationships between the gene products and different kinds of process. It may seem odd to view annotations in OE, but according to our formalism the links between proteins and the processes they participate in are not a different kind of beast from the other kinds of links in GO. Still, we'll hopefully have this in AmiGO too shortly.

You can get the subset of GO used to make these screenshots here:

The full transitive closure is here:

It should also be possible to do queries using the OE2 link search box - e.g. ask for genes that bear some relation to apoptosis and get back "Bcl2 negative_regulator_of GO:apoptosis". However, the link search doesn't appear to be working properly in conjunction with the reasoner - Amina is working on this.