Transitive closure
Calculating the transitive closure: the old way
See Schema notes
The old way was to ignore the relations in the GO graph and calculate a blind transitive closure. This works fine so long as GO consists of only a few relations like is_a and part_of. However, it results in false positives when used with regulates. These are not so serious, as we had previously been treating regulates as part_of anyway. However, the relation does need to be taken into account for other relations, especially has_part.
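To make the false-positive problem concrete, here is a minimal sketch of a blind closure in Python. It is not the go-db-perl implementation; the function name `blind_closure` and the toy graph (terms A to D) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def blind_closure(edges):
    """Return every (descendant, ancestor) pair reachable over ANY relation.

    `edges` is an iterable of (child, relation, parent) triples.
    The relation is deliberately ignored, which is what makes the closure "blind".
    """
    parents = defaultdict(set)
    for child, _relation, parent in edges:
        parents[child].add(parent)

    closure = set()
    for start in list(parents):
        stack = list(parents[start])
        seen = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            closure.add((start, node))
            stack.extend(parents.get(node, ()))
    return closure


# Hypothetical toy graph, not real GO terms.
edges = [
    ("A", "is_a", "B"),
    ("B", "part_of", "C"),
    ("C", "regulates", "D"),
]
closure = blind_closure(edges)
print(sorted(closure))
```

The pair (A, D) appears in the result even though the only path from A to D crosses a regulates link, so a consumer that reads every pair as is_a or part_of would draw the wrong conclusion.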
Calculating the transitive closure: the new way
The new way will take the relations, and the semantics of those relations, into account (see Category:Reasoning). The result may more accurately be called the deductive closure.
3rd party consumers of the GO will have the option of calculating the closure themselves, or using a pre-computed closure. The pre-computed closure will be available as:
- A simple tab-delimited file
- The graph_path table in the database
Tab delimited file
The columns are
- subject GO ID (i.e. child)
- target GO ID (i.e. parent)
- relation ID (e.g. part_of)
- implied or asserted
An example of this file can be found here.
This table can be used to determine the relation between any two nodes in the GO (if a relation holds at all)
The table is generated using obo2linkfile (part of the core OBOEdit2 distribution).
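As a rough illustration of how a consumer might look up the relation between two nodes, the sketch below reads the file into a dictionary. It assumes the four columns appear in the order listed above (subject, target, relation, implied or asserted); if the actual file uses a different order, adjust the unpacking. The function names, the `EX:` identifiers and the literal values "asserted" and "implied" are placeholders, not taken from a real file.

```python
import csv
import io


def load_links(handle):
    """Index rows as {(subject, target): [(relation, status), ...]}.

    Assumes tab-delimited rows of: subject, target, relation, status.
    """
    links = {}
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 4:
            continue  # skip blank or malformed lines
        subject, target, relation, status = row[:4]
        links.setdefault((subject, target), []).append((relation, status))
    return links


def relations_between(links, subject, target):
    """Return all (relation, status) pairs from subject to target, or []."""
    return links.get((subject, target), [])


# Hypothetical sample data with placeholder identifiers.
sample = (
    "EX:1\tEX:2\tis_a\tasserted\n"
    "EX:1\tEX:3\tpart_of\timplied\n"
)
links = load_links(io.StringIO(sample))
print(relations_between(links, "EX:1", "EX:3"))
print(relations_between(links, "EX:3", "EX:1"))
```

An empty list means no relation holds in that direction. With a real file, replace the `io.StringIO` object with an open file handle.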
graph_path table
The graph_path table has been extended to include the relation:
    --- @@ graph_path.relationship_type_id
    --- References an entry in the term table corresponding
    --- to the INFERRED relation that holds between term2 and term1.
    --- At this time the value is always NULL - a blind transitive closure
    --- is calculated, ignoring the relationship_type_id in term2term.
    --- However, in future we want to calculate different closures for
    --- different relations. [See
    --- ]
    relationship_type_id integer,
    foreign key (relationship_type_id) references term(id),
This brings the graph_path table more in line with cvtermpath in Chado.
A new column, relation_distance, has been added. The current distance column will remain and keep the same semantics, i.e. the number of hops to get from a node (term2) to its ancestor (term1), regardless of relation. The new relation_distance column measures the number of hops over the specified relation only.
From the DDL docs:
    --- @@graph_path.distance
    --- The distance in terms of the number of "hops" between
    --- nodes in the asserted graph (term2term).
    --- The relationship_type_id is ignored here.
    --- Example: if A part_of B is_a C part_of D, then
    --- distance=3 for A part_of D
    distance integer,

    --- @@graph_path.relation_distance
    --- (added 2008-10-27)
    --- The distance in terms of the number of "hops" over
    --- relationship_type_id in the asserted graph (term2term).
    --- Example: if A part_of B is_a C part_of D, then
    --- relation_distance=2 for A part_of D
    relation_distance integer
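The difference between the two columns can be checked against the DDL example with a few lines of Python. This is one reading of the comments above, applied to a single asserted path. The function `path_distances` is hypothetical, and the sketch does not cover how the loader handles multiple paths between the same pair of terms.

```python
def path_distances(path_relations, inferred_relation):
    """Compute (distance, relation_distance) for one asserted path.

    distance          : every hop counts, whatever its relation.
    relation_distance : only hops over `inferred_relation` count.
    """
    distance = len(path_relations)
    relation_distance = sum(1 for r in path_relations if r == inferred_relation)
    return distance, relation_distance


# The DDL example: A part_of B is_a C part_of D, inferred link A part_of D.
result = path_distances(["part_of", "is_a", "part_of"], "part_of")
print(result)  # (3, 2)
```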
Using the closure
TBD with ontology/annotation group
There is an implicit relation between a gene product and a GO term. This has yet to be formalized; an initial sketch is below:
- has_function : for MF
- has_function_in_process : for BP
- has_function_in_location : for CC
Whilst this has yet to be finalized, what is clear is that the BP and CC relations are transitive_over part_of. This means it is valid to propagate the link up both is_a and part_of links.
e.g.
    G has_function_in_process A
    A is_a B
    B part_of C
    C is_a D
    D regulates E
    E is_a F
    =>
    A part_of C
    A part_of D
    =>
    G has_function_in_process D
When the exact relations are determined we will provide a relation composition table: given two relations, R1 and R2, what do we know about the composition R1 o R2?
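Until that table exists, the example above can be reproduced with a small sketch that uses an assumed composition table. The `COMPOSE` entries cover only is_a and part_of and are an assumption made for illustration. The table is not the official one, and links involving regulates are simply not composed here. The function names are hypothetical as well.

```python
# Assumed composition table for this sketch only: (R1, R2) -> R1 o R2.
# Pairs that are absent (e.g. anything with regulates) yield no inference.
COMPOSE = {
    ("is_a", "is_a"): "is_a",
    ("is_a", "part_of"): "part_of",
    ("part_of", "is_a"): "part_of",
    ("part_of", "part_of"): "part_of",
}


def deductive_closure(asserted):
    """Repeatedly compose adjacent links until no new link can be added."""
    links = set(asserted)
    changed = True
    while changed:
        changed = False
        for a, r1, b in list(links):
            for b2, r2, c in list(links):
                if b != b2:
                    continue
                r = COMPOSE.get((r1, r2))
                if r is not None and (a, r, c) not in links:
                    links.add((a, r, c))
                    changed = True
    return links


def propagate(annotations, links, over=("is_a", "part_of")):
    """Propagate gene product annotations up the relations named in `over`."""
    result = set(annotations)
    for gene, rel, term in annotations:
        for subject, relation, target in links:
            if subject == term and relation in over:
                result.add((gene, rel, target))
    return result


asserted = {
    ("A", "is_a", "B"),
    ("B", "part_of", "C"),
    ("C", "is_a", "D"),
    ("D", "regulates", "E"),
    ("E", "is_a", "F"),
}
links = deductive_closure(asserted)
annotated = propagate({("G", "has_function_in_process", "A")}, links)
print(sorted(annotated))
```

With these assumptions G is propagated to B, C and D but not to E or F, because the path to them crosses the regulates link.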
How it's calculated
Currently the blind transitive closure is calculated using Perl code in go-db-perl.
This code will be retired; instead we will use a reasoner, most likely the OboEdit reasoner. See the OE reasoner paper (Category:Reasoning). The OE reasoner has the option of wrapping other standard 3rd party reasoners. See also the Relation composition page.
The database will be populated by first running obo2linkfile on the main .obo file to generate the tab-delimited file above. A new loader script (load-linkfile-into-graph_path.pl) has been written to pull this into the database.
(Alternatively, OE can write directly to the database; we will probably go with the more loosely coupled approach at first.)
Basic Usage
Bcl2 is annotated to positive regulation of anti-apoptosis (GO:0045768). What is the relation between this term and apoptosis (GO:0006915)?
If we grep the table here (or query the graph_path table in the database) we see:
    GO:0045768 indirectly_regulates GO:0006915 implied link
    GO:0045768 indirectly_negatively_regulates GO:0006915 implied link
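For the database route, the query might look like the sketch below. It runs against an in-memory SQLite miniature of the schema rather than the real GO database. Only the columns needed for the lookup are created, and the two graph_path rows mirror the lines shown above. Storing the relation name in term.acc is an assumption of this sketch, and the query only returns rows once relationship_type_id is actually populated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE term (id INTEGER PRIMARY KEY, acc TEXT, name TEXT);
CREATE TABLE graph_path (
    term1_id INTEGER REFERENCES term(id),
    term2_id INTEGER REFERENCES term(id),
    -- inferred relation that holds between term2 and term1
    relationship_type_id INTEGER REFERENCES term(id)
);
""")
conn.executemany(
    "INSERT INTO term (id, acc, name) VALUES (?, ?, ?)",
    [
        (1, "GO:0045768", "positive regulation of anti-apoptosis"),
        (2, "GO:0006915", "apoptosis"),
        # Assumption: relations are rows in term, keyed by name in acc.
        (3, "indirectly_regulates", "indirectly_regulates"),
        (4, "indirectly_negatively_regulates", "indirectly_negatively_regulates"),
    ],
)
conn.executemany(
    "INSERT INTO graph_path (term1_id, term2_id, relationship_type_id) VALUES (?, ?, ?)",
    [(2, 1, 3), (2, 1, 4)],
)

QUERY = """
SELECT rel.acc
FROM graph_path AS gp
JOIN term AS subj ON subj.id = gp.term2_id
JOIN term AS obj  ON obj.id  = gp.term1_id
JOIN term AS rel  ON rel.id  = gp.relationship_type_id
WHERE subj.acc = ? AND obj.acc = ?
ORDER BY rel.acc
"""
rows = [r[0] for r in conn.execute(QUERY, ("GO:0045768", "GO:0006915"))]
print(rows)
```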
Advanced Usage
See the Relation composition page.