Transitive closure
Calculating the transitive closure: the old way
See Schema notes
The old way was to ignore the relation in the GO graph and calculate a blind transitive closure. This works fine so long as GO consists of only a few relations like is_a and part_of. However, it results in false positives when used with regulates. These are not so serious as we had previously been treating regulates as part_of anyway. However, the relation does need to be taken into account for other relations, escpecially has_part
Calculating the transitive closure: the new way
The new way will take relations and semantics of those relations into account. See Category:Reasoning. This may more accurately be called the deductive closure
3rd party consumers of the GO will have the option of calculating the closure themselves, or using a pre-computed closure. The pre-computed closure will be available as:
- A simple tab-delimited file
- in the graph_path table in the database
There are many advantages to the new system
- will not make erroneous calculations for relations such as has_part
- treats regulates correctly
- scales with new relations
Tab delimited file
The columns are
- subject GO ID (i.e. child)
- target GO ID (i.e. parent)
- relation ID (e.g. part_of)
- implied or asserted
An example of this file can be found here.
This table can be used to determine the relation between any two nodes in the GO (if a relation holds at all)
The table is generated using obo2linkfile (part of the core OBOEdit2 distribution)
graph_path table
The graph_path table has been extended to include the relation:
--- @@ graph_path.relationship_type_id --- References an entry in the term table corresponding --- to the INFERRED relation that holds between term2 and term1. --- At this time the value is always NULL - a blind transitive closure --- is calculated, ignoring the relationship_type_id in term2term. --- However, in future we want to calculate different closures for --- different relations. [See --- ] relationship_type_id integer, foreign key (relationship_type_id) references term(id),
This brings the graph_path table more in line with cvtermpath in Chado
A new column relation_distance has been added. The current distance column will remain, and have the same semantics (i.e. number of hops to get from a node (term2) to its descendant (term1), regardless of relation. The new relation_distance column measures the number of hops over the specified relation only.
From the DDL docs:
--- @@graph_path.distance --- The distance in terms of the number of "hops" between --- nodes in the asserted graph (term2term). --- The relationship_type_id is ignored here. --- Example: if A part_of B is_a C part_of D, then --- distance=3 for A part_of D distance integer,
--- @@graph_path.relation_distance --- (added 2008-10-27) --- The distance in terms of the number of "hops" over --- relationship_type_id in the asserted graph (term2term). --- Example: if A part_of B is_a C part_of D, then --- relation_distance=2 for A part_of D relation_distance integer
Using the closure
TDB with ontology/annotation group
There is an implicit relation between a gene product and a GO term. This has yet to be formalized, an initial sketch is below:
- has_function : for MF
- has_function_in_process : for BP
- has_function_in_location : for CC
Whilst this has yet to be finalized, what is clear is that the BP and CC relations are transitive_over part_of. This means it is valid to propagate the link up both is_a and part_of links.
e.g.
G has_function_in_process A A is_a B B part_of C C is_a D D regulates E E is_a F => A part_of C A part_of D => G has_function_in_process D
When the exact relations are determined we will provide a relation composition table - given two relations, R1 and R2, what do we know about the composition R1 o R2?
How it's calculated
Currently the blind transitive closure is calculated using perl code in go-db-perl
This code will be retired. There are 2 options for replacement:
- Use the OBO Edit Reasoner
- Use custom perl/SQL code
Using the OE reasoner has various advantages - we can leverage existing code. Also, the the OE reasoner can wrap standard 3rd party reasoners such as Pellet. See Category:Reasoning for more details.
The database would be populated by first running obo2linkfile on the main .obo file to generate the tab-del file above. A new loader script (load-linkfile-into-graph_path.pl) has been written to pull this into the database
(alternatively, OE can write directly to the database)
However, on balance it is likely we will use a lightweight perl/SQL approach. See the script go-db-reasoner.pl in go-db-perl/scripts
This has certain advantages:
- no need for OE configuration in production pipeline
- reasoner can easily be run on existing state of database
- scales with disk space, not memory
Basic Usage
Bcl2 is annotated to positive regulation of anti-apoptosis (GO:0045768). What is the relation between this term and apoptosis (GO:0006915)?
If we grep the table here (or query the graph_path table in the database) we see:
GO:0045768 indirectly_regulates GO:0006915 implied link GO:0045768 indirectly_negatively_regulates GO:0006915 implied link
Advanced Usage
See the Relation composition page