Transitive closure
Calculating the transitive closure: the old way
See Schema notes
The old way was to ignore the relations in the GO graph and calculate a blind transitive closure. This works fine so long as GO consists of only a few relations like is_a and part_of. However, it results in false positives when used with regulates. These are not so serious, as we had previously been treating regulates as part_of anyway. However, the relation does need to be taken into account for other relations, especially has_part.
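The problem with the blind approach can be illustrated with a minimal sketch. The function name `blind_closure` and the toy terms A, B and C are hypothetical and are not taken from go-db-perl; the point is only that ignoring the relation label yields an ancestor pair that does not follow logically.

```python
from collections import defaultdict

def blind_closure(edges):
    """Return every (descendant, ancestor) pair reachable by any chain of links.

    `edges` holds (child, relation, parent) triples. The relation is
    deliberately ignored, which is what makes the closure "blind".
    """
    parents = defaultdict(set)
    for child, _relation, parent in edges:
        parents[child].add(parent)

    closure = set()
    for start in list(parents):
        stack = list(parents[start])
        seen = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            closure.add((start, node))
            stack.extend(parents.get(node, ()))
    return closure

# Toy graph with hypothetical terms (not real GO IDs).
edges = [
    ("A", "is_a", "B"),
    ("B", "has_part", "C"),
]
pairs = blind_closure(edges)
# ("A", "C") is in `pairs`, so A is treated as a descendant of C even though
# "A is_a B" and "B has_part C" do not make A a kind or a part of C.
```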
Calculating the transitive closure: the new way
The new way will take relations, and the semantics of those relations, into account. See Category:Reasoning. This may more accurately be called the deductive closure.
3rd party consumers of the GO will have the option of calculating the closure themselves, or using a pre-computed closure. The pre-computed closure will be available as:
- A simple tab-delimited file
- in the graph_path table in the database
The new system has several advantages:
- will not make erroneous calculations for relations such as has_part
- treats regulates correctly
- scales with new relations
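A relation-aware closure can be sketched as repeated application of a composition table. The table `COMPOSE` below is an illustrative assumption covering only is_a and part_of; it is not the official GO composition table, and `deductive_closure` is a hypothetical name rather than the OBO-Edit or go-db-perl implementation.

```python
# Illustrative composition table: (R1, R2) -> R, meaning that
# "x R1 y" together with "y R2 z" implies "x R z".
# Only is_a / part_of rules are listed; this is an assumption for the sketch.
COMPOSE = {
    ("is_a", "is_a"): "is_a",
    ("is_a", "part_of"): "part_of",
    ("part_of", "is_a"): "part_of",
    ("part_of", "part_of"): "part_of",
}

def deductive_closure(edges, compose=COMPOSE):
    """Map each (subject, relation, target) link to 'asserted' or 'implied'."""
    links = {(s, r, t): "asserted" for s, r, t in edges}
    changed = True
    while changed:  # repeat until no new link can be derived
        changed = False
        for (s, r1, m) in list(links):
            for (m2, r2, t) in list(links):
                if m != m2:
                    continue
                r = compose.get((r1, r2))
                if r is not None and (s, r, t) not in links:
                    links[(s, r, t)] = "implied"
                    changed = True
    return links

# Toy graph with hypothetical terms (not real GO IDs).
edges = [
    ("A", "is_a", "B"),
    ("B", "part_of", "C"),
    ("C", "has_part", "D"),
]
links = deductive_closure(edges)
# "A part_of C" is derived and marked implied; nothing is derived between
# A and D, because the table has no rule for composing with has_part.
```

Adding a new relation then means adding rows to the table rather than changing the traversal, which is one way to read "scales with new relations".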
Tab delimited file
The columns are:
- subject GO ID (i.e. child)
- target GO ID (i.e. parent)
- relation ID (e.g. part_of)
- implied or asserted
An example of this file can be found here.
This table can be used to determine the relation between any two nodes in the GO (if a relation holds at all).
The table is generated using obo2linkfile (part of the core OBOEdit2 distribution).
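As a rough sketch of such a lookup, the snippet below indexes the file by (subject, target) pair. It assumes the four columns appear in the order listed above; if the actual obo2linkfile output orders them differently, the unpacking line needs adjusting. The function names and the `EX:` identifiers are hypothetical placeholders.

```python
import csv
import io

def load_links(handle):
    """Index rows as {(subject, target): [(relation, status), ...]}.

    Assumes columns: subject, target, relation, implied/asserted.
    """
    index = {}
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 4:
            continue  # skip blank or malformed lines
        subject, target, relation, status = row[:4]
        index.setdefault((subject, target), []).append((relation, status))
    return index

def relations_between(index, subject, target):
    """Return all relations holding from subject to target (empty if none)."""
    return index.get((subject, target), [])

# Made-up sample rows standing in for the real file.
sample = io.StringIO(
    "EX:0002\tEX:0001\tis_a\tasserted\n"
    "EX:0003\tEX:0001\tpart_of\timplied\n"
)
index = load_links(sample)
found = relations_between(index, "EX:0003", "EX:0001")
missing = relations_between(index, "EX:0001", "EX:0003")
```

With a real file, `io.StringIO(...)` would be replaced by an open file handle.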
graph_path table
The graph_path table has been extended to include the relation:
--- @@ graph_path.relationship_type_id
--- References an entry in the term table corresponding
--- to the INFERRED relation that holds between term2 and term1.
--- At this time the value is always NULL - a blind transitive closure
--- is calculated, ignoring the relationship_type_id in term2term.
--- However, in future we want to calculate different closures for
--- different relations. [See
--- ]
relationship_type_id integer,
foreign key (relationship_type_id) references term(id),
This brings the graph_path table more in line with cvtermpath in Chado.
A new column, relation_distance, has been added. The current distance column will remain, with the same semantics: the number of hops to get from a node (term2) to its ancestor (term1), regardless of relation. The new relation_distance column measures the number of hops over the specified relation only.
From the DDL docs:
--- @@graph_path.distance
--- The distance in terms of the number of "hops" between
--- nodes in the asserted graph (term2term).
--- The relationship_type_id is ignored here.
--- Example: if A part_of B is_a C part_of D, then
--- distance=3 for A part_of D
distance integer,
--- @@graph_path.relation_distance
--- (added 2008-10-27)
--- The distance in terms of the number of "hops" over
--- relationship_type_id in the asserted graph (term2term).
--- Example: if A part_of B is_a C part_of D, then
--- relation_distance=2 for A part_of D
relation_distance integer
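The two distance columns can be illustrated with a small sketch that follows the DDL comments above. The function `path_distances` is hypothetical and operates on a single asserted path given as a list of relation names; it is not how the database loader computes the values.

```python
def path_distances(path_relations, inferred_relation):
    """Return (distance, relation_distance) for one asserted path.

    distance          counts every hop, whatever its relation.
    relation_distance counts only hops over `inferred_relation`.
    """
    distance = len(path_relations)
    relation_distance = sum(1 for r in path_relations if r == inferred_relation)
    return distance, relation_distance

# The example from the DDL comments: A part_of B is_a C part_of D,
# giving the inferred link A part_of D.
result = path_distances(["part_of", "is_a", "part_of"], "part_of")
# result == (3, 2)
```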
Using the closure
TBD with ontology/annotation group
There is an implicit relation between a gene product and a GO term. This has yet to be formalized; an initial sketch is below:
- has_function : for MF
- has_function_in_process : for BP
- has_function_in_location : for CC
Whilst this has yet to be finalized, what is clear is that the BP and CC relations are transitive_over part_of. This means it is valid to propagate the link up both is_a and part_of links.
e.g.
G has_function_in_process A
A is_a B
B part_of C
C is_a D
D regulates E
E is_a F
=> A part_of C, A part_of D
=> G has_function_in_process D
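The propagation in this example can be sketched as a walk that follows only is_a and part_of links. The set `PROPAGATE_OVER` and the function `propagate` are hypothetical names, and the choice of relations reflects the provisional statement above rather than a finalized rule.

```python
# Relations the annotation is allowed to travel over (provisional assumption).
PROPAGATE_OVER = {"is_a", "part_of"}

def propagate(annotated_term, edges, propagate_over=PROPAGATE_OVER):
    """Return every term the annotation reaches, including the starting term."""
    parents = {}
    for child, relation, parent in edges:
        if relation in propagate_over:
            parents.setdefault(child, []).append(parent)

    reached = {annotated_term}
    stack = [annotated_term]
    while stack:
        term = stack.pop()
        for parent in parents.get(term, []):
            if parent not in reached:
                reached.add(parent)
                stack.append(parent)
    return reached

# The graph from the example above; G is annotated to A.
edges = [
    ("A", "is_a", "B"),
    ("B", "part_of", "C"),
    ("C", "is_a", "D"),
    ("D", "regulates", "E"),
    ("E", "is_a", "F"),
]
terms = propagate("A", edges)
# terms == {"A", "B", "C", "D"}: the walk stops at D because the
# D regulates E link is not in PROPAGATE_OVER.
```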
When the exact relations are determined we will provide a relation composition table - given two relations, R1 and R2, what do we know about the composition R1 o R2?
How it's calculated
Currently the blind transitive closure is calculated using perl code in go-db-perl.
This code will be retired. There are 2 options for replacement:
- Use the OBO Edit Reasoner
- Use custom perl/SQL code
Using the OE reasoner has various advantages: we can leverage existing code, and the OE reasoner can wrap standard 3rd party reasoners such as Pellet. See Category:Reasoning for more details.
The database would be populated by first running obo2linkfile on the main .obo file to generate the tab-delimited file above. A new loader script (load-linkfile-into-graph_path.pl) has been written to pull this into the database.
(Alternatively, OE can write directly to the database.)
However, on balance it is likely we will use a lightweight perl/SQL approach. See the script go-db-reasoner.pl in go-db-perl/scripts.
This has certain advantages:
- no need for OE configuration in production pipeline
- reasoner can easily be run on existing state of database
- scales with disk space, not memory
Basic Usage
Bcl2 is annotated to positive regulation of anti-apoptosis (GO:0045768). What is the relation between this term and apoptosis (GO:0006915)?
If we grep the table here (or query the graph_path table in the database) we see:
GO:0045768 indirectly_regulates GO:0006915 implied link
GO:0045768 indirectly_negatively_regulates GO:0006915 implied link
Advanced Usage
See the Relation composition page
