Regulation cross-products: Difference between revisions

From GO Wiki
No edit summary
 
(23 intermediate revisions by the same user not shown)
Line 1: Line 1:
For more background, see this presentation:
* [[Media:go-sw-report-2006-cambridge.ppt]]
'''Many of the items brought up on this page have been resolved'''.
You should instead look here:
* [[XP:biological_process_xp_regulation]]
The remainder of this page is out of date
The following has been resolved since this page was created:
* We use 3 relations, [[regulates]], [[negatively_regulates]] and [[positively_regulates]]
* [[regulates]] is [[transitive_over]] [[part_of]]
==Regulation cross-products==
==Regulation cross-products==


THESE INSTRUCTIONS ARE A LITTLE OUT OF DATE!!
The oboedit reasoner can be used to automatically manage regulation terms. This requires the generation of [[Internal_cross_products]]. We can do this semi-automatically, because the regulation terms have a consistent simple syntax.


!!!!!
this can serve as a trial run before implementing [[Cross_Product_Guide|cross-products with external ontologies]]
* See this presentation: [[Media:go-sw-report-2006-cambridge.ppt]]
!!!!!


As a first step we only consider ''Regulation of biological process''. For ongoing work see also:


The oboedit reasoner can be used to automatically manage regulation terms; this can serve as a trial run before implementing cross-products with external ontologies.
* [http://www.berkeleybop.org/obol/#biological_process_xp_regulation_of_function-obol Regulation of function]
* [http://www.berkeleybop.org/obol/#biological_process_xp_regulation_of_quality-obol Regulation of biological quality]
* [[XP:biological_process_xp_multi_organism_process]] - for interspecies regulation


For now we mainly consider internal regulation terms; ie regulation of biological process.
=== Resources ===


You will need the file IsaComplete-xp.obo, from:
You will need the file gene_ontology_xp.obo, from:
ftp://ftp.geneontology.org/pub/go/scratch


(this file should be updated daily from IsaComplete.obo, provided my cron works)
* [ftp://ftp.geneontology.org/pub/go/scratch go/scratch] directory
 
(this file should be updated daily from gene_ontology_edit.obo, provided my cron works)


You can load this into oboedit. Turn the reasoner on; then read on for an explanation
You can load this into oboedit. Turn the reasoner on; then read on for an explanation


===Logical (cross-product) Definitions===
=== Relations used ===


oboedit needs the definition (necessary and sufficient conditions) of regulation terms to be made explicit rather than buried in text. We do this using the new oboformat1.2 intersection_of (cross-product definition) tag.
* [[regulates]]
* [[negatively_regulates]]
* [[positively_regulates]]
 
== Cross-product Definitions==
 
oboedit needs the xp definition (necessary and sufficient conditions) of regulation terms to be made explicit rather than embedded in text. We do this using the new oboformat1.2 intersection_of (cross-product definition) tag.


The idea is that we define a term like ''negative regulation of synaptogenesis'' as being:
The idea is that we define a term like ''negative regulation of synaptogenesis'' as being:


- A ''negative regulation process'' in which the process being regulated is ''synaptogenesis''
- A '''negative regulation process''' that ''regulates'' '''synaptogenesis'''


This is an aristotelian (genus-differentia) definition. It can also be seen as the cross-product (intersection) of:
This is an aristotelian (genus-differentia) definition. It can also be seen as the cross-product (intersection) of:


# negative regulation of biological process
# '''negative regulation of biological process'''
# things that regulate synaptogenesis
# things that ''regulate'' '''synaptogenesis'''


Currently we do not have a ''regulates'' relation in GO. To keep things simple, we use the part_of relation for now, on the understanding we will change this to ''regulates'' in the future.
(This also involves introducing a new relation, ''regulates'')


Here is what this looks like in oboformat:
Here is what this looks like in oboformat:
Line 44: Line 69:
   is_a: GO:0051963    ! regulation of synaptogenesis
   is_a: GO:0051963    ! regulation of synaptogenesis
   intersection_of: GO:0048519    ! negative regulation of biological process
   intersection_of: GO:0048519    ! negative regulation of biological process
   intersection_of: part_of GO:0007416    ! synaptogenesis
   intersection_of: regulates GO:0007416    ! synaptogenesis




Here we have added two lines. These lines can be safely ignored by obof1.2 unaware parsers; they can be stripped out prior to making public if need be. However, they provide oboedit (or any other reasoner-aware tool, like Protege/SWOOP) with the information required to automatically manage the placement of these terms in the DAG.
Here we have added two lines. These lines can be safely ignored by obof1.2 unaware parsers; they can be stripped out prior to making public if need be. However, they provide oboedit (or any other reasoner-aware tool, like Protege/SWOOP) with the information required to automatically manage the placement of these terms in the DAG.
== Editing and browsing regulation cross-products ==
=== Oboedit1.1 ===
You can browse and edit the regulation xps in OE1.1
Just load the gene_ontology_xp.obo file from the scratch directory above


In the oboedit cross-product box, this should look like:
In the oboedit cross-product box, this should look like:
Line 54: Line 87:
   Differentia:  part_of synaptogenesis
   Differentia:  part_of synaptogenesis


===Using the reasoner===
=== Oboedit2.0 ===
 
oboedit2 has more advanced features to make editing cross-products easier
 
If you are using OboEdit2, you can use the [http://wiki.geneontology.org/index.php/OBO-Edit:Cross_products Cross Product Matrix Editor].


open the file in the reasoner, you should be able to see both missing is_a links (blue squiggly lines) as well as redundant links (straight red lines).
Here we see a relatively sparse area, around mitosis. Not all combinations are realized. Selecting groups of empty cells and clicking the "Make" button realizes the terms in the ontology, placing them correctly in the DAG.


[[Image:Regulation-with-reasoner.jpg]]
[[Image:xp-reg-meiotic.jpg]]
 
We can examine a more densely populated area, such as regulation of lymphocyte; clicking on a non-empty cell shows is_a children (pink) and is_a parents (yellow):
 
[[Image:reg-of-lymphocyte.jpg]]
 
=== Using the reasoner ===


here oboedit is saying that '''-reg of chemotaxis''' ''can be inferred to be an is_a child of'' '''-reg of behavior'''
Open the file with the reasoner turned on; you should be able to see both missing ''is_a links'' (blue squiggly lines) as well as redundant links (straight red lines).


oboedit can infer this from the logical definition (intersection_of lines). Note that the actual ''is_a'' links are not there in the underlying obo file - this is the ''inferred'' (implied) graph, not the ''asserted'' graph (to see the asserted graph, simply turn off the reasoner; you will see ''-reg of chemotaxis'' disappear as a child of ''-reg of behavior''). You can save the inferred graph as an asserted graph (ie write out all the inferred is_a links) using a special save option (JOHN - HOW?).
[[Image:Regulation-with-reasoner.jpg]]


Or you can use obo2obo; like this:
(See the presentation above for more up-to-date screenshots)


  obo2obo IsaComplete-xp.obo -o -saveimpliedlinks -realizeimpliedlinks IsaComplete-xp-with-implied-links-saved.obo
here oboedit is saying that '''-reg of chemotaxis''' ''can be inferred to be an is_a child of'' '''-reg of behavior'''


(this step is already done; see the scratch directory)
oboedit can infer this from the logical definition (intersection_of lines). Note that the actual ''is_a'' links are not there in the underlying obo file - this is the ''inferred'' (implied) graph, not the ''asserted'' graph (to see the asserted graph, simply turn off the reasoner; you will see ''-reg of chemotaxis'' disappear as a child of ''-reg of behavior'').


Here is another example:
Here is another example:
Line 76: Line 119:
you should be able to see the cross-product definition of the focused term. This term is not asserted to be a child of regulation of cellular process, but this is implied (see the DAG view on the right)
you should be able to see the cross-product definition of the focused term. This term is not asserted to be a child of regulation of cellular process, but this is implied (see the DAG view on the right)


Don't spend too long fixing these yet - this is a trial run. This is too reactive a process to incorporate into production; you want new regulation terms to go straight into the DAG. The next step will be to actually maintain the logical definitions in the gene_ontology_edit.obo file (but strip them from the public one since they are obof1.2). I'll send details later in the week. The basic idea is you will add regulation terms providing the minimal information and oboedit will do everything else.
Note the blue lines mean the reasoner knows the link should be there, but it has not been asserted.
 
In OE2 you can use the "assert implied links" option to fill in missing links:
 
[[Image:xp-assert-implied.jpg]]


Note that for now I've simply used the part_of relation in the logical definition - we can discuss whether this is a good time to move to regulates.


===How does this work===
== How was this generated? ==


How does this all work? I have a simplified version of obol that takes a term
How does this all work? I have a simplified version of obol that takes a term
Line 89: Line 135:


   genus: {-/+} regulation of biological process
   genus: {-/+} regulation of biological process
   differentia: part_of X
   differentia: regulates X


in obo format this is
in obo format this is


   intersection_of: GO:ID_for_{-/+}_regulation_of_BP
   intersection_of: GO:ID_for_{-/+}_regulation_of_BP
   intersection_of: part_of GO:ID_for_X
   intersection_of: regulates GO:ID_for_X


the oboedit reasoner does the rest
the oboedit reasoner does the rest


===Next steps===
== Next steps ==
 
We can continue with the process of generating the regulation-xps externally, and periodically running the reasoner to fill in missing links
 
However, it would be better to start managing the xps directly in the gene_ontology_edit.obo file. New regulation terms would be created using the cross-product interface, and placed in the DAG automatically.
 
== Reasoner details ==
 
Given the correct xp definitions, the reasoner can place the regulation terms correctly in the DAG
 
=== Skipped links ===
 
Not all intermediate links must be filled in the regulation DAG:
 
[[Image:stem-cell-maintenance.png]]
 
=== Transitivity over part_of ===
 
We must have consistent rules about what to do with regulation terms for cases where the regulated stand in part_of relations:
 
[[Image:regulation-of-immune.png]]
 
* 2008/01/14 : David/Tanya/Chris : decided regulation is always transitive over part_of. This implies RoMF and RoBP should not be disjoint
 
=== Links that cannot be implied ===
 
We can also look at regulation is_a links that '''cannot''' be implied by the reasoner:
 
[[Image:regulation-of-cell-adhesion-abductive.png]]
 


Currently this is an experimental trial which should hopefully help the IsaComplete process.
=== Function to process links ===


The next step could be to place the intersection_of lines directly in the main IsaComplete.obo file, rather than the separate IsaComplete-xp.obo file.
=== Regulation of biological quality ===


We could then move these into gene_ontology_edit.obo; we will need to check that the conversion process to gene_ontology.obo works fine.


Once the genus-differentia defs move to the live file, it will become best practice to maintain the genus-differentia links directly in the file (though an obol process could help check for errors). This may seem like extra work but in theory it should amount to a lot less. Adding a new regulation term (or 3) should simply be a matter of clicking on the core process term and making a few clicks. oboedit should do the rest. More exploratory work required here.
[[Category:Cross Products]]


Eventually it would be good to have an excel style table for xp editing in oboedit; imagine a table with 3 rows (reg, +reg, -reg) and a column for each core process (perhaps limited to a branch of the DAG), with each cell filled in with the ID of the xp term. New terms would be added simply by clicking on the empty cell, with names, synonyms and defs populated automatically.
[[Category:Regulation]]

Latest revision as of 16:21, 19 January 2009

For more background, see this presentation:

  • Media:go-sw-report-2006-cambridge.ppt

Many of the items brought up on this page have been resolved.

You should instead look here:

  • XP:biological_process_xp_regulation

The remainder of this page is out of date

The following has been resolved since this page was created:

  • We use 3 relations, regulates, negatively_regulates and positively_regulates
  • regulates is transitive_over part_of

Regulation cross-products

The oboedit reasoner can be used to automatically manage regulation terms. This requires the generation of Internal_cross_products. We can do this semi-automatically, because the regulation terms have a consistent simple syntax.

this can serve as a trial run before implementing cross-products with external ontologies

As a first step we only consider Regulation of biological process. For ongoing work see also:

  • Regulation of function: http://www.berkeleybop.org/obol/#biological_process_xp_regulation_of_function-obol
  • Regulation of biological quality: http://www.berkeleybop.org/obol/#biological_process_xp_regulation_of_quality-obol
  • XP:biological_process_xp_multi_organism_process - for interspecies regulation

Resources

You will need the file gene_ontology_xp.obo, from:

  • go/scratch directory: ftp://ftp.geneontology.org/pub/go/scratch

(this file should be updated daily from gene_ontology_edit.obo, provided my cron works)

You can load this into oboedit. Turn the reasoner on; then read on for an explanation

Relations used

  • regulates
  • negatively_regulates
  • positively_regulates

Cross-product Definitions

oboedit needs the xp definition (necessary and sufficient conditions) of regulation terms to be made explicit rather than embedded in text. We do this using the new oboformat1.2 intersection_of (cross-product definition) tag.

The idea is that we define a term like negative regulation of synaptogenesis as being:

- A negative regulation process that regulates synaptogenesis

This is an aristotelian (genus-differentia) definition. It can also be seen as the cross-product (intersection) of:

  1. negative regulation of biological process
  2. things that regulate synaptogenesis

(This also involves introducing a new relation, regulates)

Here is what this looks like in oboformat:

 [Term]
 id: GO:0051964
 name: negative regulation of synaptogenesis
 namespace: biological_process
 def: "Any process that stops, prevents or reduces the frequency, rate or extent of synaptogenesis, the formation of a synapse." []
 is_a: GO:0051961     ! negative regulation of nervous system development
 is_a: GO:0051963     ! regulation of synaptogenesis
 intersection_of: GO:0048519     ! negative regulation of biological process
 intersection_of: regulates GO:0007416     ! synaptogenesis


Here we have added two lines. These lines can be safely ignored by obof1.2 unaware parsers; they can be stripped out prior to making public if need be. However, they provide oboedit (or any other reasoner-aware tool, like Protege/SWOOP) with the information required to automatically manage the placement of these terms in the DAG.
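
To make the "two added lines" concrete, here is a minimal Python sketch of how a tool might pull the genus and differentia out of a stanza, and how the lines could be stripped before a public release. This is an illustration only, not how oboedit does it; the function names parse_logical_def and strip_logical_defs are made up for this example, and the stanza is a shortened copy of the one above.

```python
from typing import List, NamedTuple, Optional, Tuple


class LogicalDef(NamedTuple):
    genus: Optional[str]                 # ID of the genus term, if any
    differentia: List[Tuple[str, str]]   # (relation, filler ID) pairs


def parse_logical_def(stanza: str) -> LogicalDef:
    """Collect the intersection_of tags of one [Term] stanza (hypothetical helper)."""
    genus = None
    differentia = []
    for line in stanza.splitlines():
        line = line.split("!", 1)[0].strip()   # drop the trailing "! comment"
        if not line.startswith("intersection_of:"):
            continue
        parts = line[len("intersection_of:"):].split()
        if len(parts) == 1:                    # bare ID: the genus
            genus = parts[0]
        elif len(parts) == 2:                  # relation + ID: a differentia
            differentia.append((parts[0], parts[1]))
    return LogicalDef(genus, differentia)


def strip_logical_defs(stanza: str) -> str:
    """Remove intersection_of lines, e.g. before publishing (hypothetical helper)."""
    kept = [l for l in stanza.splitlines()
            if not l.lstrip().startswith("intersection_of:")]
    return "\n".join(kept)


STANZA = """\
[Term]
id: GO:0051964
name: negative regulation of synaptogenesis
is_a: GO:0051963     ! regulation of synaptogenesis
intersection_of: GO:0048519     ! negative regulation of biological process
intersection_of: regulates GO:0007416     ! synaptogenesis
"""

xp = parse_logical_def(STANZA)
print(xp.genus, xp.differentia)
print(strip_logical_defs(STANZA))
```

The bare ID is read as the genus and the relation/ID pair as the differentia, which mirrors the Genus/Differentia split shown in the oboedit cross-product box.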

Editing and browsing regulation cross-products

Oboedit1.1

You can browse and edit the regulation xps in OE1.1

Just load the gene_ontology_xp.obo file from the scratch directory above

In the oboedit cross-product box, this should look like:

 Genus: negative regulation of biological process
 Differentia:  regulates synaptogenesis

Oboedit2.0

oboedit2 has more advanced features to make editing cross-products easier

If you are using OboEdit2, you can use the Cross Product Matrix Editor.

Here we see a relatively sparse area, around mitosis. Not all combinations are realized. Selecting groups of empty cells and clicking the "Make" button realizes the terms in the ontology, placing them correctly in the DAG.

We can examine a more densely populated area, such as regulation of lymphocyte; clicking on a non-empty cell shows is_a children (pink) and is_a parents (yellow):
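
The matrix idea can be sketched in a few lines of Python: three rows (regulation, positive regulation, negative regulation) against a list of core processes, with a cell counted as filled when a term of that name already exists. This is only a toy model of the concept, not the Matrix Editor's implementation; the function names and the set of existing term names are made up for illustration.

```python
from typing import Dict, Iterable, List, Set, Tuple

# The three rows of the matrix, written as name prefixes.
ROWS = ("regulation of", "positive regulation of", "negative regulation of")


def xp_matrix(core_processes: Iterable[str],
              existing_names: Set[str]) -> Dict[Tuple[str, str], bool]:
    """Map each (row, core process) cell to True if the term already exists."""
    return {
        (row, process): f"{row} {process}" in existing_names
        for process in core_processes
        for row in ROWS
    }


def empty_cells(matrix: Dict[Tuple[str, str], bool]) -> List[str]:
    """Names of the terms that would be created by filling the empty cells."""
    return [f"{row} {process}"
            for (row, process), filled in matrix.items() if not filled]


# Hypothetical snapshot of which regulation terms exist.
existing = {"regulation of mitosis", "negative regulation of mitosis"}
matrix = xp_matrix(["mitosis"], existing)
print(empty_cells(matrix))
```

Each name returned by empty_cells corresponds to a cell you would select before clicking "Make".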

Using the reasoner

Open the file with the reasoner turned on; you should be able to see both missing is_a links (blue squiggly lines) as well as redundant links (straight red lines).

(See the presentation above for more up-to-date screenshots)

Here oboedit is saying that -reg of chemotaxis can be inferred to be an is_a child of -reg of behavior.

oboedit can infer this from the logical definition (intersection_of lines). Note that the actual is_a links are not there in the underlying obo file - this is the inferred (implied) graph, not the asserted graph (to see the asserted graph, simply turn off the reasoner; you will see -reg of chemotaxis disappear as a child of -reg of behavior).
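
The inference itself can be sketched as follows. This is a simplified illustration of the kind of rule a reasoner applies to cross-product definitions, not the actual oboedit algorithm: one xp term is inferred to be an is_a child of another if its genus is the same as (or a descendant of) the other's genus, and every differentia of the parent is matched by a differentia of the child with the same relation and the same or a more specific filler. The tiny graph below is a toy; it assumes chemotaxis sits under behavior, as the chemotaxis example on this page implies.

```python
from typing import Dict, List, Set, Tuple

XpDef = Tuple[str, List[Tuple[str, str]]]   # (genus, [(relation, filler), ...])


def ancestors(term: str, is_a: Dict[str, List[str]]) -> Set[str]:
    """The term itself plus everything reachable over asserted is_a links."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in is_a.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def implied_is_a(xps: Dict[str, XpDef],
                 is_a: Dict[str, List[str]]) -> Set[Tuple[str, str]]:
    """Infer (child, parent) links between terms that have xp definitions."""
    links = set()
    for child, (c_genus, c_diffs) in xps.items():
        for parent, (p_genus, p_diffs) in xps.items():
            if child == parent:
                continue
            genus_ok = p_genus in ancestors(c_genus, is_a)
            diffs_ok = all(
                any(rel == p_rel and p_filler in ancestors(filler, is_a)
                    for rel, filler in c_diffs)
                for p_rel, p_filler in p_diffs
            )
            if genus_ok and diffs_ok:
                links.add((child, parent))
    return links


# Toy asserted graph (assumption for illustration).
is_a = {"chemotaxis": ["behavior"]}
NEG = "negative regulation of biological process"
xps = {
    "negative regulation of chemotaxis": (NEG, [("regulates", "chemotaxis")]),
    "negative regulation of behavior": (NEG, [("regulates", "behavior")]),
}
inferred = implied_is_a(xps, is_a)
print(inferred)
```

Links in the inferred set that are missing from the asserted is_a graph correspond to the blue lines; turning the reasoner off is equivalent to looking at is_a alone.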

Here is another example:

You should be able to see the cross-product definition of the focused term. This term is not asserted to be a child of regulation of cellular process, but this is implied (see the DAG view on the right).

Note the blue lines mean the reasoner knows the link should be there, but it has not been asserted.

In OE2 you can use the "assert implied links" option to fill in missing links:


How was this generated?

How does this all work? I have a simplified version of obol that takes a term

 {-/+} regulation of X

And creates a logical definition

 genus: {-/+} regulation of biological process
 differentia: regulates X

in obo format this is

 intersection_of: GO:ID_for_{-/+}_regulation_of_BP
 intersection_of: regulates GO:ID_for_X

the oboedit reasoner does the rest
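
The name-to-definition step can be approximated with a regular expression. The sketch below is not obol; it is a minimal Python stand-in that assumes term names follow the "{positive/negative} regulation of X" pattern exactly. The name_to_id lookup is a hypothetical input, and apart from GO:0048519 (shown in the stanza on this page) the genus IDs are filled in from GO and should be checked against the current ontology.

```python
import re
from typing import Dict, List, Optional

# Genus IDs keyed by polarity. GO:0048519 appears on this page; the other
# two are assumptions taken from GO and should be verified before use.
GENUS_IDS = {
    "": "GO:0050789",           # regulation of biological process
    "positive": "GO:0048518",   # positive regulation of biological process
    "negative": "GO:0048519",   # negative regulation of biological process
}

PATTERN = re.compile(r"^(?:(positive|negative) )?regulation of (.+)$")


def logical_def_lines(name: str,
                      name_to_id: Dict[str, str]) -> Optional[List[str]]:
    """Return the two intersection_of lines for a regulation term name.

    Returns None when the name does not fit the pattern or the regulated
    process X is not found in name_to_id.
    """
    m = PATTERN.match(name)
    if m is None:
        return None
    polarity, target = m.group(1) or "", m.group(2)
    target_id = name_to_id.get(target)
    if target_id is None:
        return None
    return [
        f"intersection_of: {GENUS_IDS[polarity]}",
        f"intersection_of: regulates {target_id}",
    ]


lines = logical_def_lines("negative regulation of synaptogenesis",
                          {"synaptogenesis": "GO:0007416"})
print("\n".join(lines))
```

Names that do not parse are simply skipped, which is why this can only ever be semi-automatic: anything with irregular wording still needs a curator.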

Next steps

We can continue with the process of generating the regulation-xps externally, and periodically running the reasoner to fill in missing links

However, it would be better to start managing the xps directly in the gene_ontology_edit.obo file. New regulation terms would be created using the cross-product interface, and placed in the DAG automatically.

Reasoner details

Given the correct xp definitions, the reasoner can place the regulation terms correctly in the DAG

Skipped links

Not all intermediate links must be filled in the regulation DAG:

Transitivity over part_of

We must have consistent rules about what to do with regulation terms for cases where the regulated processes stand in part_of relations:

  • 2008/01/14 : David/Tanya/Chris : decided regulation is always transitive over part_of. This implies RoMF and RoBP should not be disjoint
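
As a rough illustration of what "transitive over part_of" means in practice: if X regulates A and A is part_of B, then X regulates B. The Python sketch below applies that rule to a toy graph. The part_of link used here (synaptogenesis part_of nervous system development) is assumed for the sake of the example, and the function name is made up.

```python
from typing import Dict, Iterable, List, Set, Tuple


def regulates_closure(regulates: Iterable[Tuple[str, str]],
                      part_of: Dict[str, List[str]]) -> Set[Tuple[str, str]]:
    """Extend each (regulator, target) pair up the part_of hierarchy."""
    inferred = set()
    for regulator, target in regulates:
        seen = {target}
        stack = [target]
        while stack:                      # walk every part_of ancestor of the target
            for whole in part_of.get(stack.pop(), []):
                if whole not in seen:
                    seen.add(whole)
                    stack.append(whole)
        inferred.update((regulator, t) for t in seen)
    return inferred


# Toy data (assumption for illustration only).
part_of = {"synaptogenesis": ["nervous system development"]}
regulates = [("regulation of synaptogenesis", "synaptogenesis")]
closure = regulates_closure(regulates, part_of)
print(closure)
```

Under this rule a term that regulates a part is also treated as regulating the whole, which is the behaviour the 2008/01/14 decision commits to.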

Links that cannot be implied

We can also look at regulation is_a links that cannot be implied by the reasoner:


Function to process links

Regulation of biological quality