XP:biological process xp chebi: Difference between revisions
(New page: == Presentations == * http://www.slideshare.net/cmungall/go-chebi-xp-intro From 2007 Category:Cross Products) |
'''NOTE''': the logical definitions are moving from their location in the scratch directory on cvs to owl ontologies in the extensions directory on svn. Please refer to [[Ontology_extensions]] for a summary of changes.
== Synopsis ==
Logical definitions for GO Biological Process and Molecular Function using CHEBI. The current set is maintained by Mike Bada in the Hunter Group at UCDHSC.
== Introduction ==
These cross-products (XPs) define GO terms in terms of chemical entities. For example:
 [Term]
 id: GO:0000273
 name: lipoic acid metabolic process
 intersection_of: GO:0008152 ! metabolic process
 intersection_of: has_participant CHEBI:16494 ! lipoic acid
i.e. ''lipoic acid metabolic process'' =<sub>def</sub> a '''metabolic process''' ''that'' [[has_participant]] '''lipoic acid'''
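The genus-differentia reading above can be sketched in code. The following is a minimal illustration, assuming only the stanza shown in this section; <code>parse_logical_definition</code> is a hypothetical helper written for this page, not part of any GO or OBO tool, and it handles only the tags used in the example.

```python
# Minimal sketch: split the intersection_of lines of one OBO [Term] stanza
# into a genus (bare class ID) and differentiae (relation, filler) pairs.
# parse_logical_definition is a hypothetical helper, not an existing GO tool.

STANZA = """\
[Term]
id: GO:0000273
name: lipoic acid metabolic process
intersection_of: GO:0008152 ! metabolic process
intersection_of: has_participant CHEBI:16494 ! lipoic acid
"""


def parse_logical_definition(stanza):
    """Return (term_id, genus, differentiae) for a single [Term] stanza."""
    term_id = None
    genus = None
    differentiae = []
    for line in stanza.splitlines():
        # Drop the trailing "! label" comment; it carries no semantics.
        line = line.split("!", 1)[0].strip()
        if line.startswith("id:"):
            term_id = line[len("id:"):].strip()
        elif line.startswith("intersection_of:"):
            parts = line[len("intersection_of:"):].split()
            if len(parts) == 1:
                genus = parts[0]  # bare class ID: the genus
            elif len(parts) == 2:
                differentiae.append((parts[0], parts[1]))  # relation, filler
    return term_id, genus, differentiae


term_id, genus, differentiae = parse_logical_definition(STANZA)
print(term_id, "=def", genus, "that", differentiae)
```

Here the genus is <code>GO:0008152</code> and the single differentia is the pair (<code>has_participant</code>, <code>CHEBI:16494</code>), matching the informal definition above.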
== Status ==
No editor has been assigned to this set as yet.
Results so far show a large divergence between CHEBI and the implicit chemical entity ontology in GO.
== Availability ==
The logical definitions are separate bridge files in the [http://viewvc.geneontology.org/viewvc/GO-SVN/trunk/ontology/extensions/ ontology/extensions/] directory in svn.
The source file (modified by ontology editors and [[TermGenie]]) is [http://viewvc.geneontology.org/viewvc/GO-SVN/trunk/ontology/editors/go_xp_chebi.obo editors/go_xp_chebi.obo].
This is translated to http://purl.obolibrary.org/obo/go/extensions/x-chemical.owl every night.
To view the logical definitions in Protege, you are advised to use the importer ontology:
* http://purl.obolibrary.org/obo/go/extensions/x-chemical-importer.owl
Note that the [[XP:molecular_function_xp_chebi]] ontology retains its source in cvs for the time being.
== Results ==
=== Relations used ===
The UCDHSC cross-products use a large set of relations; see:
* [http://www.berkeleybop.org/ontologies/obo-all/ro_ucdhsc/ro_ucdhsc.obo.html ro_ucdhsc] (or the [http://www.berkeleybop.org/ontologies/obo-all/ro_ucdhsc/ro_ucdhsc.obo source])
Examples of use:
 [Term]
 id: GO:0000316
 name: sulfite transport
 intersection_of: GO:0006810 ! transport
 intersection_of: results_in_directed_movement_of CHEBI:17359 ! sulfite

 [Term]
 id: GO:0051381
 name: histamine binding
 intersection_of: GO:0005488 ! binding
 intersection_of: results_in_joining_of CHEBI:18295 ! histamine
The argument is that the relations are required for providing xp definitions that are necessary ''and sufficient'' (all definitions are by definition necessary and sufficient). For example, if we instead said:
 [Term]
 id: GO:0000316
 name: sulfite transport
 intersection_of: GO:0006810 ! transport
 intersection_of: has_participant CHEBI:17359 ! sulfite
This would be wrong as a definition: the condition is necessary but not sufficient, because any transport process in which sulfite merely takes part would satisfy it. We need to indicate the ''role'' that sulfite plays in transport, i.e. it is the entity that is transported, as opposed to the location that it is transported to, or the entity that initiates the transport.
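The sufficiency problem can be made concrete with a small sketch. The process below is invented purely for illustration (it is not a real GO term), <code>EX:other_cargo</code> is a placeholder identifier, and <code>satisfies</code> is a hypothetical helper that checks a genus-differentia definition against explicitly listed links, with no reasoning over subclasses.

```python
# Sketch: a has_participant differentia is too weak to be sufficient.
# The process and the EX:other_cargo identifier are made up for illustration.

def satisfies(process, genus, relation, filler):
    """True if the process meets the genus-differentia definition."""
    return genus in process["is_a"] and (relation, filler) in process["links"]


# A hypothetical transport process in which sulfite takes part
# (for example as a co-factor) but something else is what gets moved.
other_transport = {
    "is_a": {"GO:0006810"},  # transport
    "links": {
        ("has_participant", "CHEBI:17359"),  # sulfite participates
        ("has_participant", "EX:other_cargo"),
        ("results_in_directed_movement_of", "EX:other_cargo"),
    },
}

# Weak definition: transport that has_participant sulfite.
weak_match = satisfies(
    other_transport, "GO:0006810", "has_participant", "CHEBI:17359")

# Specific definition: transport that results_in_directed_movement_of sulfite.
strong_match = satisfies(
    other_transport, "GO:0006810",
    "results_in_directed_movement_of", "CHEBI:17359")

print("weak definition matches:", weak_match)
print("specific definition matches:", strong_match)
```

Under the weak definition the invented process would be classified as ''sulfite transport'' even though sulfite is not what is moved; the more specific relation excludes it.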
At the [http://www.bioontology.org/wiki/index.php/OntologyRelations RO meeting], some participants were not keen on having a large collection of hard-to-define relations in RO. Alan Ruttenberg showed an alternative way of defining biochemical processes using a minimal set of relations, OWL class expressions and an ontology of dispositions (e.g. the disposition to be a substrate). This approach needs to be made more concrete before it can be compared with the Bada/UCDHSC method. In the meantime, it was decided that the UCDHSC relations and xps were useful ''today'', so they would live in a separate RO called MACRO; these relations can be ''expanded'' to the longer Ruttenberg-style expressions once that approach has been fleshed out.
=== CHEBI terms ===
Occasionally a GO process references an entity that Mike identified as corresponding to the set-union of several classes in CHEBI; for example:
<pre>
[Term]
id: GO:0015962
name: diadenosine triphosphate metabolic process
intersection_of: GO:0008152 ! metabolism
intersection_of: has_participant anon_chemical_16

[Term]
id: anon_chemical_16
namespace: chebi_ontology
is_anonymous: true
union_of: CHEBI:27775 ! P(1),P(3)-bis(5'-adenosyl) triphosphate
union_of: CHEBI:27529 ! P(1),P(2)-bis(5'-adenosyl) triphosphate
</pre>
GO defines this as: ''The chemical reactions and pathways involving diadenosine triphosphate, a derivative of the nucleoside adenosine with three phosphate groups attached''.
CHEBI does not have a term '''diadenosine triphosphate''', so the anonymous union class stands in for it.
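One way to read the anonymous class is as a named set of alternatives that is expanded wherever it appears as a filler. The sketch below assumes that reading; <code>UNIONS</code>, <code>expand_filler</code> and <code>matches_filler</code> are hypothetical names, and the check deliberately ignores CHEBI subclasses of the union members.

```python
# Sketch: treat an anonymous union_of class as a set of named CHEBI classes.
# Only the union from the stanza above is encoded; subclass reasoning is omitted.

UNIONS = {
    "anon_chemical_16": {"CHEBI:27775", "CHEBI:27529"},
}


def expand_filler(filler, unions=UNIONS):
    """Return the set of named classes that a filler stands for."""
    # A filler that is not an anonymous union simply stands for itself.
    return set(unions.get(filler, {filler}))


def matches_filler(chemical, filler, unions=UNIONS):
    """True if the chemical is one of the classes the filler expands to."""
    return chemical in expand_filler(filler, unions)


print(sorted(expand_filler("anon_chemical_16")))
print(matches_filler("CHEBI:27775", "anon_chemical_16"))
print(matches_filler("CHEBI:16494", "anon_chemical_16"))
```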
=== Misalignments and reasoner results ===
* Reasoner results: [http://www.geneontology.org/scratch/xps/biological_process_xp_chebi-newlinks.txt biological_process_xp_chebi-newlinks.txt]
To see these, load the imports file above, run the rule-based reasoner, then "assert implied links".
* Misalignments: [http://www.geneontology.org/scratch/xps/biological_process_xp_chebi-misalign.txt scratch/xps/biological_process_xp_chebi-misalign.txt]
As of 2009-06-25, there are 1117 is_a links that should be added to GO according to CHEBI and 754 is_a links in CHEBI not reflected in multiple places in GO.
The structure of the above report is paired triples: either an is_a relationship in GO for which we would expect an is_a relationship in CHEBI or, conversely, an is_a relationship in CHEBI not reflected in GO.
There is some repetition in the reports; for example, there are many entities classified under "heterocycle" in CHEBI for which this is not reflected in the metabolism, biosynthesis, etc. relationships in GO.
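The paired-triple comparison can be sketched as follows. This is an illustrative reconstruction, not the script that produced the reports: all identifiers are toy placeholders, <code>find_misalignments</code> is a hypothetical function, and only direct is_a links are compared (no transitive closure).

```python
# Sketch: compare is_a links in GO with those in CHEBI via the xp definitions.
# All IDs are toy placeholders; only direct (asserted) is_a links are used.

def find_misalignments(xps, go_is_a, chebi_is_a):
    """Return (missing_in_go, missing_in_chebi) as lists of paired triples."""
    # Index GO terms by their (genus, relation, chemical) definition.
    by_definition = {definition: term for term, definition in xps.items()}

    missing_in_go = []
    for term, (genus, relation, chem) in xps.items():
        for parent_chem in sorted(chebi_is_a.get(chem, ())):
            parent_term = by_definition.get((genus, relation, parent_chem))
            if parent_term and (term, parent_term) not in go_is_a:
                missing_in_go.append(
                    ((term, "is_a", parent_term), (chem, "is_a", parent_chem)))

    missing_in_chebi = []
    for child, parent in sorted(go_is_a):
        if child in xps and parent in xps:
            genus_c, rel_c, chem_c = xps[child]
            genus_p, rel_p, chem_p = xps[parent]
            same_frame = (genus_c, rel_c) == (genus_p, rel_p)
            if same_frame and chem_p not in chebi_is_a.get(chem_c, ()):
                missing_in_chebi.append(
                    ((child, "is_a", parent), (chem_c, "is_a", chem_p)))
    return missing_in_go, missing_in_chebi


# Toy input: three GO terms defined as "metabolic process has_participant X".
xps = {
    "GO:t1": ("GO:0008152", "has_participant", "CHEBI:c1"),
    "GO:t2": ("GO:0008152", "has_participant", "CHEBI:c2"),
    "GO:t3": ("GO:0008152", "has_participant", "CHEBI:c3"),
}
chebi_is_a = {"CHEBI:c1": {"CHEBI:c2"}}  # c1 is_a c2 in CHEBI
go_is_a = {("GO:t3", "GO:t2")}           # t3 is_a t2 in GO

missing_in_go, missing_in_chebi = find_misalignments(xps, go_is_a, chebi_is_a)
for pair in missing_in_go + missing_in_chebi:
    print(pair)
```

Because a single CHEBI link such as one under "heterocycle" pairs with every GO genus (metabolism, biosynthesis, and so on), this style of comparison naturally produces the repetition noted above.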
=== Tracker Items ===
CHEBI tracker:
* [http://sourceforge.net/tracker/index.php?func=detail&aid=1695784&group_id=36855&atid=440764 roles and the is_a relation]
GO tracker:
* [http://sourceforge.net/tracker/?func=detail&aid=2812369&group_id=36855&atid=440764 misalignments]
=== Remaining work ===
* improve the XPs
* to what extent does is_a-overloading in CHEBI cause erroneous inferences about biological processes? (these concern me even more than the is_a complete classes)
* can we use abduction / "reverse reasoning" to suggest placement for CHEBI orphans based on the implicit chemical entity hierarchy in GO?
* does expanding the much-debated "macro" relations used in the current xps into complex OWL expressions (as recommended by Alan Ruttenberg) result in useful inferences?
* what kinds of interesting biological questions can be answered by the combination of the two resources plus the xps, using either the P4 or obo query interfaces?
* can we identify areas of CHEBI in which new relations, necessary and sufficient conditions, etc. will enable inferences of use to GO?
== Presentations ==
* http://www.slideshare.net/cmungall/go-chebi-xp-intro
From 2007

See also: [[CJM_EBI_Visit_June_2009#GO-ChEBI_xps]]

[[Category:Cross Products]]
[[Category:Chemical Entities]]
Latest revision as of 18:46, 23 May 2012