XP:biological process xp chebi: Difference between revisions

From GO Wiki
(New page: == Presentations == * http://www.slideshare.net/cmungall/go-chebi-xp-intro From 2007 Category:Cross Products)
 
 
(32 intermediate revisions by the same user not shown)
Line 1: Line 1:
'''NOTE''' the logical definitions are moving from their location in the scratch directory on cvs to owl ontologies in the extensions directory on svn. Please refer to [[Ontology_extensions]] for a summary of changes.
== Synopsis ==
Logical definitions for GO Biological Process and Molecular Function using CHEBI. The current set is maintained by Mike Bada in the Hunter Group at UCDHSC.
== Introduction ==
XPs defining GO terms in terms of chemical entities; example:
  [Term]
  id: GO:0000273
  name: lipoic acid metabolic process
  intersection_of: GO:0008152 ! metabolic process
  intersection_of: has_participant CHEBI:16494 ! lipoic acid
i.e. ''lipoic acid metabolic process'' =<sub>def</sub> a '''metabolic process''' ''that'' [[has_participant]]  '''lipoic acid'''
== Status ==
No editor has been assigned to this set as yet.
Results so far show a large divergence between CHEBI and the implicit chemical entity ontology in GO.
== Availability ==
The logical definitions are separate bridge files in the [http://viewvc.geneontology.org/viewvc/GO-SVN/trunk/ontology/extensions/ ontology/extensions/] directory in svn.
The source file (modified by ontology editors and [[TermGenie]]) is
[http://viewvc.geneontology.org/viewvc/GO-SVN/trunk/ontology/editors/go_xp_chebi.obo editors/go_xp_chebi.obo]
This is translated to http://purl.obolibrary.org/obo/go/extensions/x-chemical.owl every night.
To view the logical definitions in Protege, you are advised to use the importer ontology:
* http://purl.obolibrary.org/obo/go/extensions/x-chemical-importer.owl
Note the [[XP:molecular_function_xp_chebi]] ontology retains its source in cvs for the time being.
== Results ==
=== Relations used ===
The UCDHSC cross-products use a large set of relations; see:
* [http://www.berkeleybop.org/ontologies/obo-all/ro_ucdhsc/ro_ucdhsc.obo.html ro_ucdhsc] (or the [http://www.berkeleybop.org/ontologies/obo-all/ro_ucdhsc/ro_ucdhsc.obo source])
Examples of use:
  [Term]
  id: GO:0000316
  name: sulfite transport
  intersection_of: GO:0006810 ! transport
  intersection_of: results_in_directed_movement_of CHEBI:17359 ! sulfite
  [Term]
  id: GO:0051381
  name: histamine binding
  intersection_of: GO:0005488 ! binding
  intersection_of: results_in_joining_of CHEBI:18295 ! histamine
The argument is that the relations are required for providing xp definitions that are necessary ''and sufficient'' (all definitions are by definition necessary and sufficient). For example, if we instead said:
  [Term]
  id: GO:0000316
  name: sulfite transport
  intersection_of: GO:0006810 ! transport
  intersection_of: has_participant CHEBI:17359 ! sulfite
This would be wrong as a definition. We need to indicate the ''role'' that sulfite plays in transport - i.e. it is the entity that is transported, as opposed to the location that it is transported to, or the entity that initiates the transport.
At the [http://www.bioontology.org/wiki/index.php/OntologyRelations RO meeting] some participants were not keen on having a large collection of hard-to-define relations in RO. Alan Ruttenberg showed an alternative way of defining biochemical processes using a minimal set of relations, OWL class expressions and an ontology of dispositions (e.g. the disposition to be a substrate). That approach needs to be made more concrete before it can be compared with the Bada/UCDHSC method. In the meantime, it was decided that the UCDHSC relations and xps were useful ''today'', so they would live in a separate RO called MACRO; these relations can be ''expanded'' to the longer Ruttenberg-style expressions once that approach has been fleshed out.
=== CHEBI terms ===
On occasion there are entities referenced in GO processes that Mike identified as corresponding to the set-union of classes in CHEBI; for example:
<pre>
[Term]
id: GO:0015962
name: diadenosine triphosphate metabolic process
intersection_of: GO:0008152 ! metabolism
intersection_of: has_participant anon_chemical_16
[Term]
id: anon_chemical_16
namespace: chebi_ontology
is_anonymous: true
union_of: CHEBI:27775 ! P(1),P(3)-bis(5'-adenosyl)  triphosphate
union_of: CHEBI:27529 ! P(1),P(2)-bis(5'-adenosyl)  triphosphate
</pre>
GO defines this as ''The chemical reactions and pathways involving diadenosine triphosphate, a derivative of the nucleoside adenosine with three phosphate groups attached''
CHEBI does not have a term '''diadenosine triphosphate'''
=== Misalignments and reasoner results ===
* Reasoner results: [http://www.geneontology.org/scratch/xps/biological_process_xp_chebi-newlinks.txt biological_process_xp_chebi-newlinks.txt]
To see these, load the imports file above, run the rule based reasoner, then "assert implied links"
* Misalignments: [http://www.geneontology.org/scratch/xps/biological_process_xp_chebi-misalign.txt scratch/xps/biological_process_xp_chebi-misalign.txt]
As of 2009-06-25, there are 1117 is_a links that should be added to GO according to CHEBI, and 754 is_a links in CHEBI that are not reflected in multiple places in GO.
The above report is structured as paired triples: either an is_a relationship in GO for which we would expect a corresponding is_a relationship in CHEBI, or conversely an is_a relationship in CHEBI that is not reflected in GO.
There is some repetition in the reports; for example, there are many entities classified under "heterocycle" in CHEBI for which this is not reflected in metabolism, biosynthesis etc relationships in GO.
=== Tracker Items ===
CHEBI tracker:
* [http://sourceforge.net/tracker/index.php?func=detail&aid=1695784&group_id=36855&atid=440764  roles and the is_a relation]
GO tracker:
* [http://sourceforge.net/tracker/?func=detail&aid=2812369&group_id=36855&atid=440764 misalignments]
=== Remaining work ===
* improve the XPs
* to what extent does is_a-overloading in CHEBI cause erroneous inferences about biological processes? (these concern me even more than the is_a complete classes)
* can we use abduction / "reverse reasoning" to suggest placement for CHEBI orphans based on the implicit chemical entity hierarchy in GO?
* does expanding the much-debated "macro" relations used in the current xps into complex OWL expressions (as recommended by Alan Ruttenberg) result in useful inferences?
* what kinds of interesting biological questions can be answered by the combination of the two resources plus the xps, using either the P4 or obo query interfaces?
* can we identify areas of CHEBI in which new relations, necessary and sufficient conditions etc will enable inferences of use to GO?


== Presentations ==
Line 5: Line 134:


From 2007
See also: [[CJM_EBI_Visit_June_2009#GO-ChEBI_xps]]


[[Category:Cross Products]]
[[Category:Chemical Entities]]

Latest revision as of 18:46, 23 May 2012

NOTE the logical definitions are moving from their location in the scratch directory on cvs to owl ontologies in the extensions directory on svn. Please refer to Ontology_extensions for a summary of changes.

Synopsis

Logical definitions for GO Biological Process and Molecular Function using CHEBI. The current set is maintained by Mike Bada in the Hunter Group at UCDHSC.

Introduction

XPs defining GO terms in terms of chemical entities; example:

 [Term]
 id: GO:0000273
 name: lipoic acid metabolic process
 intersection_of: GO:0008152 ! metabolic process
 intersection_of: has_participant CHEBI:16494 ! lipoic acid

i.e. lipoic acid metabolic process =def a metabolic process that has_participant lipoic acid
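
The reading above can be reproduced mechanically. The following is a minimal Python sketch, not part of the GO tooling: the function name parse_stanza and the returned dictionary layout are assumptions made for illustration. It splits a [Term] stanza into its genus (the bare intersection_of line) and its differentiae (the relation/filler pairs).

```python
STANZA = """
[Term]
id: GO:0000273
name: lipoic acid metabolic process
intersection_of: GO:0008152 ! metabolic process
intersection_of: has_participant CHEBI:16494 ! lipoic acid
"""

def parse_stanza(text):
    """Parse one OBO [Term] stanza into id, name, genus and differentiae."""
    term = {"id": None, "name": None, "genus": None, "differentiae": []}
    for raw in text.strip().splitlines():
        line = raw.split(" ! ")[0].strip()  # drop the trailing "! label" comment
        if ":" not in line:
            continue  # skips the "[Term]" header line
        tag, value = line.split(":", 1)
        value = value.strip()
        if tag in ("id", "name"):
            term[tag] = value
        elif tag == "intersection_of":
            parts = value.split()
            if len(parts) == 1:
                term["genus"] = parts[0]  # bare class ID: the genus
            else:
                term["differentiae"].append((parts[0], parts[1]))  # relation, filler
    return term

term = parse_stanza(STANZA)
relation, filler = term["differentiae"][0]
print(f'{term["id"]} =def a {term["genus"]} that {relation} {filler}')
```

The sketch handles only the tags used in the example and ignores everything else in a real OBO file.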

Status

No editor has been assigned to this set as yet.

Results so far show a large divergence between CHEBI and the implicit chemical entity ontology in GO.

Availability

The logical definitions are separate bridge files in the ontology/extensions/ directory in svn.

The source file (modified by ontology editors and TermGenie) is editors/go_xp_chebi.obo

This is translated to http://purl.obolibrary.org/obo/go/extensions/x-chemical.owl every night.

To view the logical definitions in Protege, you are advised to use the importer ontology:

  • http://purl.obolibrary.org/obo/go/extensions/x-chemical-importer.owl

Note the XP:molecular_function_xp_chebi ontology retains its source in cvs for the time being.

Results

Relations used

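The UCDHSC cross-products use a large set of relations; see:

  • ro_ucdhsc: http://www.berkeleybop.org/ontologies/obo-all/ro_ucdhsc/ro_ucdhsc.obo.html (source: http://www.berkeleybop.org/ontologies/obo-all/ro_ucdhsc/ro_ucdhsc.obo)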

Examples of use:

 [Term]
 id: GO:0000316
 name: sulfite transport
 intersection_of: GO:0006810 ! transport
 intersection_of: results_in_directed_movement_of CHEBI:17359 ! sulfite
 [Term]
 id: GO:0051381
 name: histamine binding
 intersection_of: GO:0005488 ! binding
 intersection_of: results_in_joining_of CHEBI:18295 ! histamine

The argument is that the relations are required for providing xp definitions that are necessary and sufficient (all definitions are by definition necessary and sufficient). For example, if we instead said:

 [Term]
 id: GO:0000316
 name: sulfite transport
 intersection_of: GO:0006810 ! transport
 intersection_of: has_participant CHEBI:17359 ! sulfite

This would be wrong as a definition. We need to indicate the role that sulfite plays in transport - i.e. it is the entity that is transported, as opposed to the location that it is transported to, or the entity that initiates the transport.
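
A toy Python sketch may make the problem concrete. Everything here is hypothetical except the GO and CHEBI IDs taken from the stanzas above: the instance layout, the IMPLIES table (which assumes the specific relation entails has_participant) and the placeholder EX:other_chemical are illustration only. Because an intersection_of definition is treated as sufficient, any transport instance meeting the conditions is classified as sulfite transport.

```python
# Assumption for this sketch: the specific relation entails has_participant.
IMPLIES = {"results_in_directed_movement_of": "has_participant"}

def closure(facts):
    """Add the weaker facts entailed by each (relation, chemical) pair."""
    closed = set(facts)
    for rel, chem in facts:
        if rel in IMPLIES:
            closed.add((IMPLIES[rel], chem))
    return closed

def satisfies(instance, definition):
    """True if the instance meets the genus and the differentia of the definition."""
    genus, rel, chem = definition
    return instance["type"] == genus and (rel, chem) in closure(instance["facts"])

WEAK = ("GO:0006810", "has_participant", "CHEBI:17359")
STRICT = ("GO:0006810", "results_in_directed_movement_of", "CHEBI:17359")

# A transport process in which sulfite is the thing being moved.
sulfite_moved = {
    "type": "GO:0006810",
    "facts": {("results_in_directed_movement_of", "CHEBI:17359")},
}
# Hypothetical: another chemical is moved; sulfite takes part in some other role.
sulfite_bystander = {
    "type": "GO:0006810",
    "facts": {
        ("results_in_directed_movement_of", "EX:other_chemical"),
        ("has_participant", "CHEBI:17359"),
    },
}

for name, inst in [("sulfite_moved", sulfite_moved), ("sulfite_bystander", sulfite_bystander)]:
    print(name, "weak:", satisfies(inst, WEAK), "strict:", satisfies(inst, STRICT))
```

Under the weak definition the bystander process is also classified as sulfite transport; only the more specific relation excludes it.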

At the RO meeting some participants were not keen on having a large collection of hard-to-define relations in RO. Alan Ruttenberg showed an alternative way of defining biochemical processes using a minimal set of relations, OWL class expressions and an ontology of dispositions (e.g. the disposition to be a substrate). That approach needs to be made more concrete before it can be compared with the Bada/UCDHSC method. In the meantime, it was decided that the UCDHSC relations and xps were useful today, so they would live in a separate RO called MACRO; these relations can be expanded to the longer Ruttenberg-style expressions once that approach has been fleshed out.

CHEBI terms

On occasion there are entities referenced in GO processes that Mike identified as corresponding to the set-union of classes in CHEBI; for example:

[Term]
id: GO:0015962
name: diadenosine triphosphate metabolic process
intersection_of: GO:0008152 ! metabolism
intersection_of: has_participant anon_chemical_16

[Term]
id: anon_chemical_16
namespace: chebi_ontology
is_anonymous: true
union_of: CHEBI:27775 ! P(1),P(3)-bis(5'-adenosyl)  triphosphate
union_of: CHEBI:27529 ! P(1),P(2)-bis(5'-adenosyl)  triphosphate

GO defines this as The chemical reactions and pathways involving diadenosine triphosphate, a derivative of the nucleoside adenosine with three phosphate groups attached

CHEBI does not have a term diadenosine triphosphate
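
As a rough illustration of what the anonymous union class stands for, here is a small Python sketch. The helper names expand and matches_filler are hypothetical; the IDs are those in the stanza above. A chemical matches the filler of the definition if it is either of the two named CHEBI classes.

```python
# union_of lines from the anonymous stanza, keyed by the anonymous class ID.
UNIONS = {"anon_chemical_16": {"CHEBI:27775", "CHEBI:27529"}}

def expand(class_id, unions=UNIONS):
    """Return the named classes that a (possibly anonymous) class stands for."""
    return set(unions.get(class_id, {class_id}))

def matches_filler(chemical_id, filler_id):
    """True if the chemical is one of the classes the filler expands to."""
    return chemical_id in expand(filler_id)

print(sorted(expand("anon_chemical_16")))
print(matches_filler("CHEBI:27775", "anon_chemical_16"))
```

The sketch ignores subclasses of the union members, which a reasoner would also take into account.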

Misalignments and reasoner results

Reasoner results: http://www.geneontology.org/scratch/xps/biological_process_xp_chebi-newlinks.txt

To see these, load the imports file above, run the rule based reasoner, then "assert implied links"

Misalignments: http://www.geneontology.org/scratch/xps/biological_process_xp_chebi-misalign.txt

As of 2009-06-25, there are 1117 is_a links that should be added to GO according to CHEBI, and 754 is_a links in CHEBI that are not reflected in multiple places in GO.

The above report is structured as paired triples: either an is_a relationship in GO for which we would expect a corresponding is_a relationship in CHEBI, or conversely an is_a relationship in CHEBI that is not reflected in GO.

There is some repetition in the reports; for example, there are many entities classified under "heterocycle" in CHEBI for which this is not reflected in metabolism, biosynthesis etc relationships in GO.
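
The pairing described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the script that produced the reports: it compares only direct is_a links (no transitive closure), pairs GO terms only when they share a genus and relation, and uses made-up IDs (GO:ex1, CHEBI:ex_heterocycle and so on) apart from GO:0008152.

```python
def misalignments(xps, go_is_a, chebi_is_a):
    """Pair each is_a link found in one ontology with the link expected in the other.

    xps maps a GO term to (genus, relation, chemical); the is_a arguments are
    sets of (child, parent) pairs.
    """
    missing_in_go, missing_in_chebi = [], []
    for child, (genus_c, rel_c, chem_c) in xps.items():
        for parent, (genus_p, rel_p, chem_p) in xps.items():
            if child == parent or (genus_c, rel_c) != (genus_p, rel_p):
                continue  # only compare definitions of the same shape
            in_go = (child, parent) in go_is_a
            in_chebi = (chem_c, chem_p) in chebi_is_a
            if in_chebi and not in_go:
                missing_in_go.append(((chem_c, chem_p), (child, parent)))
            elif in_go and not in_chebi:
                missing_in_chebi.append(((child, parent), (chem_c, chem_p)))
    return missing_in_go, missing_in_chebi

# Hypothetical data: three metabolic-process xps over made-up chemical IDs.
xps = {
    "GO:ex1": ("GO:0008152", "has_participant", "CHEBI:ex_heterocycle"),
    "GO:ex2": ("GO:0008152", "has_participant", "CHEBI:ex_pyridine"),
    "GO:ex3": ("GO:0008152", "has_participant", "CHEBI:ex_other"),
}
go_is_a = {("GO:ex3", "GO:ex1")}
chebi_is_a = {("CHEBI:ex_pyridine", "CHEBI:ex_heterocycle")}

missing_in_go, missing_in_chebi = misalignments(xps, go_is_a, chebi_is_a)
print("suggested for GO:", missing_in_go)
print("suggested for CHEBI:", missing_in_chebi)
```

Because the same CHEBI link is checked once per genus and relation, a single missing classification such as the "heterocycle" case shows up repeatedly, which matches the repetition noted above.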

Tracker Items

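CHEBI tracker:

  • roles and the is_a relation: http://sourceforge.net/tracker/index.php?func=detail&aid=1695784&group_id=36855&atid=440764

GO tracker:

  • misalignments: http://sourceforge.net/tracker/?func=detail&aid=2812369&group_id=36855&atid=440764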

Remaining work

  • improve the XPs
  • to what extent does is_a-overloading in CHEBI cause erroneous inferences about biological processes? (these concern me even more than the is_a complete classes)
  • can we use abduction / "reverse reasoning" to suggest placement for CHEBI orphans based on the implicit chemical entity hierarchy in GO?
  • does expanding the much-debated "macro" relations used in the current xps into complex OWL expressions (as recommended by Alan Ruttenberg) result in useful inferences?
  • what kinds of interesting biological questions can be answered by the combination of the two resources plus the xps, using either the P4 or obo query interfaces?
  • can we identify areas of CHEBI in which new relations, necessary and sufficient conditions etc will enable inferences of use to GO?


Presentations

From 2007

See also: CJM_EBI_Visit_June_2009#GO-ChEBI_xps