XP:biological process xp chebi

From GO Wiki

NOTE: The logical definitions are moving from the scratch directory in CVS to OWL ontologies in the extensions directory in SVN. See Ontology_extensions for a summary of changes.

Synopsis

Logical definitions for GO Biological Process and Molecular Function terms using CHEBI. The current set is maintained by Mike Bada in the Hunter Group at UCDHSC.

Introduction

These cross-products (XPs) define GO terms in terms of chemical entities. For example:

 [Term]
 id: GO:0000273
 name: lipoic acid metabolic process
 intersection_of: GO:0008152 ! metabolic process
 intersection_of: has_participant CHEBI:16494 ! lipoic acid

i.e. lipoic acid metabolic process =def a metabolic process that has_participant lipoic acid
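
The reading above can be derived mechanically from the stanza. The following is a minimal Python sketch, not part of the GO tooling: the function names `parse_xp_stanza` and `render_definition` are hypothetical, and the parser assumes the simple stanza shape shown on this page (one genus line and one or more `relation filler` lines, each with a trailing `! label` comment).

```python
def parse_xp_stanza(stanza: str) -> dict:
    """Parse one OBO [Term] stanza into id, name, genus and differentia."""
    term = {"id": None, "name": None, "genus": None, "differentia": []}
    for raw in stanza.splitlines():
        line = raw.strip()
        if not line or line.startswith("["):
            continue
        # Split off the trailing "! label" comment, if there is one.
        body, _, label = line.partition("!")
        tag, _, value = body.partition(":")
        tag, value, label = tag.strip(), value.strip(), label.strip()
        if tag in ("id", "name"):
            term[tag] = value
        elif tag == "intersection_of":
            parts = value.split()
            if len(parts) == 1:
                # A bare class ID is the genus.
                term["genus"] = (parts[0], label)
            else:
                # "relation filler" is a differentia.
                term["differentia"].append((parts[0], parts[1], label))
    return term


def render_definition(term: dict) -> str:
    """Render the genus-differentia reading of a parsed stanza."""
    genus_label = term["genus"][1]
    clauses = " and ".join(
        f"{rel} {label}" for rel, _filler_id, label in term["differentia"]
    )
    return f"{term['name']} =def a {genus_label} that {clauses}"


STANZA = """
[Term]
id: GO:0000273
name: lipoic acid metabolic process
intersection_of: GO:0008152 ! metabolic process
intersection_of: has_participant CHEBI:16494 ! lipoic acid
"""

print(render_definition(parse_xp_stanza(STANZA)))
```

The sketch relies on the `! label` comments for readable output; a real tool would look labels up in the ontologies instead.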

Status

No editor has been assigned to this set as yet.

Results so far show a large divergence between CHEBI and the implicit chemical entity ontology in GO.

Availability

The logical definitions are separate bridge files in the ontology/extensions/ directory in svn.

The source file (modified by ontology editors and TermGenie) is editors/go_xp_chebi.obo.

This is translated to http://purl.obolibrary.org/obo/go/extensions/x-chemical.owl every night.

To view the logical definitions in Protege, you are advised to use the importer ontology:

Note that the XP:molecular_function_xp_chebi ontology retains its source in CVS for the time being.

Results

Relations used

The UCDHSC cross-products use a large set of relations; see:

Examples of use:

 [Term]
 id: GO:0000316
 name: sulfite transport
 intersection_of: GO:0006810 ! transport
 intersection_of: results_in_directed_movement_of CHEBI:17359 ! sulfite

 [Term]
 id: GO:0051381
 name: histamine binding
 intersection_of: GO:0005488 ! binding
 intersection_of: results_in_joining_of CHEBI:18295 ! histamine

The argument is that the relations are required for providing xp definitions that are necessary and sufficient (all definitions are by definition necessary and sufficient). For example, if we instead said:

 [Term]
 id: GO:0000316
 name: sulfite transport
 intersection_of: GO:0006810 ! transport
 intersection_of: has_participant CHEBI:17359 ! sulfite

This would be wrong as a definition. We need to indicate the role that sulfite plays in transport: it is the entity that is transported, as opposed to the location that it is transported to, or the entity that initiates the transport.
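
A small Python sketch can make the sufficiency argument concrete. It assumes a definition can be modelled as a (genus, relation, filler) triple. The second entry, including its relation name `initiated_by`, is a hypothetical process invented purely for illustration and is not a GO term or an RO relation.

```python
from collections import defaultdict

# Each definition is (genus, relation, filler).
# "sulfite transport" is taken from the text above; the second entry is a
# HYPOTHETICAL process in which sulfite initiates the transport.
SPECIFIC = {
    "sulfite transport": (
        "GO:0006810", "results_in_directed_movement_of", "CHEBI:17359"),
    "transport initiated by sulfite (hypothetical)": (
        "GO:0006810", "initiated_by", "CHEBI:17359"),
}


def weaken(defs):
    """Replace every specific relation with the generic has_participant."""
    return {
        name: (genus, "has_participant", filler)
        for name, (genus, _relation, filler) in defs.items()
    }


def indistinguishable(defs):
    """Return groups of terms whose definitions are identical."""
    groups = defaultdict(list)
    for name, definition in defs.items():
        groups[definition].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]


print(indistinguishable(SPECIFIC))          # no collisions
print(indistinguishable(weaken(SPECIFIC)))  # both terms collapse together
```

With the specific relations the two definitions stay distinct. Once both are weakened to has_participant they become identical, so a reasoner would treat the two classes as equivalent.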

At the RO meeting, some participants were not keen on having a large collection of hard-to-define relations in RO. Alan Ruttenberg showed an alternative way of defining biochemical processes using a minimal set of relations, OWL class expressions and an ontology of dispositions (i.e. to be a substrate). The approach needs to be made more concrete before it can be compared with the Bada/UCDHSC method. In the meantime, it was decided that the UCDHSC relations and XPs were useful today, so they would live in a separate RO called MACRO; these relations can be expanded to the longer Ruttenberg-style expressions once that approach has been fleshed out.

CHEBI terms

On occasion there are entities referenced in GO processes that Mike identified as corresponding to the set-union of classes in CHEBI; for example:

[Term]
id: GO:0015962
name: diadenosine triphosphate metabolic process
intersection_of: GO:0008152 ! metabolism
intersection_of: has_participant anon_chemical_16

[Term]
id: anon_chemical_16
namespace: chebi_ontology
is_anonymous: true
union_of: CHEBI:27775 ! P(1),P(3)-bis(5'-adenosyl) triphosphate
union_of: CHEBI:27529 ! P(1),P(2)-bis(5'-adenosyl) triphosphate

GO defines this as "The chemical reactions and pathways involving diadenosine triphosphate, a derivative of the nucleoside adenosine with three phosphate groups attached."

CHEBI does not have a term for diadenosine triphosphate.
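
As a rough illustration of how a consumer of these files might treat the anonymous class, the Python sketch below expands a filler into the named CHEBI classes it stands for. The names `ANON_CLASSES`, `expand_filler` and `may_participate` are hypothetical, and the sketch deliberately ignores subclasses of the union members, which a real reasoner would also include.

```python
# Anonymous union classes, keyed by ID, with their union_of members.
# The IDs are copied from the stanza above.
ANON_CLASSES = {
    "anon_chemical_16": {"CHEBI:27775", "CHEBI:27529"},
}


def expand_filler(class_id, anon_classes=ANON_CLASSES):
    """Return the set of named classes that a filler stands for.

    A named class stands for itself; an anonymous union class stands
    for all of its union_of members.
    """
    return set(anon_classes.get(class_id, {class_id}))


def may_participate(chemical_id, filler_id):
    """True if chemical_id is one of the classes the filler expands to."""
    return chemical_id in expand_filler(filler_id)


print(sorted(expand_filler("anon_chemical_16")))
print(may_participate("CHEBI:27775", "anon_chemical_16"))
```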

Misalignments and reasoner results

To see these, load the imports file above, run the rule-based reasoner, then select "assert implied links".

As of 2009-06-25, there are 1117 is_a links that should be added to GO according to CHEBI and 754 is_a links in CHEBI not reflected in multiple places in GO.

The above report is structured as paired triples: either an is_a relationship in GO for which we would expect an is_a relationship in CHEBI or, conversely, an is_a relationship in CHEBI not reflected in GO.

There is some repetition in the reports; for example, there are many entities classified under "heterocycle" in CHEBI for which this is not reflected in the metabolism, biosynthesis, etc. relationships in GO.
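
The kind of comparison behind these reports can be sketched in Python. This is not the rule-based reasoner itself. It is a simplified illustration that assumes only direct, asserted is_a links and compares only GO terms that share the same genus and relation. All IDs below are placeholders, not real GO or CHEBI identifiers, and the function name `compare_hierarchies` is hypothetical.

```python
# Toy XP definitions: GO term -> (genus, relation, chemical).
# All IDs are PLACEHOLDERS invented for this sketch.
XPS = {
    "GO:A": ("GO:metabolism", "has_participant", "CHEBI:x"),
    "GO:B": ("GO:metabolism", "has_participant", "CHEBI:y"),
    "GO:C": ("GO:metabolism", "has_participant", "CHEBI:z"),
}
GO_ISA = {("GO:A", "GO:B")}            # (child, parent) asserted in GO
CHEBI_ISA = {("CHEBI:z", "CHEBI:y")}   # (child, parent) asserted in CHEBI


def compare_hierarchies(xps, go_isa, chebi_isa):
    """Return (links implied by CHEBI but absent from GO,
               links implied by GO but absent from CHEBI)."""
    missing_in_go = set()
    missing_in_chebi = set()
    for child, (genus, rel, chem) in xps.items():
        for parent, (p_genus, p_rel, p_chem) in xps.items():
            # Only compare distinct terms defined with the same genus
            # and the same relation.
            if child == parent or (genus, rel) != (p_genus, p_rel):
                continue
            go_link = (child, parent) in go_isa
            chebi_link = (chem, p_chem) in chebi_isa
            if chebi_link and not go_link:
                missing_in_go.add((child, parent))
            elif go_link and not chebi_link:
                missing_in_chebi.add((chem, p_chem))
    return missing_in_go, missing_in_chebi


add_to_go, add_to_chebi = compare_hierarchies(XPS, GO_ISA, CHEBI_ISA)
print("is_a links suggested for GO:", sorted(add_to_go))
print("is_a links suggested for CHEBI:", sorted(add_to_chebi))
```

A full comparison would also need to follow is_a links transitively and handle genus subsumption (for example, biosynthesis under metabolism), which this sketch leaves out.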

Tracker Items

Chebi tracker:

GO tracker:

Remaining work

  • improve the XPs
  • to what extent does is_a-overloading in CHEBI cause erroneous inferences about biological processes? (these concern me even more than the is_a complete classes)
  • can we use abduction / "reverse reasoning" to suggest placement for CHEBI orphans based on the implicit chemical entity hierarchy in GO?
  • does expanding the much-debated "macro" relations used in the current xps into complex OWL expressions (as recommended by Alan Ruttenberg) result in useful inferences?
  • what kinds of interesting biological questions can be answered by the combination of the two resources plus the xps, using either the P4 or obo query interfaces?
  • can we identify areas of CHEBI in which new relations, necessary and sufficient conditions etc will enable inferences of use to GO?


Presentations

From 2007

See also: CJM_EBI_Visit_June_2009#GO-ChEBI_xps