OBO: 1.3 Whiteboard
Revision as of 14:00, 3 October 2007
This page is intended to be a scratch pad for OBO 1.3 features and proposals.
Changes to Relations
The following new Typedef tags should be supported in OBO 1.3...
- transitive_under
- always_implies_inverse
transitive_under means that if p -transitive_under-> q, X -q-> Y, and Y -p-> Z, then X -p-> Z. Note that every relation is automatically transitive_over and transitive_under IS_A.
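As a rough illustration of the transitive_under rule, here is a minimal Python sketch that repeatedly applies it to a set of links until nothing new can be inferred. The function name, the triple layout (child, relation, parent) and the term names A, B and C are assumptions made for this example only; none of this comes from OBO-Edit.

```python
def apply_transitive_under(links, transitive_under):
    """Close a set of links under the transitive_under rule.

    links: set of (child, relation, parent) triples.
    transitive_under: set of (p, q) pairs, each meaning p -transitive_under-> q.
    Rule applied: X -q-> Y and Y -p-> Z implies X -p-> Z.
    """
    inferred = set(links)
    changed = True
    while changed:
        changed = False
        for p, q in transitive_under:
            # Iterate over snapshots so the set can grow while we loop.
            for x, rel1, y in list(inferred):
                if rel1 != q:
                    continue
                for y2, rel2, z in list(inferred):
                    if rel2 == p and y2 == y:
                        new_link = (x, p, z)
                        if new_link not in inferred:
                            inferred.add(new_link)
                            changed = True
    return inferred


# Hypothetical terms: A is_a B, and B part_of C.
example_links = {("A", "is_a", "B"), ("B", "part_of", "C")}
# part_of is transitive_under is_a, as every relation is.
closed = apply_transitive_under(example_links, {("part_of", "is_a")})
```

With these inputs the closure gains the link A -part_of-> C, which is the usual "parts are inherited down is_a" inference.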
always_implies_inverse is a boolean property. If always_implies_inverse is true for a relation p, it means that if p -inverse_of-> q and X -p-> Y, then Y -q-> X. This property could be used to define an integral_part_of relationship in OBO_REL, for example:
[Typedef]
id: OBO_REL:part_of
name: part of
inverse_of: has_part

[Typedef]
id: OBO_REL:has_part
name: has part

[Typedef]
id: OBO_REL:integral_part_of
is_a: OBO_REL:part_of
always_implies_inverse: true
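To make the always_implies_inverse rule concrete, the following Python sketch derives the inverse links it licenses. Everything here is illustrative: the function name and data layout are invented, and the example assumes, purely for demonstration, that integral_part_of resolves to has_part as its inverse (the whiteboard does not say how inverse_of is inherited through is_a).

```python
def infer_inverse_links(links, inverse_of, always_implies_inverse):
    """Add Y -q-> X for every X -p-> Y where p always implies its inverse q.

    links: set of (child, relation, parent) triples.
    inverse_of: dict mapping a relation id p to its inverse q.
    always_implies_inverse: set of relation ids flagged as true.
    """
    inferred = set(links)
    for x, p, y in links:
        if p in always_implies_inverse and p in inverse_of:
            inferred.add((y, inverse_of[p], x))
    return inferred


# Hypothetical terms and links, for illustration only.
example_links = {
    ("active_site", "integral_part_of", "enzyme"),
    ("wheel", "part_of", "car"),
}
# Assumption: integral_part_of uses has_part as its inverse.
example_inverse_of = {"integral_part_of": "has_part", "part_of": "has_part"}
result = infer_inverse_links(example_links, example_inverse_of, {"integral_part_of"})
```

Only the integral_part_of link produces an inverse here; the plain part_of link does not, because part_of is not flagged always_implies_inverse.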
Further, OBO 1.3 will allow classes to specify relationships to other terms OR to links between other terms. Link identifiers are specified in the following form:
child_term_id -relation_id-> parent_term_id
This new feature may not be part of the main specification, but may need to be specified in an ancillary parser extension specification (see below).
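A parser that supports link identifiers has to split them back into their three parts. The Python sketch below shows one way this might look; it assumes ids contain no whitespace and that the arrow is separated from the term ids by whitespace, which the whiteboard does not actually specify. The Link type and parse_link_id name are made up for the example.

```python
import re
from typing import NamedTuple


class Link(NamedTuple):
    child: str
    relation: str
    parent: str


# child_term_id -relation_id-> parent_term_id
_LINK_RE = re.compile(r"^\s*(\S+)\s+-(\S+?)->\s+(\S+)\s*$")


def parse_link_id(text):
    """Split a link identifier into (child, relation, parent)."""
    match = _LINK_RE.match(text)
    if match is None:
        raise ValueError(f"not a link identifier: {text!r}")
    return Link(*match.groups())


link = parse_link_id("flybase_gene:300458382 -occurs_in-> endoplasmic_reticulum")
```

This sketch does not handle nested links (a link whose child or parent is itself a link), which a real implementation would probably need.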
General Specification Changes
The OBO 1.3 specification needs to discuss the concept of parser extensions. Parser extensions are optional addenda to the basic OBO 1.3 specification that provide additional features to the OBO language.
Parser extensions require that we add a new header tag to OBO files called requires_extension. The requires_extension tag should specify both an identifier for the required extension (so we need to figure out how we specify that) and a minimum version number for that extension.
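The check a parser would run against requires_extension might look roughly like the Python sketch below. Note that the value layout used here ("extension id, then a dotted minimum version") is purely hypothetical, as are the extension ids "postcomp" and "annotation"; how extensions are actually identified is still an open question on this page.

```python
def check_required_extensions(header_lines, supported):
    """Return a list of problems for unmet requires_extension tags.

    header_lines: header lines of an OBO file, one tag per line.
    supported: dict mapping extension id -> supported version tuple.
    ASSUMED value layout (not specified anywhere): "<ext_id> <min_version>".
    """
    problems = []
    for line in header_lines:
        tag, _, value = line.partition(":")
        if tag.strip() != "requires_extension":
            continue
        ext_id, min_version = value.split()
        needed = tuple(int(part) for part in min_version.split("."))
        have = supported.get(ext_id)
        if have is None:
            problems.append(f"{ext_id}: not supported")
        elif have < needed:
            problems.append(f"{ext_id}: need >= {min_version}")
    return problems


# Hypothetical header and parser capabilities.
header = [
    "format-version: 1.3",
    "requires_extension: postcomp 1.0",
    "requires_extension: annotation 2.0",
]
problems = check_required_extensions(header, {"postcomp": (1, 2), "annotation": (1, 0)})
```

Comparing versions as integer tuples avoids the string-comparison trap where "1.10" sorts before "1.9".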
Extensions
There are at least two new extensions to OBO 1.3:
Postcomp Extension
This extension allows specially formatted post-composition expressions to be substituted for most identifier references in an OBO file. The post-composition expressions have the following format:
genus_term_id^differentia_type_id(differentia_term_id) [^differentia_type_id(differentia_term_id)]*
Any of the term ids may be replaced with another post-comp expression, and parentheses can be used in postcomp expressions to show precedence.
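The postcomp format is small enough to handle with a recursive-descent parser. The Python sketch below is one possible reading of the grammar, not a reference implementation: it assumes ids never contain whitespace, "^", "(" or ")", and it returns plain tuples of the form (genus, [(differentia_type, differentia_term), ...]). The function name and the ids in the example are invented.

```python
import re

_SPECIAL = {"^", "(", ")"}


def parse_postcomp(text):
    """Parse a post-composition expression into nested tuples.

    A bare id parses to a string; a composed expression parses to
    (genus, [(differentia_type_id, differentia), ...]).
    """
    tokens = re.findall(r"[\^()]|[^\s\^()]+", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take(expected=None):
        nonlocal pos
        tok = peek()
        if tok is None or (expected is not None and tok != expected):
            raise ValueError(f"expected {expected or 'a token'}, found {tok!r}")
        pos += 1
        return tok

    def take_id():
        tok = take()
        if tok in _SPECIAL:
            raise ValueError(f"expected an id, found {tok!r}")
        return tok

    def parse_term():
        # A term is an id, or a parenthesised expression (for precedence).
        if peek() == "(":
            take("(")
            node = parse_expr()
            take(")")
            return node
        return take_id()

    def parse_expr():
        genus = parse_term()
        differentia = []
        while peek() == "^":
            take("^")
            relation = take_id()
            take("(")
            differentia.append((relation, parse_expr()))
            take(")")
        return (genus, differentia) if differentia else genus

    result = parse_expr()
    if peek() is not None:
        raise ValueError(f"unexpected trailing token {peek()!r}")
    return result


# Hypothetical ids, for illustration only.
tree = parse_postcomp("X:1^part_of(Y:2^has_quality(Z:3))")
```

Here the differentia term of part_of is itself a post-comp expression, so the result nests one tuple inside another.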
Annotation Extension
This extension specifies a new, compact syntax for describing annotations as ontology instances. Out of laziness, I'll let an email I sent about it act as our temporary specification:
OBO annotations are an extension to the OBO file format that will give us a succinct, but completely correct, way of representing annotations as ontology instances.

The idea is that an annotation is an instance that posits some relationship between other ontology objects. For example, someone might annotate a gene to a Gene Ontology term in the following way:

flybase_gene:300458382 -occurs_in-> endoplasmic_reticulum

If we extend OBO format to allow terms and instances to have relationships TO OTHER RELATIONSHIPS (which will be supported in OBO 1.3), we could correctly model the statement above as an instance:

[Instance]
id: my_annotation:1
instance_of: oban:annotation
relationship: posits flybase_gene:300458382 -occurs_in-> endoplasmic_reticulum
relationship: based_on_evidence pubmed:3039942

But this representation is cumbersome and difficult to understand. The new annotation format introduces a new kind of stanza to represent our annotation:

[Annotation]
id: my_annotation:1
subject: flybase_gene:300458382
relation: occurs_in
object: endoplasmic_reticulum
evidence: pubmed:3039942

Note that the OBO annotation format simply specifies a mapping between these new annotation stanzas and instance stanzas. We're not introducing any new OBO semantics - this is just syntactic sugar.

We're also extending the datamodel libraries in OBO-Edit to provide a programming API that gives programmers access to the benefits of this new syntax. For example, the datamodel contains a new Annotation object that has getSubject(), getObject(), setSubject(), setObject(), etc methods. The Annotation object is just an extension of the OBO-Edit Instance object, so any calls to these new Annotation methods are automatically mapped into calls to Instance methods.

I'm about to start working with Chris to produce a draft specification for OBO 1.3, so this will be spelled out in much greater detail then. I hope this brief introduction was useful - please let me know if there are any details you'd like filled in.
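The "Annotation is just an extension of Instance" idea from the email can be sketched in a few lines of Python. This is not the OBO-Edit API (which is Java); the class layout, the method names and the choice to store the posited link as a tuple are all assumptions made to illustrate how accessor calls could be mapped onto generic Instance calls.

```python
class Instance:
    """Generic ontology instance: an id, a type, and tagged relationships."""

    def __init__(self, instance_id, instance_of):
        self.id = instance_id
        self.instance_of = instance_of
        self.relationships = {}  # relation id -> target

    def get_relationship(self, relation):
        return self.relationships.get(relation)

    def set_relationship(self, relation, target):
        self.relationships[relation] = target


class Annotation(Instance):
    """Convenience view: every accessor delegates to Instance methods."""

    POSITS = "oban:posits"

    def __init__(self, instance_id):
        super().__init__(instance_id, "oban:annotation")

    def _link(self):
        # The posited link is stored as (subject, relation, object).
        return self.get_relationship(self.POSITS) or (None, None, None)

    def get_subject(self):
        return self._link()[0]

    def get_relation(self):
        return self._link()[1]

    def get_object(self):
        return self._link()[2]

    def set_subject(self, subject):
        _, relation, obj = self._link()
        self.set_relationship(self.POSITS, (subject, relation, obj))

    def set_relation(self, relation):
        subject, _, obj = self._link()
        self.set_relationship(self.POSITS, (subject, relation, obj))

    def set_object(self, obj):
        subject, relation, _ = self._link()
        self.set_relationship(self.POSITS, (subject, relation, obj))


annotation = Annotation("my_annotation:1")
annotation.set_subject("flybase_gene:300458382")
annotation.set_relation("occurs_in")
annotation.set_object("endoplasmic_reticulum")
```

Because the Annotation holds no state of its own, code that only understands Instance objects still sees a complete oban:posits relationship.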
The specifics of these Annotation stanzas are largely up in the air, but our current prototype supports the following tags:
- subject
- relationship
- object
- assigned_by
- evidence
- source
Of these, only subject, relationship and object are particularly well defined. For any Annotation with a subject, object and relationship specified, the mapping works like this:
! This annotation...
[Annotation]
id: <id>
subject: <subject_id>
relationship: <relationship_id>
object: <object_id>

! is equivalent to this instance...
[Instance]
id: <id>
instance_of: oban:annotation
relationship: oban:posits <subject_id> -<relationship_id>-> <object_id>
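The mapping above is mechanical enough to express as a small text rewrite. The Python sketch below converts one Annotation stanza into the equivalent Instance stanza; the function name is made up, it treats "!" as the start of a comment, and it silently drops any tags other than id, subject, relationship and object, since their mapping is not yet defined.

```python
def annotation_to_instance(stanza_text):
    """Rewrite an [Annotation] stanza as the equivalent [Instance] stanza."""
    tags = {}
    for raw in stanza_text.splitlines():
        line = raw.split("!", 1)[0].strip()  # strip comments
        if not line or line.startswith("["):
            continue
        # Split on the first colon only, since ids contain colons too.
        tag, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"not a tag-value line: {raw!r}")
        tags[tag.strip()] = value.strip()

    missing = [t for t in ("id", "subject", "relationship", "object") if t not in tags]
    if missing:
        raise ValueError("missing tags: " + ", ".join(missing))

    link = f"{tags['subject']} -{tags['relationship']}-> {tags['object']}"
    return "\n".join([
        "[Instance]",
        f"id: {tags['id']}",
        "instance_of: oban:annotation",
        f"relationship: oban:posits {link}",
    ])


example = """[Annotation]
id: my_annotation:1
subject: flybase_gene:300458382
relationship: occurs_in
object: endoplasmic_reticulum
"""
instance_text = annotation_to_instance(example)
```

Going in the other direction (Instance back to Annotation) would need the link-identifier syntax to be parsed, which is why that feature matters for this extension.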
The other tags (assigned_by, evidence, and source) have no well-specified meaning (as far as I know) at this point.
You'll notice that the mapping relies on some pre-defined ontology objects. Those objects are defined in an ontology file that currently exists as a resource in the OBO-Edit source repository, but will probably be moved to the OBO Foundry soon. The contents of oban.obo are reproduced below:
default-namespace: oban

[Term]
id: oban:annotation
name: Annotation

[Typedef]
id: oban:has_data_source
name: has data source
domain: oban:annotation

[Typedef]
id: oban:has_evidence
name: has evidence
domain: oban:annotation
range: oban:evidence

[Term]
id: oban:evidence
name: Evidence

[Typedef]
id: oban:posits
name: posits
domain: oban:annotation
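For anyone who wants to poke at oban.obo programmatically, a very small stanza reader is enough. The Python sketch below is a simplification rather than a conforming OBO parser: it ignores escapes, trailing modifiers and inline comments, and the function name and return layout are assumptions for this example.

```python
def parse_stanzas(text):
    """Split OBO text into a header dict and a list of (type, tags) stanzas.

    Each tags dict maps a tag name to the list of its values, in file order.
    """
    header = {}
    stanzas = []
    current = header
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("!"):
            continue
        if line.startswith("[") and line.endswith("]"):
            current = {}
            stanzas.append((line[1:-1], current))
            continue
        # Split on the first colon only, since ids contain colons too.
        tag, _, value = line.partition(":")
        current.setdefault(tag.strip(), []).append(value.strip())
    return header, stanzas


OBAN_OBO = """default-namespace: oban

[Term]
id: oban:annotation
name: Annotation

[Typedef]
id: oban:has_evidence
name: has evidence
domain: oban:annotation
range: oban:evidence

[Typedef]
id: oban:posits
name: posits
domain: oban:annotation
"""

header, stanzas = parse_stanzas(OBAN_OBO)
# Relations whose domain is the annotation class.
annotation_relations = [
    tags["id"][0]
    for kind, tags in stanzas
    if kind == "Typedef" and tags.get("domain") == ["oban:annotation"]
]
```

The embedded text is a shortened excerpt of the file above, used only to keep the example compact.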