Relationship type requirements

From GO Wiki
Revision as of 12:58, 29 April 2014 by Gail (talk | contribs)

This page summarises the requirements for relations used in future versions of the GO, for example in Category:Cross Products

Some of these relations already exist in [1]. Others are pending, in ro_proposed (see description of ro_proposed on RO webpage)

Core GO

These have always been used in GO:

  1. is_a
  2. part_of

part_of is used exactly as defined in RO, following the ALL-SOME pattern

There may also be a need for:

  1. has_part


  1. regulates
  2. negatively_regulates
  3. positively_regulates

See Category:Regulation

Gene products

The relation between the gene product and the type represented in the GO term is implicit in a GAF (Gene Association File). This is fine for the current GO and most database applications. However, when we add more relations to GO we need a framework for deciding how the relation between the gene product ID and the GO ID propagates.

Between a gene product and a molecular function

Between a gene product and a biological process

The has_participant relation links continuants to processes. However, this relation is too weak: we would not annotate all the participants in a GO:secretion process to that GO term, because the products that are secreted are not actively participating in the process. We need to say that the product executes some function whilst participating in the process.

The has_function_in relation is transitive_over the part_of relation: i.e. if G has_function_in P' and P' part_of P then G has_function_in P. This corresponds to what people intuitively expect.
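
The propagation rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `propagate_has_function_in` and the toy identifiers `G`, `P_prime` and `P` are hypothetical, and part_of is modelled as a plain dictionary rather than a real ontology.

```python
def propagate_has_function_in(direct_processes, part_of):
    """Apply the rule: G has_function_in P' and P' part_of P
    implies G has_function_in P.

    direct_processes: processes G is directly annotated to (hypothetical input).
    part_of: dict mapping a process to the list of processes it is part_of.
    Returns the set of all processes G has_function_in, direct and inferred.
    """
    inferred = set()
    frontier = list(direct_processes)
    while frontier:
        process = frontier.pop()
        if process in inferred:
            continue  # already visited; also guards against cycles
        inferred.add(process)
        frontier.extend(part_of.get(process, []))
    return inferred


# Toy data mirroring the text: P' part_of P, and G has_function_in P'.
part_of = {"P_prime": ["P"]}
annotations = {"G": {"P_prime"}}

closure = propagate_has_function_in(annotations["G"], part_of)
print(sorted(closure))
```

Because the rule is applied repeatedly, an annotation also propagates up longer part_of chains, which matches the intuition described above.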

Between a gene product and a cellular component

The located_in relation is correct but not sufficient. For example, all gene products start located_in the ribosome but we do not annotate to this: the gene product does not have its end destination there or does not function there. We want to indicate that the gene product executes its function in this location

Function Process Links

See Function-Process Links

We need ALL-SOME links in both directions.

If we treat the MF ontology as representing functionINGs then part_of and has_part will suffice

If MF represents functions, then part relations are not accurate. We need

  • realizes / realized_by

Cellular component function links

  • GO:0000118 histone deacetylase complex MF has_function GO:0004407 histone deacetylase activity

Cellular component

For more details, see XP:cellular_component_xp_self#Relations

  1. surrounded_by
  2. surrounds
  3. enclosed_by

Use case:

Reactome does some QA/QC using a subset of GO-CC; but they would like to do more advanced reasoning:

The list has some known problems. We haven't dealt properly with the fact that mitochondria are actually four-part structures (outer membrane, intermembrane space, inner membrane, matrix), nor with the fact that the nuclear membrane likewise has inner and outer components separated by an accessible space and, worse, is continuous with endoplasmic reticulum. We don't have a good way of dealing with specialized domains of membranes like lipid rafts because they don't have unambiguous boundaries, and we likewise can't handle a structure like nucleosome or interphase chromatin because its innards are freely accessible from the nucleoplasm.

We are playing with the idea of giving each of these terms two new attributes:
- is_surrounded_by (e.g., plasma membrane is surrounded by extracellular; endosome membrane is surrounded by cytosol), to help with the spatial reasoning.
- transforms_into to handle problems like annotating the disappearance of the nuclear membrane in human cells at mitosis.
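
As a rough sketch of the kind of spatial reasoning is_surrounded_by could support, the following Python follows the relation outwards from a component. The first two entries come from the examples above; the cytosol entry and the function name `enclosing` are assumptions added only to make the chain longer than one step.

```python
# is_surrounded_by links; one entry per component for simplicity.
SURROUNDED_BY = {
    "plasma membrane": "extracellular",    # example from the text
    "endosome membrane": "cytosol",        # example from the text
    "cytosol": "plasma membrane",          # assumption, for illustration only
}


def enclosing(component, surrounded_by):
    """Return the components that enclose `component`, innermost first,
    by following is_surrounded_by links until none is left."""
    chain = []
    seen = {component}
    current = surrounded_by.get(component)
    while current is not None and current not in seen:
        chain.append(current)
        seen.add(current)
        current = surrounded_by.get(current)
    return chain


print(enclosing("endosome membrane", SURROUNDED_BY))
```

A single-valued dictionary cannot express the multi-part structures mentioned above (for example the four-part mitochondrion), so this is a starting point rather than a solution to those problems.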

Development and anatomical processes

The RO defines the has_participant relation, linking processes to objects such as cells and organs. We assume that participation is very general, so that a developing entity is said to participate in the process of development


The GO process guidelines show the upper ontology template for development:

x development
[p] x morphogenesis
---[p] x formation
------[p] y cell differentiation
---[p] x structural organization
[p] x maturation 

and for cells:

cellular process
[i] cell differentiation
---[p] cell fate commitment
------[p] cell fate specification
------[p] cell fate determination
---[p] cell development
------[p] cellular morphogenesis during differentiation
------[p] cell maturation 

Our first attempt used just the has_participant relation. Terms were defined using a generic process (development, morphogenesis etc.) and a differentia of the form has_participant <Entity>. However, this was not specific enough, as it does not indicate what role the participant plays. We also tried has_central_participant, but this was difficult to define.

None of this caused problems for reasoning with the above terms, as they are all treated as if they have a single participant. However, it is still somewhat unsatisfying.

In addition, there is no genus term to define terms like neuron remodeling.

Also, GO does not have a single genus term for representing the complete development of a structure; "anatomical structure development" is used for CCs and gross anatomical entities, and "cell development" is used for cells. Of course, GO could be extended to include such a genus term (perhaps anatomical structure development)?

The current strategy is to define development terms using more specific relations corresponding to the upper ontology template:

id: GO:0007281 ! germ cell development
intersection_of: GO:0048468 ! cell development
intersection_of: OBO_REL:results_in_complete_development_of CL:0000586 ! germ cell

This is somewhat redundant: we could just use biological_process as the genus
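
To make the genus and differentia structure of such a definition explicit, here is a minimal Python sketch that splits the stanza above into its parts. It is not a general OBO parser: the function name `parse_xp` and the returned dictionary layout are assumptions, and it handles only the `id` and `intersection_of` tags shown.

```python
STANZA = """\
id: GO:0007281 ! germ cell development
intersection_of: GO:0048468 ! cell development
intersection_of: OBO_REL:results_in_complete_development_of CL:0000586 ! germ cell
"""


def parse_xp(stanza):
    """Split a cross-product stanza into id, genus and differentia."""
    term = {"id": None, "genus": None, "differentia": []}
    for line in stanza.splitlines():
        line = line.split(" ! ", 1)[0].strip()  # drop the trailing "! label" comment
        if not line:
            continue
        tag, value = line.split(": ", 1)
        parts = value.split()
        if tag == "id":
            term["id"] = parts[0]
        elif tag == "intersection_of":
            if len(parts) == 1:
                term["genus"] = parts[0]  # bare ID: the genus
            else:
                # relation + filler: one differentia
                term["differentia"].append((parts[0], parts[1]))
    return term


print(parse_xp(STANZA))
```

With the definition in this form, swapping the genus for biological_process, as suggested above, changes one field and leaves the differentia untouched.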

Relations used:

For cells:

Biochemical processes

See XP:biological_process_xp_chebi

The current set of xp defs from UCHSC uses a large set of relations:

  • results_in_division_of
  • results_in_perception_of
  • ...

Can we use a smaller set?

From Alan:

My best thinking in this area currently is that reactions and
interactions are expressed as the existence of a minimal process
with the necessary participants and no others, where we believe there
are no others. The basic relationship used is participation in the
process. Specific participants have different functions. New relations
are subproperties of has_participant that add additional conditions on
domain and ranges, with the domain being the process type and the
range being the role type.

The reason that I aim for this is that I don't want a slew of new
relations that don't integrate with each other.

Note that a basic issue with describing such processes is that there
are parameters that only make sense at the granularity of
concentrations and pools and others that make more sense at the
molecular level.

So the typical catalytic reaction 2 a + b <= E => c + 3 d would be represented as two processes:

R1 =def Process and
  has_participant exactly 2 (a and has_role some substrate) and
  has_participant exactly 1 (b and has_role some substrate) and
  has_participant exactly 1 (c and has_role some product) and
  has_participant exactly 3 (d and has_role some product) and
  has_participant exactly 1 (E and has_function some <the catalytic_activity>)

The second process R2 is the same as above, but with the substrate and product roles reversed.
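
The pair of processes can be sketched in Python as follows. This is an illustration only: the `Participant` class and the `reverse` function are hypothetical names, the cardinalities are read off the reaction 2 a + b <= E => c + 3 d, and the enzyme's has_function restriction is simplified to a role-like label, `"catalyst"`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Participant:
    entity: str  # molecule type, e.g. "a"
    count: int   # the "exactly N" cardinality
    role: str    # "substrate", "product", or "catalyst" (stand-in for has_function)


# R1: the forward direction, with counts taken from 2 a + b <= E => c + 3 d.
R1 = frozenset({
    Participant("a", 2, "substrate"),
    Participant("b", 1, "substrate"),
    Participant("c", 1, "product"),
    Participant("d", 3, "product"),
    Participant("E", 1, "catalyst"),
})

SWAP = {"substrate": "product", "product": "substrate"}


def reverse(process):
    """Build the reverse process: same participants and counts,
    with substrate and product roles exchanged; the catalyst is unchanged."""
    return frozenset(
        Participant(p.entity, p.count, SWAP.get(p.role, p.role)) for p in process
    )


R2 = reverse(R1)
for p in sorted(R2, key=lambda p: p.entity):
    print(p)
```

Deriving R2 from R1 keeps the two definitions consistent by construction, which fits the aim of building on existing relations instead of adding unrelated ones.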

Given the above schema, we could define additional relations that a)
imply the above relations or b) provide information which we know we
won't be able to infer from them.

The semantics of preceding and following events need to be
clarified. At the molecular level, there are two possibilities - 1)
enablement: The following reaction is enabled because the substrates
of the following process can be supplied by the previous process. 2)
temporal ordering: There is a larger process in which the first always
precedes the second.
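
The enablement reading can be sketched in Python as a check on substrates and products. The representation of a reaction as a dictionary of sets, the function name `enables`, and the `strict` flag are assumptions; the text does not say whether all substrates or only some must be supplied, so both readings are offered.

```python
def enables(previous, following, strict=True):
    """Check whether `previous` can supply substrates of `following`.

    strict=True:  every substrate of `following` is a product of `previous`.
    strict=False: at least one substrate is supplied.
    Both readings are assumptions; the text leaves this open.
    """
    supplied = following["substrates"] & previous["products"]
    if strict:
        return supplied == following["substrates"]
    return bool(supplied)


# Hypothetical reactions, using molecule letters only.
r1 = {"substrates": {"a", "b"}, "products": {"c", "d"}}
r2 = {"substrates": {"c"}, "products": {"e"}}
r3 = {"substrates": {"c", "x"}, "products": {"f"}}

print(enables(r1, r2))                # r1 supplies all of r2's substrates
print(enables(r1, r3))                # "x" is not supplied by r1
print(enables(r1, r3, strict=False))  # partial supply counts under the loose reading
```

Temporal ordering, the second reading, cannot be derived from participants alone and would need to be asserted separately.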

At the concentration pool level this sort of thing doesn't make sense
because at any given time multiple instances of many of the molecular
processes are happening at once.

My main concern at the meeting will be to avoid creation of a lot of
new relations that are not carefully defined in terms of previous
relations wherever possible.