Relationship type requirements
This page summarises the requirements for relations to be used in future versions of the GO, for example in Category:Cross Products.
- 1 Core GO
- 2 Regulation
- 3 Gene products
- 4 Function Process Links
- 5 Cellular component function links
- 6 Cellular component
- 7 Development and anatomical processes
- 8 Biochemical processes
These have always been used in GO:
part_of is used exactly as defined in RO, following the ALL-SOME pattern
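As a rough illustration of the ALL-SOME reading (every instance of the first type stands in the relation to some instance of the second type), here is a minimal Python sketch. The `all_some` helper, the instance names and the toy data are assumptions made for illustration; they are not part of RO or GO.

```python
def all_some(a_instances, b_instances, relation):
    """ALL-SOME check: every A instance is related to at least one B instance.

    `relation` is a set of (subject, object) pairs at the instance level.
    """
    b_set = set(b_instances)
    return all(
        any((a, b) in relation for b in b_set)
        for a in a_instances
    )


# Hypothetical instance-level data (not taken from GO).
nuclei = ["nucleus_1", "nucleus_2"]
cells = ["cell_1", "cell_2", "red_blood_cell_1"]
part_of = {("nucleus_1", "cell_1"), ("nucleus_2", "cell_2")}

# The inverse instance-level relation.
has_part = {(whole, part) for (part, whole) in part_of}

# "nucleus part_of cell" holds: every nucleus is part of some cell.
nucleus_part_of_cell = all_some(nuclei, cells, part_of)

# "cell has_part nucleus" fails: one cell instance has no nucleus.
cell_has_part_nucleus = all_some(cells, nuclei, has_part)
```

The two results differ, which is why a type-level link in one direction does not give the link in the other direction for free.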
There may also be need of
The relation between the gene product and the type represented in the GO term is implicit in a GAF (Gene Association File). This is fine for the current GO and most database applications. However, when we add more relations to GO we need a framework for deciding how the relation between the gene product ID and the GO ID propagates.
Between a gene product and a molecular function
Between a gene product and a biological process
The has_participant relation links continuants to processes. However, this relation is too weak. We would not annotate all the participants in a GO:secretion process to that GO term: the products that are secreted are not actively participating in the process. We need to say that the product executes some function whilst participating in the process.
Between a gene product and a cellular component
The located_in relation is correct but not sufficient. For example, all gene products start out located_in the ribosome, but we do not annotate to this: the ribosome is neither the gene product's final destination nor the place where it functions. We want to indicate that the gene product executes its function in this location.
Function Process Links
- #Regulation links
We need ALL-SOME links in both directions.
If MF represents functions, then part relations are not accurate. We need
- realizes / realized_by
- GO:0000118 histone deacetylase complex MF has_function GO:0004407 histone deacetylase activity
For more details, see XP:cellular_component_xp_self#Relations
Reactome does some QA/QC using a subset of GO-CC; but they would like to do more advanced reasoning:
The list has some known problems. We haven't dealt properly with the fact that mitochondria are actually four-part structures (outer membrane, intermembrane space, inner membrane, matrix), nor with the fact that the nuclear membrane likewise has inner and outer components separated by an accessible space and, worse, is continuous with endoplasmic reticulum. We don't have a good way of dealing with specialized domains of membranes like lipid rafts because they don't have unambiguous boundaries, and we likewise can't handle a structure like nucleosome or interphase chromatin because its innards are freely accessible from the nucleoplasm.
We are playing with the idea of giving each of these terms two new attributes:
- is_surrounded_by (e.g., plasma membrane is surrounded by extracellular; endosome membrane is surrounded by cytosol), to help with the spatial reasoning.
- transforms_into, to handle problems like annotating the disappearance of the nuclear membrane in human cells at mitosis.
Development and anatomical processes
The RO defines the has_participant relation, linking processes to objects such as cells and organs. We assume that participation is very general, so that a developing entity is said to participate in the process of development.
The GO process guidelines show the upper ontology template for development:
x development
[p] x morphogenesis
---[p] x formation
------[p] y cell differentiation
---[p] x structural organization
[p] x maturation
and for cells:
cellular process
[i] cell differentiation
---[p] cell fate commitment
------[p] cell fate specification
------[p] cell fate determination
---[p] cell development
------[p] cellular morphogenesis during differentiation
------[p] cell maturation
Our first attempt used just the has_participant relation. Terms were defined using a generic process (development, morphogenesis, etc.) and a differentia of the form has_participant <Entity>. However, this was not specific enough, as it does not indicate what role the participant plays. We also tried has_central_participant, but this was difficult to define.
None of this caused problems for reasoning with the above terms, as they are all treated as if they have a single participant. However, it is still somewhat unsatisfying.
In addition, there is no genus term to define terms like neuron remodeling.
Also, GO does not have a single genus term for representing the complete development of a structure; "anatomical structure development" is used for CCs and gross anatomical entities, and "cell development" is used for cells. Of course, GO could be extended to include such a genus term (perhaps anatomical structure development)?
The current strategy is to define development terms using more specific relations corresponding to the upper ontology template:
[Term]
id: GO:0007281 ! germ cell development
intersection_of: GO:0048468 ! cell development
intersection_of: OBO_REL:results_in_complete_development_of CL:0000586 ! germ cell
This is somewhat redundant: we could just use biological_process as the genus.
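To make the genus-plus-differentia reading of the germ cell development stanza concrete, here is a minimal Python sketch that splits it into its parts. The `parse_xp` helper and the returned dictionary layout are assumptions for illustration only; this is not a full OBO parser, and it handles only the tags used in this stanza.

```python
STANZA = """\
[Term]
id: GO:0007281 ! germ cell development
intersection_of: GO:0048468 ! cell development
intersection_of: OBO_REL:results_in_complete_development_of CL:0000586 ! germ cell
"""


def parse_xp(stanza_text):
    """Split a cross-product stanza into id, genus and differentia.

    An intersection_of line with one identifier is read as the genus;
    a line with a relation and an identifier is read as a differentia.
    """
    term_id = None
    genus = None
    differentia = []
    for raw in stanza_text.splitlines():
        line = raw.split("!", 1)[0].strip()  # drop the trailing "! comment"
        if line.startswith("id:"):
            term_id = line[len("id:"):].strip()
        elif line.startswith("intersection_of:"):
            parts = line[len("intersection_of:"):].split()
            if len(parts) == 1:
                genus = parts[0]
            elif len(parts) == 2:
                differentia.append((parts[0], parts[1]))
    return {"id": term_id, "genus": genus, "differentia": differentia}


definition = parse_xp(STANZA)
```

Seen this way, the redundancy is that the genus already says "cell development" while the relation name also encodes the kind of development; swapping the genus for biological_process would leave the differentia to carry that information alone.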
The current set of xp defs from UCHSC uses a large set of relations:
Can we use a smaller set?
My best thinking in this area currently is that reactions and interactions are expressed as the existence of a minimal process with the necessary participants and no others, where we believe there are no others. The basic relationship used is participation in the process. Specific participants have different functions. New relations are subproperties of has_participant that add additional conditions on domain and range, with the domain being the process type and the range being the role type. The reason that I aim for this is that I don't want a slew of new relations that don't integrate with each other.
Note that a basic issue with describing such processes is that there are parameters that only make sense at the granularity of concentrations and pools, and others that make more sense at the molecular level.
So the typical catalytic reaction 2 a + b <= E => c + 3 d would be represented as two processes:
R1 =def Process
and has_participant exactly 2 (a and has_role some substrate)
and has_participant exactly 1 (b and has_role some substrate)
and has_participant exactly 1 (c and has_role some product)
and has_participant exactly 3 (d and has_role some product)
and has_participant exactly 1 (E and has_function some <the catalytic_activity>)
The second process R2 is the same as above, but with the substrate and product roles reversed.
Given the above schema, we could define additional relations that a) imply the above relations or b) provide information which we know we won't be able to infer from them.
The semantics of preceding and following events need to be clarified. At the molecular level, there are two possibilities:
1) enablement: the following reaction is enabled because its substrates can be supplied by the previous process.
2) temporal ordering: there is a larger process in which the first always precedes the second.
At the concentration pool level this sort of thing doesn't make sense, because at any given time multiple instances of many of the molecular processes are happening at once. My main concern at the meeting will be to avoid creation of a lot of new relations that are not carefully defined in terms of previous relations wherever possible.
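The two-process representation and the "enablement" reading can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the `Participant` class, the `reverse` and `enables` helpers are hypothetical names, the counts follow the stoichiometry of the reaction 2 a + b <= E => c + 3 d, and the catalyst's function is folded into the same `role` field for brevity, whereas the schema above uses has_function for E.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Participant:
    entity: str
    count: int   # the "exactly N" cardinality on has_participant
    role: str    # "substrate", "product", or "catalyst" (simplification)


def reverse(process):
    """Build the reverse process: swap substrate and product roles only."""
    swap = {"substrate": "product", "product": "substrate"}
    return [Participant(p.entity, p.count, swap.get(p.role, p.role))
            for p in process]


def enables(first, second):
    """Enablement: some product of `first` is a substrate of `second`."""
    products = {p.entity for p in first if p.role == "product"}
    substrates = {p.entity for p in second if p.role == "substrate"}
    return bool(products & substrates)


# R1: the forward direction of 2 a + b <= E => c + 3 d.
R1 = [
    Participant("a", 2, "substrate"),
    Participant("b", 1, "substrate"),
    Participant("c", 1, "product"),
    Participant("d", 3, "product"),
    Participant("E", 1, "catalyst"),
]

# R2: same participants, substrate and product roles reversed.
R2 = reverse(R1)

r1_enables_r2 = enables(R1, R2)
```

Because every relation here is derived from participants and their roles, a check like `enables` needs no new primitive relation, which is the point of defining new relations in terms of existing ones.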