Disjoint Documentation

From GO Wiki

Three terms at the top of the process ontology have been made disjoint. This means that no term may have an is_a path to more than one of these terms. A term may be a descendant of several disjoint terms, but only if there is a part_of relationship in the path.

In plainer language, this means that no term may be a type of more than one of the disjoint terms. A term may, however, be a part of several disjoint terms. This is illustrated by the example below:

Terms with boxes around them are all disjoint from one another.
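The rule stated above can be sketched as a small check. This is a minimal illustration only, not the test actually run against the ontology: the edge list, the placeholder names "term A" to "term D", and the function names are all hypothetical. It collects each term's ancestors along is_a edges alone, so any path that passes through part_of is ignored, and reports every term that reaches more than one of the disjoint terms.

```python
# Minimal sketch of the disjointness rule; all data and names are hypothetical.
DISJOINT_TERMS = {
    "cellular process",
    "multicellular organism process",
    "multi-organism process",
}

# Each edge is (child, relation, parent). "term A" .. "term D" are placeholders.
EDGES = [
    ("term A", "is_a", "cellular process"),
    ("term B", "is_a", "term A"),
    ("term B", "part_of", "term C"),   # allowed: the path to term C goes via part_of
    ("term C", "is_a", "multi-organism process"),
    ("term D", "is_a", "cellular process"),
    ("term D", "is_a", "term C"),      # not allowed: second is_a path to a disjoint term
]


def is_a_ancestors(term, edges):
    """Return every term reachable from `term` by following is_a edges only."""
    parents = {}
    for child, relation, parent in edges:
        if relation == "is_a":
            parents.setdefault(child, set()).add(parent)

    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def disjoint_violations(edges, disjoint_terms):
    """Map each offending term to the set of disjoint terms it is a type of."""
    violations = {}
    for term in {child for child, _, _ in edges}:
        reached = is_a_ancestors(term, edges) & disjoint_terms
        if len(reached) > 1:
            violations[term] = reached
    return violations


if __name__ == "__main__":
    for term, reached in disjoint_violations(EDGES, DISJOINT_TERMS).items():
        print(f"{term} is a type of: {sorted(reached)}")
```

In this toy graph only "term D" is reported. "term B" is also a descendant of two disjoint terms, but one of those paths contains a part_of relationship, so it is permitted.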



At first glance it isn't so obvious why we should choose these terms to be disjoint. It might make more sense to just make these pairs of terms disjoint:

1) Terms to do with number of organisms

single organism process <------ disjoint with ------> multi-organism process

It's clear that this makes sense, as an instance of a process cannot take place in a single organism and in several organisms at the same time. It has to be one or the other. Please note that we do not currently have a term called single organism process, so this is just an imaginary example.

2) Terms to do with number of cells

cellular process <------ disjoint with ------> multicellular organism process

These terms are clearly disjoint, as an instance of a process cannot take place both in a single cell and in a multi-celled organism at the same time.

Other disjoints

We can make the structures below these terms tidier if we go a step further and add in two more disjoint relationships.

cellular process <------ disjoint with ------> multi-organism process
multicellular organism process <------ disjoint with ------> multi-organism process

It's not intuitively clear that these should be disjoint, because it seems that a multi-organism process could occur at the cellular level. For example, in a nodule on a plant root, the bacterial cell infecting the plant may signal to the plant cells. The bacterial cell is signaling to the plant cell, which is a process occurring at the cellular level. As the two cells are from separate organisms, it could also be considered a multi-organism process.

However, there is another way that we can represent this.



In the diagram above, the cell-cell signaling process is shown to be a cellular process. Its relationship with the term multi-organism process runs through a part_of relationship with the general process term nodulation, so the signaling term does not have an is_a-only path to the disjoint term multi-organism process. This means that the graph does not represent the signaling term as being both a type of multi-organism process and a type of cellular process. Structuring the graph in this way allows us to keep the general process terms and their parts neatly arranged with respect to one another and to the disjoint parent terms. Tests that enforce the disjointness let us check regularly that we are keeping the parts of general processes as parts, rather than as general processes in their own right.

If we didn't make multi-organism process disjoint from multicellular organism process and cellular process, then we could easily end up with the arrangement below. That structure does not buy us anything in terms of reasoning or information capture, and it is more cluttered. Without the disjoint check, we would have no way of detecting the clutter.
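The two arrangements discussed above can be compared in a short sketch. The edge lists here are reconstructed from the prose only, so the exact edges (in particular the extra is_a edge in the cluttered version) are an assumption, and the names `TIDY`, `CLUTTERED`, and `check` are hypothetical.

```python
# Hedged sketch: edges are assumed from the prose, not read from the ontology.
DISJOINT_TERMS = frozenset({
    "cellular process",
    "multicellular organism process",
    "multi-organism process",
})

# Tidy arrangement: signaling reaches multi-organism process only via part_of.
TIDY = [
    ("cell-cell signaling", "is_a", "cellular process"),
    ("cell-cell signaling", "part_of", "nodulation"),
    ("nodulation", "is_a", "multi-organism process"),
]

# Cluttered arrangement (assumed): signaling is also asserted directly
# as a type of multi-organism process.
CLUTTERED = TIDY + [
    ("cell-cell signaling", "is_a", "multi-organism process"),
]


def disjoint_is_a_ancestors(term, edges, _seen=None):
    """Return the disjoint terms reachable from `term` over is_a edges only."""
    seen = set() if _seen is None else _seen
    found = set()
    for child, relation, parent in edges:
        if child == term and relation == "is_a" and parent not in seen:
            seen.add(parent)
            if parent in DISJOINT_TERMS:
                found.add(parent)
            found |= disjoint_is_a_ancestors(parent, edges, seen)
    return found


def check(edges):
    """List (term, disjoint ancestors) for every term breaking the rule."""
    problems = []
    for term in sorted({child for child, _, _ in edges}):
        reached = disjoint_is_a_ancestors(term, edges)
        if len(reached) > 1:
            problems.append((term, sorted(reached)))
    return problems


if __name__ == "__main__":
    print("tidy:", check(TIDY))
    print("cluttered:", check(CLUTTERED))
```

With the disjoint check in place, the tidy arrangement yields no problems, while the cluttered one flags the signaling term as a type of both cellular process and multi-organism process.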



Future Plans

We are experimenting with declaring disjointness within the CC ontology; see Cellular_component_disjoint_classes.