Database cross references

From GO Wiki
 TO BE REVIEWED

Cross-referencing other databases

General database cross-references, or general dbxrefs, should be used where a GO term is identical to an object in another database. For more information on syntax, refer to the GO File Format Guide; for a complete list of dbxrefs, see the database cross-references page.


General Rules and Things of Note

Adding database cross-references for Definitions versus cross-references for Terms

  * Add a database cross-reference whenever you can
  * If the database was used to help generate the term Definition, you should add the database cross-reference to the term Definition directly.
  * If you see a case where there is a database cross-reference for the Definition and no database cross-reference for the term, check the xref and add it to the term if appropriate.
  * If the database cross-reference in the text Definition is a partial EC number, it should be removed.
  * Since Rhea will be used as the source of term definitions and inferences in the future, Rhea xrefs are handled specially: use them only if the mapping is 1:1. If the reaction is generic (bidirectional), use the Rhea identifier that corresponds to the direction-agnostic reaction. The agnostic reaction is the one in Rhea that has an '=' between the reactants and products, as opposed to '<=>' or '=>'.
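The check described in the list above (a cross-reference present on the Definition but missing from the term) can be sketched in a few lines of Python. This is only an illustration: it assumes an OBO-style stanza in which the definition's dbxrefs appear in square brackets at the end of the `def:` line and term-level dbxrefs appear on `xref:` lines. The function name `missing_term_xrefs` and all identifiers in the sample stanza are hypothetical placeholders, not real GO, EC or Rhea entries.

```python
import re


def missing_term_xrefs(stanza: str) -> list[str]:
    """Return dbxrefs cited on the definition but absent from the term's xref lines."""
    def_xrefs: list[str] = []
    term_xrefs: set[str] = set()
    for line in stanza.splitlines():
        line = line.strip()
        if line.startswith("def:"):
            # Assumption: the dbxref list is the trailing [...] after the quoted text.
            match = re.search(r'"\s*\[([^\]]*)\]\s*$', line)
            if match:
                def_xrefs = [x.strip() for x in match.group(1).split(",") if x.strip()]
        elif line.startswith("xref:"):
            # Keep only the identifier, dropping any trailing description.
            parts = line[len("xref:"):].split()
            if parts:
                term_xrefs.add(parts[0])
    return [x for x in def_xrefs if x not in term_xrefs]


# Hypothetical stanza; none of these identifiers are real.
EXAMPLE_STANZA = """
[Term]
id: GO:0000000
name: example reductase activity
def: "Catalysis of the reaction: A + B = C + D." [EC:9.9.9.9, RHEA:00000]
xref: EC:9.9.9.9
"""

print(missing_term_xrefs(EXAMPLE_STANZA))
```

The result is only a list of candidates; each one still has to be checked by hand before it is added to the term, as the rules above require.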

Rhea

Use the non-directed Rhea reaction (=).
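As a rough illustration of the rule, the direction of a reaction can be read from the separator between its reactants and products. The sketch below assumes the equation is available as plain text with spaces around the separator, and covers only the three separators mentioned on this page ('=', '<=>' and '=>'). The function names and direction labels are hypothetical.

```python
def rhea_direction(equation: str) -> str:
    """Classify a reaction equation by the separator between its two sides."""
    # Test the longer separators first so '<=>' is never mistaken for '=>' or '='.
    separators = (
        (" <=> ", "bidirectional"),
        (" => ", "left-to-right"),
        (" = ", "undirected"),
    )
    for separator, label in separators:
        if separator in equation:
            return label
    raise ValueError(f"no recognised separator in: {equation!r}")


def is_xref_candidate(equation: str) -> bool:
    """Only the undirected ('=') form is suitable as a GO xref under the rule above."""
    return rhea_direction(equation) == "undirected"


print(rhea_direction("alditol + NAD+ = aldose + NADH + H+"))
print(is_xref_candidate("A + B <=> C + D"))
```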

Enzyme Commission

The Enzyme Commission names and categorizes enzymes based on reaction mechanism. In the same way that a gene product may participate in a number of different processes, it may catalyze more than one reaction; the ontology should contain each reaction even if it is enabled by a single gene product. Gene annotators should associate a gene product with all of the molecular functions it can catalyze, or in the case of a GO-CAM model with the specific reaction that happens in the model. A single enzyme may perform a number of different reactions, and it is also possible for several different EC enzymes to perform the same reaction.

This means that there is not a 1:1 correspondence between EC numbers and GO reaction terms.

There are a number of websites that mirror the EC data; one that is particularly useful is IntEnz. It shows the reactions from Rhea, allowing for easy correlation checks between the resources.

Precise vs. Imprecise EC Numbers

GO has terms that represent the categories used by EC. These have EC xrefs of the form EC:n, EC:n.n and EC:n.n.n (where n is a number).

For reactions where the enzyme has not yet been added to EC, but it can be put into one of the EC categories, the xref should be of the form EC:n.n.n.-, i.e. ending with a dash.
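The xref forms described in this section can be told apart with a small pattern check. The following is a minimal sketch that follows only the forms stated above; the function name and the labels `category`, `pending` and `precise` are hypothetical, and any form not described on this page is simply reported as `invalid`.

```python
import re

# EC:n, EC:n.n or EC:n.n.n : a term representing an EC category.
CATEGORY = re.compile(r"EC:\d+(\.\d+){0,2}")
# EC:n.n.n.- : the enzyme is not yet in EC but fits an EC category.
PENDING = re.compile(r"EC:\d+\.\d+\.\d+\.-")
# EC:n.n.n.n : a complete EC number.
PRECISE = re.compile(r"EC:\d+\.\d+\.\d+\.\d+")


def classify_ec_xref(xref: str) -> str:
    """Label an EC xref according to the forms described on this page."""
    if PRECISE.fullmatch(xref):
        return "precise"
    if PENDING.fullmatch(xref):
        return "pending"
    if CATEGORY.fullmatch(xref):
        return "category"
    return "invalid"


for example in ("EC:1", "EC:1.5.3", "EC:1.5.3.-", "EC:1.5.3.17"):
    print(example, classify_ec_xref(example))
```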

One EC number, multiple reactions

There are a number of cases where an enzyme can catalyse a set of reactions. These may or may not be specified by EC, but KEGG and MetaCyc will often show additional reactions. Similarly, different EC enzymes will often catalyse the same reaction. A good example of this overlap is found in EC:1.5.3.13 through EC:1.5.3.17. In IntEnz, four reactions are listed for EC:1.5.3.17; looking at EC:1.5.3.16, one of those reactions can also be catalysed by that enzyme. KEGG shows this more clearly: when viewing all the reactions for EC:1.5.3.17 (click 'Show all' on the enzyme data page), each reaction is listed with the EC numbers of the enzymes that can catalyse it. MetaCyc also lists a number of reactions for each EC number.

MetaCyc

At present, MetaCyc reactions are associated with one EC number, so if two different EC enzymes catalyze the same reaction, there will be two MetaCyc reactions, one for each EC number.


KEGG

KEGG makes reactions independent of the EC number; you can look up an EC number and see the reactions that the enzyme performs (e.g. EC:1.1.1.21), or you can look up a reaction and see which EC enzymes perform that reaction (e.g. R01036). Nifty!
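The two lookups described here are two views of the same many-to-many relation. The sketch below inverts an EC-to-reactions table into a reaction-to-ECs table. It makes no call to KEGG and does not reflect how KEGG stores its data; the function name and all identifiers in the sample table are hypothetical placeholders.

```python
from collections import defaultdict


def invert_ec_mapping(ec_to_reactions: dict[str, list[str]]) -> dict[str, list[str]]:
    """Turn an EC -> reactions table into a reaction -> EC numbers table."""
    reaction_to_ecs: dict[str, list[str]] = defaultdict(list)
    for ec_number, reactions in ec_to_reactions.items():
        for reaction in reactions:
            reaction_to_ecs[reaction].append(ec_number)
    return {reaction: sorted(ecs) for reaction, ecs in reaction_to_ecs.items()}


# Hypothetical data: two enzymes that share one reaction.
EXAMPLE = {
    "EC:0.0.0.1": ["RXN_A", "RXN_B"],
    "EC:0.0.0.2": ["RXN_B"],
}

by_reaction = invert_ec_mapping(EXAMPLE)
# Reactions catalysed by more than one EC enzyme cannot map 1:1 to an EC number.
shared = [reaction for reaction, ecs in by_reaction.items() if len(ecs) > 1]
print(by_reaction)
print(shared)
```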


Reactome

Reactome provide mappings of their terms to GO terms, so they do the work for us! Whenever a release occurs, GO retrieves the updated mappings and the ontology is updated appropriately.


NAD(P)

Here is the (September 2013) view from the IUBMB committee (Keith Tipton and other members clarified this) after canvassing by Kristian Axelson and Alan Bridge:

Our classification rules (on both websites) clearly state in rule 18: "Where the enzyme can use either coenzyme, this should be indicated by writing NAD(P)+".

For further info see rule 18 on systematic names at http://www.chem.qmul.ac.uk/iubmb/enzyme/rules.html

So the meaning is really: "the enzyme can use both", rather than "the reaction may contain either".


 alditol + NAD(P)+ = aldose + NAD(P)H + H+

means that the enzyme performs

 alditol + NAD+ = aldose + NADH + H+

AND

 alditol + NADP+ = aldose + NADPH + H+

HOWEVER, this conflicts with the way that GO uses ChEBI. ChEBI defines NAD(P) (CHEBI:25524) as "A coenzyme that may be NAD or NADP", so it refers to either NAD (CHEBI:13389) or NADP (CHEBI:25523). To classify these types of reactions correctly, the specific participants should be indicated in subclass relations. For a gene product that can use both substrates, the information should be captured at the level of annotation by annotating to both children.
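The IUBMB reading of NAD(P) described earlier, in which one NAD(P) equation stands for a NAD reaction and a NADP reaction, can be sketched as a plain text substitution. This is only an illustration on reaction strings written as on this page; the function name `expand_nadp` is hypothetical, and no ChEBI or GO lookup is involved.

```python
def expand_nadp(equation: str) -> list[str]:
    """Expand a NAD(P) equation into its NAD-specific and NADP-specific forms."""
    if "NAD(P)" not in equation:
        # Nothing to expand: the equation already names a specific coenzyme.
        return [equation]
    return [
        equation.replace("NAD(P)", "NAD"),
        equation.replace("NAD(P)", "NADP"),
    ]


for specific in expand_nadp("alditol + NAD(P)+ = aldose + NAD(P)H + H+"):
    print(specific)
```

The two resulting equations are the specific reactions that the child terms would represent when a gene product is annotated to both.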


Review Status

Last reviewed:
