Annotating regulation

From GO Wiki


The GO Consortium recognized quite early on in the development of the Biological Process ontology that there were gene products that participated directly in a process and gene products that regulated a process, positively and/or negatively. But how do curators know to which of these terms they should annotate, and is it possible, for a given process, to annotate the same gene product to both a parent term and one of its associated regulation terms? To begin to address these questions, here are some guidelines for annotating, or not, to regulation terms:

Guideline 1: Use existing biological knowledge to define the process.

To determine whether a gene product participates in a process or regulates that process (or both), curators need to consider the nature of the process. Processes can be considered as ordered assemblies of molecular functions, and every process has a beginning, middle, and end. Use existing biological knowledge and the paper being curated as guides. Is there a defined pathway, i.e. distinct molecular functions, and have the gene products that perform those functions been identified? Does the gene product being annotated perform one of those functions, or a function outside of the process that might start, stop, or change the rate at which the process proceeds?

In reality, the beginning, middle, and end of some processes will be easier to define than others. For example, signaling pathways, such as MAPK signaling, will be easier to define than broader, organismal-level processes such as embryonic development. Curators should use their judgement, based on the published literature, to guide their annotation.

Example: Atg1

Saccharomyces cerevisiae Atg1 encodes a protein kinase that is involved in autophagy: "The process by which cells digest parts of their own cytoplasm; allows for both recycling of macromolecular constituents under conditions of cellular stress and remodeling the intracellular structure for cell differentiation." Atg1 activity is critical for the induction of autophagy, specifically for formation of autophagic vacuoles. Should Atg1 be annotated to autophagic vacuole formation or regulation of autophagic vacuole formation? Authors have used language that could lead curators to make annotations to either term. In this case, annotators need to consider the sum of what is known about the autophagic pathway and Atg1's role in that pathway. Using that knowledge, SGD has annotated Atg1 to the parent process term, autophagic vacuole formation, because once Atg1 is active, the 'go' or 'no go' decision for autophagy has already been made. Genes that act further upstream appear to be the ones actually regulating the autophagic pathway.

Guideline 2: If you aren't sure, consider annotating to the parent process term with the GP2term relation 'acts upstream of or within'.

If the gene product performs one of the functions that make up the process, annotate directly to the process. If the gene product regulates the process, annotate it to the regulation term for that process. If you aren't sure which term to use, annotate to the parent process term with the relation 'acts upstream of or within'. As more information about the process becomes available, you may be able to refine your annotations (see Guideline 4 below).
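The choice described in this guideline can be sketched as a small decision function. This is only an illustration, not part of any GO tool: the `Role` categories, the `Suggestion` type, and the `suggest_annotation` function are hypothetical names, and the use of 'involved in' for the two clear-cut cases is an assumption that this page does not state.

```python
from enum import Enum
from typing import NamedTuple


class Role(Enum):
    """What the evidence says about the gene product (hypothetical categories)."""
    PERFORMS_STEP = "performs one of the functions within the process"
    REGULATES = "acts from outside the process to start, stop, or modulate it"
    UNSURE = "the evidence does not distinguish the two"


class Suggestion(NamedTuple):
    relation: str  # gene-product-to-term relation
    term: str      # GO term label


def suggest_annotation(process_term: str, role: Role) -> Suggestion:
    """Suggest a relation and term label following Guideline 2."""
    if role is Role.PERFORMS_STEP:
        # Assumption: 'involved in' is used for direct participation.
        return Suggestion("involved in", process_term)
    if role is Role.REGULATES:
        # Assumption: same relation, but to the regulation term.
        return Suggestion("involved in", f"regulation of {process_term}")
    # Unsure: fall back to the parent process term with the broader relation.
    return Suggestion("acts upstream of or within", process_term)


# Atg1 as discussed under Guideline 1: treated as performing a step.
atg1 = suggest_annotation("autophagic vacuole formation", Role.PERFORMS_STEP)
unsure = suggest_annotation("autophagic vacuole formation", Role.UNSURE)
```

In practice the hard part is deciding which `Role` applies, which is a matter of curator judgement based on the literature and cannot be automated this way.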

Guideline 3: Improve the ontology by defining, wherever possible, the beginning, middle, and end of a process.

  • Wherever possible, include the beginning, middle, and end of a process in the corresponding term definition. This will help annotators choose the appropriate term for their annotations.

Guideline 4: Revisit annotations when new knowledge becomes available.

GO annotations should reflect the present state of biological knowledge. Therefore, as the understanding of a biological process improves, it may be necessary to revisit and refine existing annotations.

Guideline 5: Annotations based on mutant phenotypes should take mechanism into account.

Mutant phenotypes are often used to make annotations to regulation terms because they fit the criteria of the term definition, i.e. authors report a change in the frequency, rate, or extent of a process. However, to use IMP correctly for regulation annotations, it is important to consider several factors, including: 1) the assay type, 2) the nature of the alleles (null vs. reduction of function), and 3) the molecular identity of the gene product. Again, if it isn't clear that a gene product is involved in regulation, it is better to annotate to the parent process term.

Example: muscle contraction and C. elegans mutants

In C. elegans, a number of genes can mutate to paralysis or slowed locomotion due to defects in muscle contraction. These include genes that encode everything from myosin heavy chain to calcium channels to transcription factors. Depending upon the nature of the allele, the mutant phenotypes for the same gene can sometimes suggest both process and regulation terms. In this case, consideration of the process, the nature of the allele (complete or partial loss of function), and the molecular identity of the gene product can guide curators in making the appropriate annotation.

Guideline 6: Some gene products may be annotated to both a process and regulation of that process.

Positive and negative feedback loops are an essential part of many signaling pathways. If one member of a pathway regulates the activity of a different member of the pathway, it could be annotated to both the process and regulation of that process. When annotating gene products involved in a signaling pathway, however, curators should not annotate gene products that directly activate the next gene product in the pathway to regulation of that pathway. For example, MAPKK would not be annotated to positive regulation of MAPKKK cascade just because it phosphorylates and activates MAPK. However, gene products that, for example, feed back onto earlier steps in the pathway may be annotated to both the parent process term and a regulation term.

Example: ERK1/2

ERK1/2 activation requires activity of FRS2alpha which, in turn, is negatively regulated by activated ERK1/2. Could ERK1/2 be annotated to both MAPKKK cascade and negative regulation of MAPKKK cascade?

Reference: Phosphoprotein Enriched in Astrocytes 15 kDa (PEA-15) Reprograms Growth Factor Signaling by Inhibiting Threonine Phosphorylation of Fibroblast Receptor Substrate 2α

Cases where the presence or absence of one of the members of a pathway is limiting should not be annotated to regulation, e.g. if the amount of a receptor on the surface of a cell regulates the process, the receptor should not be annotated to the regulation term.
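The pathway rule in this guideline can be sketched in code as follows. This is a minimal illustration under stated assumptions: the function name `candidate_terms`, the `"activates"`/`"inhibits"` labels, and the linear ordering of the pathway (including the two intermediate steps) are hypothetical simplifications, and the output is a list of candidate terms for a curator to evaluate, not a final annotation.

```python
from typing import Dict, List


def candidate_terms(pathway: List[str], process_term: str,
                    member: str, targets: Dict[str, str]) -> List[str]:
    """List candidate GO term labels for one member of a linear pathway.

    pathway: members ordered from upstream to downstream (simplified).
    targets: maps each pathway member that `member` acts on to
             "activates" or "inhibits" (hypothetical labels).
    """
    position = pathway.index(member)
    terms = [process_term]  # every pathway member is part of the process
    for target, effect in targets.items():
        # Acting on a downstream member is just the pathway proceeding,
        # so it does not add a regulation term.
        if pathway.index(target) >= position:
            continue
        # Feedback onto an earlier step may justify a regulation term.
        sign = "positive" if effect == "activates" else "negative"
        term = f"{sign} regulation of {process_term}"
        if term not in terms:
            terms.append(term)
    return terms


# Hypothetical, simplified ordering for the ERK1/2 example.
pathway = ["FRS2alpha", "MAPKKK", "MAPKK", "ERK1/2"]

mapkk_terms = candidate_terms(pathway, "MAPKKK cascade",
                              "MAPKK", {"ERK1/2": "activates"})
erk_terms = candidate_terms(pathway, "MAPKKK cascade",
                            "ERK1/2", {"FRS2alpha": "inhibits"})
```

Here `mapkk_terms` contains only the process term, while `erk_terms` also contains the negative regulation term. The sketch deliberately ignores the limiting-component case (such as receptor abundance), which the guideline says should not be annotated to regulation.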