Annotation Guidelines

From GO Wiki
Revision as of 07:23, 5 October 2022 by Pascale (talk | contribs)
 This is the new Annotation Guidelines page (to be completed & reviewed)

New proposed index page: https://docs.google.com/document/d/1UlAsmGOrEbpOaKlCApzdbvt48D_EMgA5gho5ptfTSvE/edit#

Need to include this information: https://docs.google.com/document/d/1DS2ObQc1EVcQKAHjKNt-0jzFCSSVLDQ4Osm-1dsTwdA/edit

General introduction to GO annotation

A general introduction to the Gene Ontology and GO annotations is available on the GO website.  

Standard GO annotations are defined as an association between a gene and a biological concept from one of the three GO aspects: Molecular Function (MF), Biological Process (BP), and Cellular Component (CC). Standard annotations always contain a reference (either a published, peer-reviewed paper or an internal GO reference) and an Evidence Code that indicates the type of experiment or method used to make the assertion.

  • Annotation extensions. Standard GO annotations may further be qualified using annotation extensions that provide additional biological context to a GO term using a relation from the Relations Ontology (RO) and a term from GO or an external ontology, e.g. UBERON.
  • Unknown MF, BP, CC. If any aspect is unknown, an annotation should be made to the root term of that aspect. This reflects the assumption that every gene has some molecular function that is part of some biological process and occurs in some cellular component, even when one or more of these aspects is not yet known.
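
As a rough illustration of the definition above, a standard annotation can be pictured as a small record. This is only a sketch, not an official GO data format: the class name `Annotation`, its field names, and the helper `root_annotation` are hypothetical, and the gene identifier and reference strings are placeholders. The sketch also assumes that root-term annotations carry the ND (No biological Data available) evidence code.

```python
from dataclasses import dataclass, field

# GO identifiers of the three root terms, one per aspect.
ROOT_TERMS = {
    "MF": "GO:0003674",  # molecular_function
    "BP": "GO:0008150",  # biological_process
    "CC": "GO:0005575",  # cellular_component
}


@dataclass
class Annotation:
    """Hypothetical record for one standard GO annotation."""
    gene: str            # gene or gene product identifier
    aspect: str          # "MF", "BP" or "CC"
    go_term: str         # GO term identifier
    reference: str       # paper or internal GO reference
    evidence_code: str   # type of experiment or method
    # Optional annotation extensions as (RO relation, term) pairs.
    extensions: list = field(default_factory=list)


def root_annotation(gene, aspect, reference, evidence_code="ND"):
    """Annotate an unknown aspect to its root term (ND is an assumption)."""
    return Annotation(gene, aspect, ROOT_TERMS[aspect], reference, evidence_code)


# Placeholder identifiers, for illustration only.
unknown_bp = root_annotation("GENE:example", "BP", "REF:placeholder")
print(unknown_bp.go_term)
```

Keeping the extensions as a list on the annotation mirrors the idea that they qualify a single GO term rather than stand alone.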

Causal Activity Models (GO-CAM models) link the activities performed by gene products in a causal framework, using relations from the Relations Ontology (RO). GO-CAMs link GO annotations together with biological entities and external ontology terms to model how a gene functions in the broader context of a biological process or pathway. GO-CAMs thus provide structured descriptions of biological systems and allow for interrogation of causal events in biology through the use of clearly defined and consistently applied semantics. A summary of the GO-CAM model specifications is presented in Figure 1.

  • Activity Unit. The basic unit of a GO-CAM model is the Activity Unit, outlined in Figure 1, which represents a set of standard GO annotations with select annotation extensions, e.g. the inputs and outputs of a molecular function. GO-CAM models are constructed by filling in as many pieces of relevant information in an Activity Unit as possible and then linking different Activity Units in a causal chain to model a biological process. Thus, GO-CAM models use standard GO annotations as the foundation on which to build more comprehensive representations of biology.
Figure 1. GO-CAM Model Specifications
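
The Activity Unit and the causal chain described above can be sketched as a small data structure. This is a minimal illustration under stated assumptions, not the GO-CAM specification: the class names `ActivityUnit` and `Model`, the field names (chosen to resemble relation labels such as enabled by, part of and occurs in), and the relation string used in the example are all hypothetical, and the gene product identifiers are placeholders.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ActivityUnit:
    """Hypothetical Activity Unit: one MF annotation plus its context."""
    enabled_by: str                    # gene product carrying out the activity
    molecular_function: str            # GO MF term
    part_of: Optional[str] = None      # GO BP term, if known
    occurs_in: Optional[str] = None    # GO CC term, if known
    inputs: list = field(default_factory=list)
    outputs: list = field(default_factory=list)


@dataclass
class Model:
    """Hypothetical model: Activity Units linked by causal relations."""
    units: dict = field(default_factory=dict)
    causal_edges: list = field(default_factory=list)

    def add_unit(self, key, unit):
        self.units[key] = unit

    def link(self, upstream, relation, downstream):
        # Both ends must already exist as Activity Units in the model.
        for key in (upstream, downstream):
            if key not in self.units:
                raise KeyError(key)
        self.causal_edges.append((upstream, relation, downstream))


model = Model()
# GO:0004672 is protein kinase activity; gene identifiers are placeholders.
model.add_unit("kinase", ActivityUnit("GENE:example1", "GO:0004672"))
model.add_unit("target", ActivityUnit("GENE:example2", "GO:0003674"))
# The relation label here is illustrative only.
model.link("kinase", "directly positively regulates", "target")
print(model.causal_edges)
```

Storing the causal edges separately from the units keeps each Activity Unit readable as a set of standard annotations, while the edge list carries the causal chain.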


 TO INCLUDE SOMEWHERE (2)
GO-CAM annotation workflow
 The ultimate goal for GO-CAMs is to create a knowledge graph whereby users can use the GO to traverse a causal representation of a biological system. To that end, curators should try, as much as possible, to make individual annotations in the context of the overall process being modeled. See also Tips_to_Produce_High_Quality_Annotations.
 It can be very helpful to refer to a summary figure from a recent research article or review to help visualize a potential GO-CAM.
 When making a GO-CAM model, we suggest working through these questions:
 * What are the main activities (MFs) of each of the gene products in a model?
 * How do those activities relate, in a causal chain, to each other?
 * What processes are those activities involved in?
 * Where do the activities occur?
 Even when annotating a single paper, try to incorporate as much of this workflow as possible. This will make it easier, in the future, to build on existing models with new curation.
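
The workflow questions above can be treated as a checklist for each activity in a model. The sketch below is a hypothetical helper, not part of any GO tool: the dictionary keys and the function name `open_questions` are assumptions made for illustration.

```python
# Each workflow question, keyed by the (hypothetical) field that answers it.
WORKFLOW_QUESTIONS = {
    "molecular_function": "What is the main activity (MF) of the gene product?",
    "causal_links": "How does the activity relate, in a causal chain, to the others?",
    "biological_process": "What process is the activity involved in?",
    "cellular_component": "Where does the activity occur?",
}


def open_questions(activity):
    """Return the workflow questions that an activity record leaves unanswered."""
    return [
        question
        for key, question in WORKFLOW_QUESTIONS.items()
        if not activity.get(key)
    ]


# An activity curated from a single paper, with only the MF known so far.
partial = {"molecular_function": "GO:0004672"}
for question in open_questions(partial):
    print(question)
```

Listing the unanswered questions makes it easier to see where later curation can build on an existing model.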

Review Status

Draft: 2021-02-28 Patrick Masson, Pascale Gaudet