Annotation Guidelines: Difference between revisions

From GO Wiki
 
(47 intermediate revisions by the same user not shown)


Need to include this information: https://docs.google.com/document/d/1DS2ObQc1EVcQKAHjKNt-0jzFCSSVLDQ4Osm-1dsTwdA/edit


==General introduction to GO annotation==
A general introduction to GO annotation is available on the [http://geneontology.org/docs/go-annotations/ GO website].  
A general introduction to the [http://geneontology.org/docs/ontology-documentation/ Gene Ontology] and [http://geneontology.org/docs/go-annotations/ GO annotations] is available on the GO website.  
 
==The three aspects of GO: Molecular Function, Biological Process and Cellular Component==
Each gene should be annotated to at least one term from each aspect. If any aspect is unknown, an annotation should be made to the root term of that aspect. This means that the gene has some molecular function that is part of some biological process that occurs in some cellular component, but one or more of these aspects may not be known.
 
===Molecular Function===
Wherever possible, curators should strive to select the single most granular GO Molecular Function (MF) term that best describes the overall activity of the gene, gene product, or protein-containing complex being annotated.
If desired, individual "sub-functions" may be captured by using the 'part of' relation between the main MF and its sub-functions.
===Biological Process===
The ultimate aim of GO-CAMs is to create a suite of Biological Process (BP)-centric models that can be used to interrogate causal effects of molecular activities on one another as part of the execution of a larger BP.
When annotating, curators should always think about the BP they are modeling and what MFs are 'part of' that BP.
Additional relations between MFs and BPs, e.g. 'causally upstream of or within', can be used to capture experimental information that, in the future, will be incorporated into a more complete model of that process.
===Cellular Component===
Cellular component annotations are intended to capture where the MF enabled by a gene, gene product, or protein-containing complex occurs.
Cellular component annotations may be further qualified with cell, tissue, and organismal context if that information is germane to the process being modeled.


==Standard annotations and Causal activity Models (GO-CAMs)==
'''Standard GO annotations''' are defined as an association between a gene and a biological concept from one of the three [http://geneontology.org/docs/ontology-documentation/ GO aspects: Molecular Function (MF), Biological Process (BP), and Cellular Component (CC)]. Standard annotations always contain a reference (either a published, peer-reviewed paper or internal GO reference) and an [[Guide_to_GO_Evidence_Codes | Evidence Code]] that indicates the type of experiment or method used to make the assertion.  
To replace 'general' part of http://wiki.geneontology.org/index.php/Noctua
* '''''Annotation extensions'''''. Standard GO annotations may further be qualified using annotation extensions that provide additional biological context to a GO term using a relation from the [https://obofoundry.org/ontology/ro.html Relations Ontology (RO)] and a term from GO or an external ontology, e.g. [http://obophenotype.github.io/uberon/ UBERON].
Tracked version is here: https://docs.google.com/document/d/1ENeVLFs4y3gzZS6Icwt0VfK0clL8j-gNDt-qgwrsMXI/edit
* '''''Unknown MF, BP, CC'''''. If any aspect is unknown, an annotation should be made to the root term. This means that the gene has some molecular function, that is part of some biological process that occurs in some cellular component, but one or more of these aspects may not be known.


===Standard GO Annotations===
'''[http://geneontology.org/docs/gocam-overview/ Causal Activity Models] (GO-CAMs)''' link the activities performed by gene products in a causal framework, using relations from the [https://obofoundry.org/ontology/ro.html Relations Ontology (RO)]. GO-CAMs combine GO annotations with biological entities and external ontology terms to model how a gene functions in the broader context of a biological process or pathway. GO-CAMs thus provide structured descriptions of biological systems and allow for interrogation of causal events in biology through the use of clearly defined and consistently applied semantics.
[http://geneontology.org/docs/ontology-documentation/ GO annotations] have been a key component of GO since its inception. '''Standard annotations''' are defined as an association between a gene and a biological concept from one of the three GO aspects: Molecular Function (MF), Biological Process (BP), and Cellular Component (CC). Standard annotations always contain a reference (either a published, peer-reviewed paper or internal GO reference) and an evidence code that indicates the type of experiment or method used to make the assertion. Standard GO annotations may further be qualified using annotation extensions that provide additional biological context to a GO term using a relation from the [https://obofoundry.org/ontology/ro.html Relations Ontology (RO)] and a term from GO or an external ontology, e.g. [http://obophenotype.github.io/uberon/ UBERON].
 
===GO-CAM Models===
While standard GO annotations are very useful for discerning basic information about genes, they provide only a partial view of each gene's role in a larger biological context. To provide more comprehensive annotation of genes and link their activities in a causal framework, the GO developed [http://geneontology.org/docs/gocam-overview/ '''Causal Activity Models''' (GO-CAMs)]. Using [https://obofoundry.org/ontology/ro.html RO relations], GO-CAMs link GO annotations together with biological entities and external ontology terms to model how a gene functions in the broader context of a biological process or pathway. GO-CAMs thus provide structured descriptions of biological systems and allow for interrogation of causal events in biology through the use of clearly defined and consistently applied semantics.
A summary of the GO-CAM model specifications is presented in Figure 1.
* '''''Activity Unit'''''. The basic unit of a GO-CAM model is the Activity Unit, outlined in Figure 1, which represents a set of standard GO annotations with select annotation extensions, e.g. the inputs and outputs of a molecular function. GO-CAM models are constructed by filling in as many pieces of relevant information in an Activity Unit as possible and then linking different Activity Units in a causal chain to model a biological process. Thus, GO-CAM models use standard GO annotations as the foundation on which to build more comprehensive representations of biology.


===Activity Unit===
[[File:GO CAM figure from Google drive.png|thumb|center|Figure 1. GO-CAM Model Specifications|600px]]
The basic unit of a GO-CAM model is the Activity Unit, outlined on the left, which represents a set of standard GO annotations with select annotation extensions, e.g. the inputs and outputs of a molecular function. GO-CAM models are constructed by filling in as many pieces of relevant information in an Activity Unit as possible and then linking different Activity Units in a causal chain to model a biological process. Thus, GO-CAM models use standard GO annotations as the foundation on which to build more comprehensive representations of biology.
 
[[File:GO CAM figure from Google drive.png|thumb|center|Figure 1. GO-CAM Model Specifications|800px]]
 
===GO-CAM annotation workflow===
  Modified from http://wiki.geneontology.org/index.php/Noctua#GO-CAM_Workflow
 
  Merge with http://wiki.geneontology.org/index.php/Tips_to_Produce_High_Quality_Annotations ?
 
The ultimate goal for GO-CAMs is to create a knowledge graph whereby users can use the GO to traverse a causal representation of a biological system. To that end, curators should try, as much as possible, to make individual annotations in the context of the overall process being modeled.
It can be very helpful to refer to a summary figure from a recent research article or review to help visualize a potential GO-CAM.
When making a GO-CAM model, we suggest working through these questions:
* What are the main activities (MFs) of each of the gene products in a model?
* How do those activities relate, in a causal chain, to each other?
* What processes are those activities involved in?
* Where do the activities occur?
Even when annotating a single paper, try to incorporate as much of this workflow as possible. This will make it easier, in the future, to build on existing models with new curation.
 
 
=== Noctua GO-CAM Editors ===
Noctua has four curation interfaces: 1) the [https://docs.google.com/document/d/1RNbr-2T6CSZfaYKDrom7259IK6ck82vZLThgymyNZ28 Noctua Visual Editor], 2) the Noctua Form, 3) the Noctua Graph Editor, and 4) the Annotation Review Tool.
 
* The [https://docs.google.com/document/d/1RNbr-2T6CSZfaYKDrom7259IK6ck82vZLThgymyNZ28 Noctua Visual Editor] (VPE) is designed for curating GO-CAM models, i.e. models that include at least two activities/MFs linked by causal relations.
* The Noctua Form is a structured annotation form that is recommended for creating 'standard' GO annotations.
* The Noctua Graph Editor is best suited for linking standard annotations together to create a causal model (GO-CAM).
* The Annotation Review tool
 
 
{| class="wikitable"
!colspan="3" | Getting started
|-
!Topic
! Status
! Last reviewed
|-
|[[Noctua Landing Page]]
| Draft
|
|-
|[[Browsing and searching annotations and models]]
| Draft
|
|-
|[[Login]]
| Current
| 2022-05-13
|-
|[[Noctua_Manual:_General_Guidelines_for_Model_Titles |Model titles]]
|
|
|-
!colspan="3" | Getting started
|-
!Topic
! Status
! Last reviewed
|-
|[https://docs.google.com/document/d/1RNbr-2T6CSZfaYKDrom7259IK6ck82vZLThgymyNZ28 Noctua Visual Editor] (VPE)
| Current
| 2022-05-05
|-
|[[Noctua Form]]
| Draft
|
|-
|[[Noctua Graph Editor]]
| Draft
|
|-
| [https://docs.google.com/document/d/1Yo2O7LWj1wdRGbbOlaxX_AFDl99T-P59JOf5YoebX0Q/edit# Annotation Review Tool]
| Current
| 2021-05-26
|-
|}


{| class="wikitable"
!rowspan = "3" | Form Editor
|[http://wiki.geneontology.org/index.php/Noctua_Form_Molecular_Function Molecular Function]
|-
|[http://wiki.geneontology.org/index.php/Noctua_Form_Biological_Process Biological Process]
|-
|[http://wiki.geneontology.org/index.php/Noctua_Form_Cellular_Component Cellular Component]
|-
!rowspan = "3" | Graph Editor
|[http://wiki.geneontology.org/index.php/Noctua_Graph_Molecular_Function Molecular Function]
|-
|[http://wiki.geneontology.org/index.php/Noctua_Graph_Biological_Process Biological Process]
|-
|[http://wiki.geneontology.org/index.php/Noctua_Graph_Cellular_Component Cellular Component]
|-
!colspan="3" |Adding contextual information (annotation extensions)
|-
!rowspan = "3" | Form Editor
|[http://wiki.geneontology.org/index.php/Noctua_Form_MF_Extensions Molecular Function]
|-
|[http://wiki.geneontology.org/index.php/Noctua_Form_BP_Extensions Biological Process]
|-
|[http://wiki.geneontology.org/index.php/Noctua_Form_CC_Extensions Cellular Component]
|-
!rowspan = "3" | Graph Editor
|[http://wiki.geneontology.org/index.php/Noctua_Graph_MF_Extensions Molecular Function]
|-
|[http://wiki.geneontology.org/index.php/Noctua_Graph_BP_Extensions Biological Process]
|-
|[http://wiki.geneontology.org/index.php/Noctua_Graph_CC_Extensions Cellular Component]
|-
!colspan="2" | Editing annotations
|-
|[http://wiki.geneontology.org/index.php/Noctua_Manual:_Editing_in_form Form Editor]
|-
|[[Graph Editor]]
|-
!colspan="2" | Creating GO-CAMs
|-
!rowspan = "2" | Creating an activity unit
|[http://wiki.geneontology.org/index.php/Noctua_Manual:_Creating_an_Activity_Unit_in_Form Form Editor]
|-
|[[Graph Editor]]
|-
!rowspan = "2" | Linking Activities
|[http://wiki.geneontology.org/index.php/Noctua_Manual:_Linking_Activities_in_Form Form Editor]
|-
|[[Graph Editor]]
|-
!colspan = "2" | Model metadata
|-
!rowspan = "3" | Model titles
|[http://wiki.geneontology.org/index.php/Noctua_Manual:_General_Guidelines_for_Model_Titles General Guidelines]
|-
|[http://wiki.geneontology.org/index.php/Noctua_Manual:_Naming_Models_in_Form Form Editor]
|-
|[[Graph Editor]]
|-
!rowspan = "2" | Releasing models to production
|[http://wiki.geneontology.org/index.php/Noctua_Manual:_Making_Models_Production_in_Form Form Editor]
|-
|[[Graph Editor]]
|-
!rowspan = "7" | Other tips and tricks
|[http://wiki.geneontology.org/index.php/Noctua_Manual:_Adding_NOT_Qualifer_in_Form Adding a NOT qualifier to an annotation]
|-
|[http://wiki.geneontology.org/index.php/Noctua_Manual:_Importing_existing_annotations Importing existing annotations]
|-
|[http://wiki.geneontology.org/index.php/Noctua_Manual:_Changing_annotation_group Changing annotation group]
|-
|[http://wiki.geneontology.org/index.php/Noctua_Manual:_Model_validation Model validation]
|-
|[http://wiki.geneontology.org/index.php/Noctua_Manual:_Running_the_reasoner Running the reasoner]
|-
|[http://wiki.geneontology.org/index.php/Noctua_Manual:_Viewing_GPAD_export Viewing GPAD export]
|-
|[http://wiki.geneontology.org/index.php/Noctua_Manual:_Using_templates Using templates]
|-
|}


==External ontologies for annotation==
  TO INCLUDE SOMEWHERE (2)
   from http://wiki.geneontology.org/index.php/Noctua#Ontologies_for_Annotation
GO-CAM annotation workflow
* To provide appropriate biological context to an annotation or an activity unit, additional ontologies may be used either in GO-CAM or in annotation extensions [link].
   The ultimate goal for GO-CAMs is to create a knowledge graph whereby users can use the GO to traverse a causal representation of a biological system. To that end, curators should try, as much as possible, to make individual annotations in the context of the overall process being modeled. See also [[Tips_to_Produce_High_Quality_Annotations]].
==Cell and Anatomy Ontologies==
  It can be very helpful to refer to a summary figure from a recent research article or review to help visualize a potential GO-CAM.
* Describes the location where processes and functions occur.
  When making a GO-CAM model, we suggest working through these questions:
* Describes the location of a GO cellular component.
  * What are the main activities (MFs) of each of the gene products in a model?
* Add list
  * How do those activities relate, in a causal chain, to each other?
==Biological Phase and Life Stage Ontologies==
  * What processes are those activities involved in?
* Describes the temporal period during which processes and functions occur.
  * Where do the activities occur?
* Describes the temporal period during which a cellular component or anatomical entity exists.
  Even when annotating a single paper, try to incorporate as much of this workflow as possible. This will make it easier, in the future, to build on existing models with new curation.
* Add list
==Chemical Ontology (ChEBI) [link]==
* Can be used to capture inputs and outputs of processes and functions.
* GO-CAM uses the Chemical Entities of Biological Interest (ChEBI) ontology.
* Sequence Ontology [link]


== Review Status ==

Latest revision as of 08:23, 5 October 2022

 This is the new Annotation Guidelines page (to be completed & reviewed)

New proposed index page: https://docs.google.com/document/d/1UlAsmGOrEbpOaKlCApzdbvt48D_EMgA5gho5ptfTSvE/edit#

Need to include this information: https://docs.google.com/document/d/1DS2ObQc1EVcQKAHjKNt-0jzFCSSVLDQ4Osm-1dsTwdA/edit

General introduction to GO annotation

A general introduction to the Gene Ontology and GO annotations is available on the GO website.  

Standard GO annotations are defined as an association between a gene and a biological concept from one of the three GO aspects: Molecular Function (MF), Biological Process (BP), and Cellular Component (CC). Standard annotations always contain a reference (either a published, peer-reviewed paper or internal GO reference) and an Evidence Code that indicates the type of experiment or method used to make the assertion.

  • Annotation extensions. Standard GO annotations may further be qualified using annotation extensions that provide additional biological context to a GO term using a relation from the Relations Ontology (RO) and a term from GO or an external ontology, e.g. UBERON.
  • Unknown MF, BP, CC. If any aspect is unknown, an annotation should be made to the root term. This means that the gene has some molecular function, that is part of some biological process that occurs in some cellular component, but one or more of these aspects may not be known.
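
The structure of a standard annotation described above can be sketched as a small data model. This is only an illustration, not a GO file format or a Noctua API: the class and function names, the gene identifier `GENE:1`, and the reference `PMID:0000000` are hypothetical, and the use of the ND evidence code with GO_REF:0000015 for root-term annotations is an assumption that should be checked against the evidence code guide.

```python
from dataclasses import dataclass, field

# Root term of each GO aspect, used when that aspect is unknown.
ROOT_TERMS = {
    "MF": "GO:0003674",  # molecular_function
    "BP": "GO:0008150",  # biological_process
    "CC": "GO:0005575",  # cellular_component
}


@dataclass(frozen=True)
class Extension:
    """Annotation extension: an RO relation plus a GO or external ontology term."""
    relation: str
    target: str


@dataclass
class StandardAnnotation:
    """Association between a gene and a GO term, with reference and evidence."""
    gene: str
    go_term: str
    aspect: str          # "MF", "BP" or "CC"
    reference: str
    evidence_code: str
    extensions: list = field(default_factory=list)


def root_annotations_for_unknown(gene, annotations):
    """Return root-term annotations for every aspect the gene lacks."""
    covered = {a.aspect for a in annotations if a.gene == gene}
    return [
        # Assumption: ND with GO_REF:0000015 marks an unknown aspect.
        StandardAnnotation(gene, ROOT_TERMS[aspect], aspect, "GO_REF:0000015", "ND")
        for aspect in ROOT_TERMS
        if aspect not in covered
    ]


# Hypothetical gene with one experimentally supported MF annotation,
# qualified with an anatomical context from UBERON.
kinase = StandardAnnotation(
    gene="GENE:1",
    go_term="GO:0004672",  # protein kinase activity
    aspect="MF",
    reference="PMID:0000000",  # placeholder reference
    evidence_code="IDA",
    extensions=[Extension("occurs in", "UBERON:0002107")],  # liver
)

for annotation in root_annotations_for_unknown("GENE:1", [kinase]):
    print(annotation.aspect, annotation.go_term, annotation.evidence_code)
```

In this sketch the gene has a known molecular function only, so the helper proposes root-term annotations for the Biological Process and Cellular Component aspects.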

Causal Activity Models (GO-CAMs) link the activities performed by gene products in a causal framework, using relations from the Relations Ontology (RO). GO-CAMs combine GO annotations with biological entities and external ontology terms to model how a gene functions in the broader context of a biological process or pathway. GO-CAMs thus provide structured descriptions of biological systems and allow for interrogation of causal events in biology through the use of clearly defined and consistently applied semantics. A summary of the GO-CAM model specifications is presented in Figure 1.

  • Activity Unit. The basic unit of a GO-CAM model is the Activity Unit, outlined in Figure 1, which represents a set of standard GO annotations with select annotation extensions, e.g. the inputs and outputs of a molecular function. GO-CAM models are constructed by filling in as many pieces of relevant information in an Activity Unit as possible and then linking different Activity Units in a causal chain to model a biological process. Thus, GO-CAM models use standard GO annotations as the foundation on which to build more comprehensive representations of biology.
Figure 1. GO-CAM Model Specifications
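
As a rough sketch of the idea above, an Activity Unit can be pictured as a record with slots for the enabling gene product, its molecular function, and the surrounding context, and a model as a set of such units joined by causal edges. The class names, the activity keys, the gene identifiers, and the choice of "directly positively regulates" as the causal relation are assumptions made for illustration; they do not reflect how Noctua stores models.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ActivityUnit:
    """One molecular activity plus whatever context is known for it."""
    enabled_by: str                      # gene product or complex
    molecular_function: str              # GO MF term
    part_of: Optional[str] = None        # GO BP term
    occurs_in: Optional[str] = None      # GO CC term
    has_input: list = field(default_factory=list)
    has_output: list = field(default_factory=list)


@dataclass
class GoCamModel:
    """Activity Units linked into a causal chain."""
    title: str
    activities: dict = field(default_factory=dict)
    causal_edges: list = field(default_factory=list)

    def add_activity(self, key, unit):
        self.activities[key] = unit

    def link(self, source, relation, target):
        """Record a causal relation between two existing activities."""
        for key in (source, target):
            if key not in self.activities:
                raise KeyError(f"unknown activity: {key}")
        self.causal_edges.append((source, relation, target))

    def downstream(self, start):
        """Breadth-first list of activities causally downstream of start."""
        seen = {start}
        queue = deque([start])
        order = []
        while queue:
            current = queue.popleft()
            for source, _relation, target in self.causal_edges:
                if source == current and target not in seen:
                    seen.add(target)
                    order.append(target)
                    queue.append(target)
        return order


# Hypothetical three-step signaling model.
model = GoCamModel("hypothetical signaling model")
model.add_activity("receptor", ActivityUnit(
    "GENE:A", "GO:0004888", part_of="GO:0007165", occurs_in="GO:0005886"))
model.add_activity("kinase", ActivityUnit(
    "GENE:B", "GO:0004672", part_of="GO:0007165", occurs_in="GO:0005829"))
model.add_activity("tf", ActivityUnit(
    "GENE:C", "GO:0003700", part_of="GO:0007165", occurs_in="GO:0005634"))
model.link("receptor", "directly positively regulates", "kinase")
model.link("kinase", "directly positively regulates", "tf")

print(model.downstream("receptor"))
```

Traversing from the receptor activity visits the kinase and then the transcription factor, which is the kind of causal walk a completed model is meant to support.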


 TO INCLUDE SOMEWHERE (2)
GO-CAM annotation workflow
 The ultimate goal for GO-CAMs is to create a knowledge graph whereby users can use the GO to traverse a causal representation of a biological system. To that end, curators should try, as much as possible, to make individual annotations in the context of the overall process being modeled. See also Tips_to_Produce_High_Quality_Annotations.
 It can be very helpful to refer to a summary figure from a recent research article or review to help visualize a potential GO-CAM.
 When making a GO-CAM model, we suggest working through these questions:
 * What are the main activities (MFs) of each of the gene products in a model?
 * How do those activities relate, in a causal chain, to each other?
 * What processes are those activities involved in?
 * Where do the activities occur?
 Even when annotating a single paper, try to incorporate as much of this workflow as possible. This will make it easier, in the future, to build on existing models with new curation.
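
The four workflow questions above can also be read as a checklist for a draft model. The following minimal sketch assumes a draft is held as plain dictionaries and tuples; the field names `molecular_function`, `part_of` and `occurs_in`, the activity keys, and the function name are all hypothetical and chosen only for this example.

```python
def workflow_gaps(activities, causal_edges):
    """Report, per activity, which workflow questions are still unanswered.

    activities: dict mapping an activity key to a dict of its known fields.
    causal_edges: list of (source_key, relation, target_key) tuples.
    """
    # Every activity that takes part in at least one causal relation.
    linked = {source for source, _, _ in causal_edges}
    linked |= {target for _, _, target in causal_edges}

    gaps = {}
    for key, unit in activities.items():
        missing = []
        if not unit.get("molecular_function"):
            missing.append("main activity (MF)")
        if key not in linked:
            missing.append("causal link to another activity")
        if not unit.get("part_of"):
            missing.append("process (BP)")
        if not unit.get("occurs_in"):
            missing.append("location (CC)")
        if missing:
            gaps[key] = missing
    return gaps


# Hypothetical draft built from a single paper: the second activity
# has a known function but no process or location yet.
draft_activities = {
    "kinase": {
        "molecular_function": "GO:0004672",  # protein kinase activity
        "part_of": "GO:0007165",             # signal transduction
        "occurs_in": "GO:0005829",           # cytosol
    },
    "tf": {
        "molecular_function": "GO:0003700",  # DNA-binding transcription factor activity
    },
}
draft_edges = [("kinase", "directly positively regulates", "tf")]

for key, missing in workflow_gaps(draft_activities, draft_edges).items():
    print(key, "is missing:", ", ".join(missing))
```

A report like this is only a prompt for the curator: a gap may reflect information that the paper does not provide rather than an omission.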

Review Status

Draft: 2021-02-28 Patrick Masson, Pascale Gaudet