Elements of an annotation: Difference between revisions

From GO Wiki
mNo edit summary
 
(26 intermediate revisions by 3 users not shown)
Line 1: Line 1:
 
This page describes the different annotation fields.
  TO BE COMPLETED: Add something about annotation extensions + isoforms
 
 


=Elements of an annotation=
=Elements of an annotation=
==Annotation Subject==
==Annotation Subject==
* Annotation subjects consist of valid database identifiers, such as WB:WBGene00003721 or SGD:S000001048.
* Annotation subjects consist of valid database identifiers, such as WB:WBGene00003721, SGD:S000001048, or UniProtKB:P99999.
* Annotation subjects may be genes or gene products (e.g. proteins, including specific isoforms; ncRNAs; and protein complexes).
* Annotation subjects may be genes or gene products (e.g. proteins, including specific isoforms; ncRNAs; and protein complexes).
* The list of valid database prefixes can be found on the [http://amigo.geneontology.org/xrefs GO website].
* The list of valid database prefixes can be found on the [http://amigo.geneontology.org/xrefs GO website].


==Relations==
==Relations==
* Annotation Subjects and GO terms are linked by a '''Relation''' from the [https://github.com/oborel/obo-relations Relations Ontology] (note that this information is unavailable in [https://geneontology.github.io/docs/go-annotation-file-gaf-format-21/ GAF files], but explicit in [https://geneontology.github.io/docs/gene-product-association-data-gpad-format/ GPAD files]). Guidelines for usage of Relations can be found on the [[Annotation_Relations]] page.
* Annotation Subjects and GO terms are linked by a '''Relation''' from the [https://github.com/oborel/obo-relations Relations Ontology].
* The specific relations applicable to each aspect of GO are listed in the [[Annotation_Relations]] page.
* The relations applicable to each aspect of GO as well as usage guidelines can be found in the [[Annotation_Relations]] page.


==Negation==
==Negation==


*NOT is used to make an annotation statement that the gene product is not associated with the GO term.
* The NOT statement indicates that the gene product does not enable a Molecular Function, is not part of a Biological Process or is not located in a specific Cellular Component.
*When combined with an explicit annotation relation, e.g. enables, the NOT qualifier indicates that the gene product does not have that relationship to the GO term.
* NOT statements are only used when a user might expect that the gene product would have a specific biological property (MF, BP or CC).  
*NOT may be used with terms from any of the three ontologies.
* Contrary to positive annotations, NOT statements propagate ''down'' the ontology, such that the annotation <code>gene product NOT enables protein kinase activity</code> means that the gene product does not enable protein serine/threonine kinase activity or protein tyrosine kinase activity either.
 
* Both positive and NOT statements can be used in cases where there are conflicting experimental findings in the literature.
In practice, the NOT qualifier is used in two ways:
* If an isoform has a different function from the main isoform represented by the gene-centric entity, a NOT annotation can be captured together with the isoform identifier.
 
* NOT annotations can be supported by experiments that show the lack of an activity (or of involvement in a BP, or of location in a CC), or by sequence data showing that the gene product has lost essential residues and is unlikely to be able to carry out a function, participate in a process, or be found in a certain location. In the latter case, the evidence code is [http://wiki.geneontology.org/index.php/Inferred_from_Key_Residues_(IKR) IKR (Inferred from Key Residues)].
#When a GO term might otherwise be expected to apply to a gene product, but an experiment, sequence analysis, etc. demonstrates otherwise.
#When there are conflicting experimental findings in the literature and curators would like to accurately capture all relevant data.
 
Use of the NOT qualifier is particularly important in cases where associating a GO term with a gene product should be avoided (but might otherwise be made, especially by an automated method). For example, if a protein has sequence similarity to an enzyme (whose activity is represented as Molecular Function GO:nnnnnnn), but has been shown experimentally not to have the enzymatic activity, it can be annotated as NOT GO:nnnnnnn.  


In phylogenetic-based annotation, i.e. PAINT, the NOT qualifier is used in conjunction with the [http://wiki.geneontology.org/index.php/Inferred_from_Key_Residues_(IKR) IKR (Inferred from Key Residue)] evidence code.  Here, NOT is used to annotate a gene product when, although homologous to a particular protein family, it has lost essential residues and is very unlikely to be able to carry out an associated function, participate in the expected associated process, or be found in a certain location.
'''When NOT to use negation'''
* The NOT qualifier should not be used to capture every experimental result.
** For example, in a subcellular localization experiment, locations where the gene product is not found should not be captured unless this is explicitly needed. If a protein is found in the nucleus but not in the mitochondrion, a <code>NOT located in mitochondrion</code> annotation would usually be inappropriate.
* The NOT qualifier should not be used to annotate negative or inconclusive experimental results.
** For example, if a mutant still develops a specific anatomical structure, this does not imply that the gene product does not contribute to the process; the experiment may simply not allow a positive conclusion to be drawn.
* The NOT qualifier should not be used to describe experimental conditions or specific contexts in which the gene product is not active (i.e., it should not be used in combination with an extension).


'''The NOT qualifier is not used to annotate negative or inconclusive experimental results.'''
'''Examples'''
* '''MNN4 (CGD:CAL0000174110) NOT biological process involved in interspecies interaction between organisms''' from PMID:15271989, based on the result that loss of cell wall mannosylphosphate in Candida albicans does not influence macrophage recognition. This NOT annotation would mean that MNN4 is never involved in any biological process involved in interspecies interaction between organisms, a conclusion that the data in the paper do not support.


==GO term==
==GO term==
A gene product can be annotated to zero or more terms from each ontology.
* A gene product can be annotated to zero or more terms from each ontology.
* Guidelines for annotating the different aspects of GO can be found in the [[Annotation#Ontology-Specific_Guidelines]] section.  
* Guidelines for certain specific topics are in the [[Annotation#Topic-Specific_Guidelines]] section.
* Guidelines for certain specific topics are in the [[Annotation#Topic-Specific_Guidelines]] section.


== AnnotationExtensions ==
== Annotation Extensions ==
[[Annotation_Extensions]]
* Annotation extensions may be added to GO annotations to provide additional contextual information for the assertion.
  TO REVIEW
* Annotation extensions are structured text that uses a relation from the Relations Ontology and an appropriate biological concept or entity to modify the GO annotation, e.g. nucleus 'part of' epithelial cell.
* Detailed documentation on curation using annotation extensions can be found here: [[Annotation_Extension]]


==Evidence==
==Evidence==
Line 50: Line 49:
== Assigned_by ==
== Assigned_by ==
Every annotation is marked with the name of the group that made the annotation.  
Every annotation is marked with the name of the group that made the annotation.  
The group that made the annotation may be different from the database that manages the identifiers and/or the annotation file.
The group that made the annotation may be different from the database that manages the identifiers and/or the annotation file.
 
 


== Date ==
The date the annotation was made or last edited, in YYYYMMDD format.


==Old wiki pages to review ==
== Review Status ==
*[[Beginning_Annotation_SOP]]
* [[Top level tree]] - This tree leads to all others. - Jennifer
* [[Electronic - Harold and Evelyn]]
* [[ISS - Pascale]] and Michelle
* [[Manual - Jennifer]]
* [[user categories]]
* [[Submit_GO_annotations]]
* [[Chain_of_Evidence]]


Last reviewed: 2022-04-19




[[Category: Annotation]]
[[Category: Annotation]]

Latest revision as of 08:17, 28 April 2022

This page describes the different annotation fields.

Elements of an annotation

Annotation Subject

  • Annotation subjects consist of valid database identifiers, such as WB:WBGene00003721, SGD:S000001048, or UniProtKB:P99999.
  • Annotation subjects may be genes or gene products (e.g. proteins, including specific isoforms; ncRNAs; and protein complexes).
  • The list of valid database prefixes can be found on the GO website.
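
As a rough illustration of the identifier format described above, the sketch below splits an annotation subject into its database prefix and local identifier. The helper name `split_subject` is hypothetical, and the check looks only at the shape of the string; it does not consult the list of valid prefixes on the GO website.

```python
def split_subject(curie: str) -> tuple[str, str]:
    """Split 'PREFIX:LOCAL_ID' at the first colon (hypothetical helper)."""
    prefix, sep, local_id = curie.partition(":")
    # Reject strings with no colon, an empty prefix, or an empty local part.
    if not sep or not prefix or not local_id:
        raise ValueError(f"not a prefixed identifier: {curie!r}")
    return prefix, local_id


# The three example subjects quoted in the text.
for subject in ("WB:WBGene00003721", "SGD:S000001048", "UniProtKB:P99999"):
    print(split_subject(subject))
```

A real pipeline would additionally compare the prefix against the published list of valid database prefixes.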

Relations

  • Annotation Subjects and GO terms are linked by a Relation from the Relations Ontology.
  • The relations applicable to each aspect of GO as well as usage guidelines can be found in the Annotation_Relations page.

Negation

  • The NOT statement indicates that the gene product does not enable a Molecular Function, is not part of a Biological Process or is not located in a specific Cellular Component.
  • NOT statements are only used when a user might expect that the gene product would have a specific biological property (MF, BP or CC).
  • Contrary to positive annotations, NOT statements propagate down the ontology, such that the annotation gene product NOT enables protein kinase activity means that the gene product does not enable protein serine/threonine kinase activity or protein tyrosine kinase activity either.
  • Both positive and NOT statements can be used in cases where there are conflicting experimental findings in the literature.
  • If an isoform has a different function from the main isoform represented by the gene-centric entity, a NOT annotation can be captured together with the isoform identifier.
  • NOT annotations can be supported by experiments that show the lack of an activity (or of involvement in a BP, or of location in a CC), or by sequence data showing that the gene product has lost essential residues and is unlikely to be able to carry out a function, participate in a process, or be found in a certain location. In the latter case, the evidence code is IKR (Inferred from Key Residues).
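
The downward propagation of NOT statements can be sketched with a toy hierarchy. The example below uses only the three kinase terms named in the text, with an assumed child-to-parent `IS_A` mapping; the function names are hypothetical, and a real implementation would load the full ontology rather than a hand-written dictionary.

```python
# Toy is_a hierarchy (child -> parents), limited to the terms named in the
# text. This mapping is an illustrative assumption, not the full ontology.
IS_A = {
    "protein serine/threonine kinase activity": ["protein kinase activity"],
    "protein tyrosine kinase activity": ["protein kinase activity"],
    "protein kinase activity": [],
}


def descendants(term, is_a=IS_A):
    """Return every term that reaches `term` by following is_a links upward."""
    found = set()
    frontier = [term]
    while frontier:
        current = frontier.pop()
        for child, parents in is_a.items():
            if current in parents and child not in found:
                found.add(child)
                frontier.append(child)
    return found


def implied_negations(negated_term):
    """A NOT annotation covers the term itself and all of its descendants."""
    return {negated_term} | descendants(negated_term)


print(sorted(implied_negations("protein kinase activity")))
```

Negating the parent term covers both child terms, whereas negating one child term says nothing about the parent or the sibling.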

When NOT to use negation

  • The NOT qualifier should not be used to capture every experimental result.
    • For example, in a subcellular localization experiment, locations where the gene product is not found should not be captured unless this is explicitly needed. If a protein is found in the nucleus but not in the mitochondrion, a NOT located in mitochondrion annotation would usually be inappropriate.
  • The NOT qualifier should not be used to annotate negative or inconclusive experimental results.
    • For example, if a mutant still develops a specific anatomical structure, this does not imply that the gene product does not contribute to the process; the experiment may simply not allow a positive conclusion to be drawn.
  • The NOT qualifier should not be used to describe experimental conditions or specific contexts in which the gene product is not active (i.e., it should not be used in combination with an extension).
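
One of the rules above, that NOT should not be combined with an extension, lends itself to a simple automated check. The sketch below is only an illustration: the `Annotation` record and its field names are hypothetical and do not correspond to any particular GO file format or tool.

```python
from dataclasses import dataclass, field


@dataclass
class Annotation:
    """Hypothetical in-memory record; field names are assumptions."""
    subject: str
    relation: str
    term: str
    negated: bool = False
    extensions: list[str] = field(default_factory=list)


def negation_warnings(annotation: Annotation) -> list[str]:
    """Flag a NOT annotation that also carries an annotation extension."""
    warnings = []
    if annotation.negated and annotation.extensions:
        warnings.append(
            "NOT should not be combined with an annotation extension"
        )
    return warnings


# Illustrative record only; it does not describe a real curated annotation.
example = Annotation(
    subject="UniProtKB:P99999",
    relation="located_in",
    term="mitochondrion",
    negated=True,
    extensions=["part_of(epithelial cell)"],
)
print(negation_warnings(example))
```

The other two rules depend on curator judgment about the experiment and cannot be checked from the record alone.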

Examples

  • MNN4 (CGD:CAL0000174110) NOT biological process involved in interspecies interaction between organisms from PMID:15271989, based on the result that loss of cell wall mannosylphosphate in Candida albicans does not influence macrophage recognition. This NOT annotation would mean that MNN4 is never involved in any biological process involved in interspecies interaction between organisms, a conclusion that the data in the paper do not support.

GO term

  • A gene product can be annotated to zero or more terms from each ontology.
  • Guidelines for certain specific topics are in the Annotation#Topic-Specific_Guidelines section.

Annotation Extensions

  • Annotation extensions may be added to GO annotations to provide additional contextual information for the assertion.
  • Annotation extensions are structured text that uses a relation from the Relations Ontology and an appropriate biological concept or entity to modify the GO annotation, e.g. nucleus 'part of' epithelial cell.
  • Detailed documentation on curation using annotation extensions can be found here: Annotation_Extension
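
To make the "relation plus entity" structure concrete, the sketch below parses one extension written in an assumed `relation(identifier)` textual form. The function name is hypothetical, and the identifier `CL:0000066` is used here on the assumption that it denotes epithelial cell in the Cell Ontology; consult the Annotation_Extension documentation for the authoritative syntax.

```python
import re

# Assumed textual form: relation(identifier), e.g. part_of(CL:0000066).
EXTENSION = re.compile(r"^(?P<relation>\w+)\((?P<filler>[^()]+)\)$")


def parse_extension(text: str) -> tuple[str, str]:
    """Return (relation, filler) for a single extension (hypothetical helper)."""
    match = EXTENSION.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised extension: {text!r}")
    return match.group("relation"), match.group("filler")


# The "nucleus 'part of' epithelial cell" example: the GO term (nucleus) is
# the annotation itself, and the extension adds the cell-type context.
print(parse_extension("part_of(CL:0000066)"))
```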

Evidence

Reference

Assigned_by

Every annotation is marked with the name of the group that made the annotation. The group that made the annotation may be different from the database that manages the identifiers and/or the annotation file.

Date

The date the annotation was made or last edited, in YYYYMMDD format.
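
A minimal sketch of reading the YYYYMMDD format with Python's standard library follows. The function name is hypothetical and the input value is an arbitrary example. The explicit length check is there because `strptime` on its own also accepts some shorter strings, such as a month written with a single digit.

```python
from datetime import date, datetime


def parse_annotation_date(text: str) -> date:
    """Parse a YYYYMMDD string into a date (hypothetical helper)."""
    # Require exactly eight digits before handing the string to strptime.
    if len(text) != 8 or not text.isdigit():
        raise ValueError(f"expected YYYYMMDD, got {text!r}")
    return datetime.strptime(text, "%Y%m%d").date()


print(parse_annotation_date("20220419").isoformat())
```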

Review Status

Last reviewed: 2022-04-19