Revision as of 09:37, 3 April 2019
This page describes the different fields found in GAF/GPAD files.
Elements of an annotation
Annotation Subject
- Annotation subjects consist of valid database identifiers, such as WB:WBGene00003721 or SGD:S000001048.
- Annotation subjects may be genes or gene products (e.g. proteins, including specific isoforms; ncRNAs; and protein complexes).
- The list of valid database prefixes can be found on the GO website.
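As a minimal sketch, an annotation subject identifier can be split into its database prefix and local identifier and checked against a list of accepted prefixes. The `KNOWN_PREFIXES` set and the `split_subject` helper below are hypothetical names introduced for illustration; the set holds only the two prefixes mentioned above and stands in for the full list on the GO website.

```python
# Hypothetical subset of valid database prefixes (assumption for this sketch);
# the authoritative list is the one published on the GO website.
KNOWN_PREFIXES = {"WB", "SGD"}


def split_subject(identifier: str) -> tuple[str, str]:
    """Split 'PREFIX:LOCAL_ID' into (prefix, local_id) and check the prefix."""
    prefix, sep, local_id = identifier.partition(":")
    if not sep or not prefix or not local_id:
        raise ValueError(f"not a PREFIX:ID identifier: {identifier!r}")
    if prefix not in KNOWN_PREFIXES:
        raise ValueError(f"unknown database prefix: {prefix!r}")
    return prefix, local_id


print(split_subject("WB:WBGene00003721"))
```

Splitting on the first colon only keeps any further colons inside the local identifier intact.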
Relations
- Annotation Subjects and GO terms are linked by a Relation from the Relations Ontology (note that this information is unavailable in GAF files, but explicit in GPAD files). Guidelines for usage of Relations can be found on the Annotation_Relations page.
- The specific relations applicable to each aspect of GO are also listed on the Annotation_Relations page.
Negation
- NOT is used to make an annotation statement that the gene product is not associated with the GO term.
- When combined with an explicit annotation relation, e.g. enables, the NOT qualifier indicates that the gene product does not have that relationship to the GO term.
- NOT may be used with terms from any of the three ontologies.
In practice, the NOT qualifier is used in two ways:
- When a GO term might otherwise be expected to apply to a gene product, but an experiment, sequence analysis, etc. demonstrates otherwise.
- When there are conflicting experimental findings in the literature and curators would like to accurately capture all relevant data.
Use of the NOT qualifier is particularly important in cases where associating a GO term with a gene product should be avoided (but might otherwise be made, especially by an automated method). For example, if a protein has sequence similarity to an enzyme (whose activity is represented as Molecular Function GO:nnnnnnn), but has been shown experimentally not to have the enzymatic activity, it can be annotated as NOT GO:nnnnnnn.
In phylogenetic-based annotation, i.e. PAINT, the NOT qualifier is used in conjunction with the IKR (Inferred from Key Residue) evidence code. Here, NOT is used to annotate a gene product when, although homologous to a particular protein family, it has lost essential residues and is very unlikely to be able to carry out an associated function, participate in the expected associated process, or be found in a certain location.
The NOT qualifier is not used to annotate negative or inconclusive experimental results.
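A minimal sketch of how software might read the NOT qualifier alongside a relation. It assumes the qualifier arrives as a pipe-separated string such as `NOT|enables`; that layout is an assumption here and should be checked against the specification of the file being read. `parse_qualifier` is a hypothetical helper, not part of any GO tool.

```python
def parse_qualifier(field: str) -> tuple[bool, list[str]]:
    """Return (is_negated, relations) from a pipe-separated qualifier field.

    Assumed input layout: zero or more tokens separated by '|',
    where the token NOT marks a negated annotation.
    """
    parts = [p.strip() for p in field.split("|") if p.strip()]
    is_negated = any(p.upper() == "NOT" for p in parts)
    relations = [p for p in parts if p.upper() != "NOT"]
    return is_negated, relations


negated, relations = parse_qualifier("NOT|enables")
print(negated, relations)
```

Keeping negation as a separate boolean, rather than folding it into the relation name, makes it harder for downstream code to treat a NOT annotation as a positive one by accident.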
GO term
A gene product can be annotated to zero or more terms from each ontology.
- Guidelines for annotating the different aspects of GO can be found in the Annotation#Ontology-Specific_Guidelines section.
- Guidelines for certain specific topics are in the Annotation#Topic-Specific_Guidelines section.
Annotation Extensions
- Annotation extensions may be added to GO annotations to provide additional contextual information for the assertion.
- Annotation extensions are structured text that uses a relation from the Relations Ontology and an appropriate biological concept or entity to modify the GO annotation, e.g. nucleus 'part of' epithelial cell.
- Detailed documentation on curation using annotation extensions can be found here: Annotation_Extensions
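The structured form of an annotation extension can be sketched as a small parser. This assumes the `relation(IDENTIFIER)` syntax, with commas joining extensions that apply together and pipes separating independent alternatives; confirm the exact syntax in the Annotation_Extensions documentation. `parse_extensions` is a hypothetical helper, and the identifiers in the usage line are illustrative only.

```python
import re

# Assumed syntax for one extension: relation(IDENTIFIER)
_EXTENSION = re.compile(r"^\s*(\w+)\(([^()]+)\)\s*$")


def parse_extensions(field: str) -> list[list[tuple[str, str]]]:
    """Parse an extension field into groups of (relation, identifier) pairs.

    Outer list: pipe-separated alternatives.
    Inner list: comma-separated extensions that apply together.
    """
    groups = []
    for group in field.split("|"):
        if not group.strip():
            continue
        parsed = []
        for item in group.split(","):
            match = _EXTENSION.match(item)
            if match is None:
                raise ValueError(f"malformed extension: {item!r}")
            parsed.append((match.group(1), match.group(2).strip()))
        groups.append(parsed)
    return groups


# Illustrative input: an annotation to 'nucleus' qualified by a cell type.
print(parse_extensions("part_of(CL:0000066)"))
```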
Evidence
- Each annotation must indicate what kind of evidence supports the association between the gene product and the GO term.
- Annotations use Evidence Codes from the Evidence and Conclusions Ontology.
- See the Guide to GO Evidence Codes for details.
Reference
- Every annotation must also cite a reference, which may be either a publication or an internal GO reference (https://github.com/geneontology/go-site/tree/master/metadata/gorefs).
Assigned_by
Every annotation is marked with the name of the group that made the annotation. This group may be different from the database that manages the identifiers and/or the annotation file.
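To summarise how the elements on this page fit together, here is a minimal sketch of an annotation as a record. The class and field names are hypothetical and do not correspond to GAF or GPAD column names; the reference value in the usage example is a made-up placeholder, and the other example values are illustrative only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Annotation:
    """One annotation; field names are illustrative, not file column names."""

    subject: str                     # e.g. "SGD:S000001048"
    go_term: str                     # GO identifier
    evidence: str                    # evidence code
    reference: str                   # publication or internal GO reference
    assigned_by: str                 # group that made the annotation
    relation: Optional[str] = None   # may be absent (not available in GAF)
    negated: bool = False            # True when the NOT qualifier is present
    extensions: tuple = ()           # optional annotation extensions

    def __post_init__(self) -> None:
        # Evidence and reference are mandatory for every annotation.
        for name in ("subject", "go_term", "evidence", "reference", "assigned_by"):
            if not getattr(self, name):
                raise ValueError(f"{name} is required")


example = Annotation(
    subject="SGD:S000001048",
    go_term="GO:0005634",
    evidence="IDA",
    reference="PMID:0000000",  # placeholder, not a real citation
    assigned_by="SGD",
)
print(example)
```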