Variant annotation

From GO Wiki
Revision as of 15:32, 12 November 2007 by Vpetri (talk | contribs) (Rattus norvegicus)

Arabidopsis thaliana

Caenorhabditis elegans

Danio rerio

We almost never have enough information to curate at the level of a splice variant. Our annotations are applied at the level of the gene.

Dictyostelium discoideum

So far we have only a few genes and publications that describe splice variants, and none of those papers describe different functions for the different variants. Hence, we currently do not capture annotations to different variants of gene products.

Drosophila melanogaster

Escherichia coli

Gallus gallus

We are using UniProtKB accession IDs wherever possible and this allows us to annotate specific isoforms if required.

Homo sapiens

The human group annotates to UniProtKB accessions. When a paper provides isoform-specific information, this data can be captured using the appropriate UniProt isoform ID, e.g. Q4VCS5-1, Q4VCS5-2. When isoform-specific information is not provided, the annotation is made only to the top-level UniProt accession number, e.g. Q4VCS5.
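The distinction between a top-level accession and an isoform ID can be sketched in code. The following is a minimal illustration, not part of any annotation pipeline: the function name split_isoform_id and the regular expression are assumptions, and the pattern only checks the general "accession, optional hyphen, number" shape seen in the examples above rather than the full UniProtKB accession format.

```python
import re

# Assumed shape: an accession made of capital letters and digits,
# optionally followed by "-<isoform number>" (e.g. Q4VCS5-1).
ISOFORM_RE = re.compile(r"^(?P<accession>[A-Z0-9]+)(?:-(?P<isoform>\d+))?$")


def split_isoform_id(identifier):
    """Return (accession, isoform_number); isoform_number is None
    when the identifier is a top-level accession."""
    match = ISOFORM_RE.match(identifier.strip())
    if match is None:
        raise ValueError(f"unrecognised identifier: {identifier!r}")
    isoform = match.group("isoform")
    return match.group("accession"), (int(isoform) if isoform else None)
```

With this sketch, Q4VCS5-2 splits into the accession Q4VCS5 and isoform number 2, while Q4VCS5 alone yields no isoform number, which corresponds to a gene-product-level annotation.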

Mus musculus

For each annotation, MGI has a "notes field" that is not available to the public. The note is structured as follows:

evidence:
anatomy:
cell type:
gene product:
qualifier:
target:
external ref:
text:

If a paper specifies a particular isoform, the appropriate RefSeq is entered into the "gene_product" field.

E.g., for the annotation of MGI:1341722, Kcnh2, to GO:0005886, plasma membrane, by IDA, the field would look like:

gene_product:SPKW:O35219-1
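As an illustration of how such a structured note could be read, here is a minimal sketch. It is not MGI's actual tooling: the function name parse_note is hypothetical, and it assumes one "field:value" pair per line and that "gene_product" and "gene product" refer to the same field.

```python
# Field names taken from the note template above.
NOTE_FIELDS = (
    "evidence", "anatomy", "cell type", "gene product",
    "qualifier", "target", "external ref", "text",
)


def parse_note(note):
    """Parse a structured note into a dict of field name -> value.

    Assumes one "field:value" pair per line; lines that do not start
    with a known field name are ignored.
    """
    fields = {}
    for line in note.splitlines():
        if ":" not in line:
            continue
        # Split on the first colon only, so values such as
        # "SPKW:O35219-1" keep their own colon.
        key, value = line.split(":", 1)
        # Assumption: underscores and spaces in field names are equivalent.
        key = key.strip().lower().replace("_", " ")
        if key in NOTE_FIELDS:
            fields[key] = value.strip()
    return fields
```

Applied to the Kcnh2 example, the "gene product" entry of the result would hold SPKW:O35219-1, and fields left blank in the note come back as empty strings.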

We presently have only about 300 of these with experimental evidence codes, all annotated after the adoption of the structured notes, and some of them still need QC. Annotations made before that have no entry, as we had no way of capturing the data. We are looking at ways to "back annotate" by identifying genes with multiple isoforms described in references that have been used for GO annotation at MGI.

Rattus norvegicus

There are currently few splice variants in the database. Each one has its own DB:ID and takes the symbol of the parent gene followed by an underscore and a variant number (v plus a number), then "variant of" and the parent symbol in parentheses, with the parent symbol hyperlinked to the report page of the parent gene. Example: geneX_v1 (variant of geneX). The variants can also be accessed from the top-level gene. Where applicable, the variants may have mapping, sequence, and other external database links. They seldom have annotations. The literature occasionally provides enough information to annotate a splice variant, but that is rather rare.
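The naming convention can be sketched as follows. This is only an illustration based on the geneX_v1 example; the function names are hypothetical and nothing here reflects how the database itself generates or stores symbols.

```python
import re

# Assumed shape of a variant symbol: <parent symbol>_v<number>.
VARIANT_RE = re.compile(r"^(?P<parent>.+)_v(?P<number>\d+)$")


def variant_symbol(parent_symbol, number):
    """Build the variant symbol, e.g. geneX_v1."""
    return f"{parent_symbol}_v{number}"


def display_label(parent_symbol, number):
    """Build the full label, e.g. 'geneX_v1 (variant of geneX)'."""
    return f"{variant_symbol(parent_symbol, number)} (variant of {parent_symbol})"


def parent_of(symbol):
    """Return the parent gene symbol, or None if the symbol does not
    follow the variant naming pattern."""
    match = VARIANT_RE.match(symbol)
    return match.group("parent") if match else None
```

Going from a variant symbol back to its parent with parent_of mirrors the link from a variant to the report page of the parent gene.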

Saccharomyces cerevisiae

Schizosaccharomyces pombe