Difference between revisions of "Reference Genome sequence annotation (Retired)"

From GO Wiki
(24 intermediate revisions by 10 users not shown)
Line 1: Line 1:
 +
[[Category:PAINT Archived]]
 
The Reference Genome initiative will foster [http://www.sequenceontology.org SO] compliant annotations. The sequences will be available using the file format  [http://www.sequenceontology.org/gff3.shtml GFF3].
 
The Reference Genome initiative will foster [http://www.sequenceontology.org SO] compliant annotations. The sequences will be available using the file format  [http://www.sequenceontology.org/gff3.shtml GFF3].
  
 
+
For discussion on standardizing URLs for accessing this information please see the GMOD wiki page
 +
[http://www.gmod.org/wiki/index.php/Standard_URL Standard URL]
  
 
{| border="1"
 
{| border="1"
|+align="bottom" style="color:#e76700;"|''Where to find SO compliant GFF3 annotations for the Reference Genome sequences.''
+
|+align="bottom" style="color:#e76700;"|''Where to find SO compliant GFF3 annotations for the Reference Genome sequences. ( * means that the presented file is not yet SO compliant)''
 
|'''Organism'''
 
|'''Organism'''
 
|'''Organization'''
 
|'''Organization'''
Line 23: Line 25:
 
|[http://www.yeastgenome.org SGD]
 
|[http://www.yeastgenome.org SGD]
 
|ftp://genome-ftp.stanford.edu/pub/yeast/chromosomal_feature/saccharomyces_cerevisiae.gff
 
|ftp://genome-ftp.stanford.edu/pub/yeast/chromosomal_feature/saccharomyces_cerevisiae.gff
|?
+
|updated nightly
 
|-
 
|-
 
|Dictyostelium    (cellular slime mold)
 
|Dictyostelium    (cellular slime mold)
 
|[http://www.dictybase.org dictyBase]
 
|[http://www.dictybase.org dictyBase]
 
|http://www.dictybase.org/db/cgi-bin/dictyBase/download/download.pl?area=gff3&ID=dicty_gff3.zip
 
|http://www.dictybase.org/db/cgi-bin/dictyBase/download/download.pl?area=gff3&ID=dicty_gff3.zip
|?
+
|updated weekly
 
|-
 
|-
 
|Arabidopsis thaliana
 
|Arabidopsis thaliana
Line 36: Line 38:
 
|-
 
|-
 
|Danio rerio (Zebrafish)
 
|Danio rerio (Zebrafish)
|[http://zfin.org/cgi-bin/webdriver?MIval=aa-ZDB_home.apg ZFIN]
+
|[http://zfin.org ZFIN] [http://www.sanger.ac.uk Sanger Institute]
|?
+
|Vega:  ftp://ftp.sanger.ac.uk/pub/vega/danio/gff3/vega_danio_rerio_20070803.gff3 <br> Ensembl*: ftp://ftp.ensembl.org/pub/current_gtf/Danio_rerio.ZFISH7.47.gtf.gz
|?
+
|With new releases
 
|-
 
|-
 
|Mouse
 
|Mouse
Line 46: Line 48:
 
|-
 
|-
 
|Human
 
|Human
|?
+
|[http://www.ebi.ac.uk/GOA/ GOA]
|?
+
| ftp://ftp.ensembl.org/pub/current_gtf/Homo_sapiens.NCBI36.47.gtf.gz
 
|?
 
|?
 
|-
 
|-
 
|Schizosaccharomyces pombe (Fission yeast)
 
|Schizosaccharomyces pombe (Fission yeast)
 
|[http://www.sanger.ac.uk/Projects/S_pombe/ Sanger Centre]
 
|[http://www.sanger.ac.uk/Projects/S_pombe/ Sanger Centre]
|ftp://ftp.sanger.ac.uk/pub/yeast/pombe/GFF/
+
| * ftp://ftp.sanger.ac.uk/pub/yeast/pombe/GFF/
|3/16/07
+
|7/17/08
 
|-
 
|-
 
|E.coli
 
|E.coli
Line 62: Line 64:
 
|Rat
 
|Rat
 
|[http://rgd.mcw.edu/ RGD]
 
|[http://rgd.mcw.edu/ RGD]
|?
+
|ftp://ftp.ensembl.org/pub/current_gtf/Rattus_norvegicus.RGSC3.4.47.gtf.gz
 
|?
 
|?
 
|-
 
|-
 
|}
 
|}
 +
 +
== Nota bene ==
 +
# The human and zebrafish Ensembl data is in GTF (not GFF3)
 +
# The rat Ensembl data (link provided) is also in GTF format
 +
# The pombe files are not (yet) valid GFF3. The known problems are:
 +
#* extra column 10 "Name"
 +
#* extra column 11 "orf_classification"
 +
#* extra column 12 "gene"
 +
#* extra column 13 "chr"
 +
#* the mandatory "phase" column isn't filled in.
 +
#* the "attributes" column may not be formatted correctly.
 +
 +
Back to: [[Reference_Genome_Focus]]

Latest revision as of 07:16, 12 April 2019

The Reference Genome initiative will foster Sequence Ontology (SO) compliant annotations. The sequences will be available in the GFF3 file format.

For discussion on standardizing URLs for accessing this information, please see the GMOD wiki page Standard URL.

{| border="1"
|+ ''Where to find SO compliant GFF3 annotations for the Reference Genome sequences. ( * means that the presented file is not yet SO compliant)''
|-
|'''Organism'''
|'''Organization'''
|'''Download'''
|'''Date'''
|-
|Drosophila melanogaster (Fruitfly)
|FlyBase
|ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/current/gff
|9/12/07
|-
|Caenorhabditis elegans (Worm)
|WormBase
|ftp://ftp.wormbase.org/pub/wormbase/genomes/elegans/genome_feature_tables/GFF3
|?
|-
|Saccharomyces cerevisiae (Budding yeast)
|SGD
|ftp://genome-ftp.stanford.edu/pub/yeast/chromosomal_feature/saccharomyces_cerevisiae.gff
|updated nightly
|-
|Dictyostelium (cellular slime mold)
|dictyBase
|http://www.dictybase.org/db/cgi-bin/dictyBase/download/download.pl?area=gff3&ID=dicty_gff3.zip
|updated weekly
|-
|Arabidopsis thaliana
|TAIR
|ftp://ftp.arabidopsis.org/home/tair/Genes/TAIR7_genome_release/TAIR7_gff3
|8/15/07
|-
|Danio rerio (Zebrafish)
|ZFIN, Sanger Institute
|Vega: ftp://ftp.sanger.ac.uk/pub/vega/danio/gff3/vega_danio_rerio_20070803.gff3 <br> Ensembl*: ftp://ftp.ensembl.org/pub/current_gtf/Danio_rerio.ZFISH7.47.gtf.gz
|With new releases
|-
|Mouse
|MGI
|?
|?
|-
|Human
|GOA
|ftp://ftp.ensembl.org/pub/current_gtf/Homo_sapiens.NCBI36.47.gtf.gz
|?
|-
|Schizosaccharomyces pombe (Fission yeast)
|Sanger Centre
| * ftp://ftp.sanger.ac.uk/pub/yeast/pombe/GFF/
|7/17/08
|-
|E.coli
|ASAP
|https://asap.ahabs.wisc.edu/asap/download_gff3.php?LocationID=&SequenceVersionID=&GenomeID=
|?
|-
|Rat
|RGD
|ftp://ftp.ensembl.org/pub/current_gtf/Rattus_norvegicus.RGSC3.4.47.gtf.gz
|?
|}

Nota bene

  1. The human and zebrafish Ensembl data is in GTF (not GFF3)
  2. The rat Ensembl data (link provided) is also in GTF format
  3. The pombe files are not (yet) valid GFF3. The known problems are:
    • extra column 10 "Name"
    • extra column 11 "orf_classification"
    • extra column 12 "gene"
    • extra column 13 "chr"
    • the mandatory "phase" column isn't filled in.
    • the "attributes" column may not be formatted correctly.
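
The problems listed above can be checked mechanically. The following is a minimal sketch, not a full GFF3 validator: it assumes tab-delimited feature lines, the nine-column layout of GFF3, phase values of 0, 1, 2 or ".", and semicolon-separated tag=value attributes. The function name check_gff3_line and the sample lines are hypothetical and were not taken from the pombe files.

```python
import re

# Phase must be 0, 1 or 2 for CDS features; "." is used for other features.
VALID_PHASES = {".", "0", "1", "2"}
# GFF3 attributes are semicolon-separated tag=value pairs.
ATTRIBUTE_RE = re.compile(r"^[^=;]+=[^;]*$")


def check_gff3_line(line):
    """Return a list of problems found in one tab-delimited feature line."""
    problems = []
    columns = line.rstrip("\n").split("\t")
    if len(columns) != 9:
        problems.append(f"expected 9 columns, found {len(columns)}")
    if len(columns) < 9:
        return problems  # too short to inspect phase or attributes
    feature_type, phase, attributes = columns[2], columns[7], columns[8]
    if phase not in VALID_PHASES:
        problems.append(f"invalid phase {phase!r}")
    elif feature_type == "CDS" and phase == ".":
        problems.append("CDS feature has no phase")
    if attributes != ".":
        # Tolerate a trailing semicolon, then test each pair separately.
        for pair in attributes.strip(";").split(";"):
            if not ATTRIBUTE_RE.match(pair.strip()):
                problems.append(f"malformed attribute {pair.strip()!r}")
    return problems


if __name__ == "__main__":
    # Hypothetical lines: one well-formed, one with four extra columns,
    # an empty phase field, and an attribute that is not tag=value.
    good = "chr1\tsrc\tCDS\t100\t200\t.\t+\t0\tID=cds1;Parent=mRNA1"
    bad = "chr1\tsrc\tCDS\t100\t200\t.\t+\t\tgene SPAC1\tn\tc\tg\tchr1"
    for sample in (good, bad):
        print(check_gff3_line(sample))
```

Because GTF attributes are written as tag "value" pairs rather than tag=value, the same check also flags the attribute column of a GTF line, which gives a rough way to spot the Ensembl GTF files mentioned in notes 1 and 2.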

Back to: Reference_Genome_Focus