Reference Genome Annotation Project
Goal of the Reference Genome Annotation Project
The GO Consortium has established the complete annotation of twelve reference genomes as a priority goal. These reference genomes are:
Arabidopsis thaliana, Caenorhabditis elegans, Danio rerio, Dictyostelium discoideum, Drosophila melanogaster, Escherichia coli, Gallus gallus, Homo sapiens, Mus musculus, Rattus norvegicus, Saccharomyces cerevisiae, Schizosaccharomyces pombe
The Reference Genome GO Annotation Team, with representatives from each genome annotation group, will coordinate the annotation of the twelve reference genomes, facilitate implementation of GO Consortium annotation priorities, and provide metrics to assess progress toward the goal of broad and deep annotation of the reference genomes. This group represents the annotation expertise within the GO Consortium and provides key liaisons to the model organism databases that have primary responsibility for the annotation of the reference genomes.
More information
Reference Genome Annotation Project Summary
Reference_Genome Contact Persons from each database
Progress Reports
- March 2010 RefGen Reference_Genome_March_2010
- Progress report for GO SAB meeting April 1, 2010 Reference_Genome_SAB_04_2010
- February 2010 RefGen Reference_Genome_February_2010
- January 2010 RefGen Reference_Genome_January_2010
- December 2009 RefGen Reference_Genome_December_2009
- November 2009 RefGen Reference_Genome_November_2009
- October 2009 RefGen Reference_Genome_October_2009
- September 2009 RefGen Reference_Genome_September_2009
- July 2009 RefGen RefGenProgress_2009-07
- June 2009 RefGen RefGenProgress_2009-06
- May 2009 RefGen RefGenProgress_2009-05
- April 2009 RefGen RefGenProgress_2009-04
- April 1, 2009 Third Reference Genome Annotation Meeting, Eugene, OR Agenda and Minutes
- March 2009 RefGen RefGenProgress_2009-03
- February 2009 RefGen RefGenProgress_2009-02
- September 2008 RefGen RefGenProgress_2008-09-10
- August 2008 RefGen RefGenProgress_2008-08-13
- July 2008 RefGen RefGenProgress_2008-07-18
- June 2008 RefGen RefGenProgress_2008-06-18
- May 2008 RefGen RefGenProgress_2008-06-04
- April 20, 2008 Second Reference Genome Annotation Meeting, Salt Lake City, UT [Minutes]
- September 27, 2007 First Reference Genome Annotation Meeting, Princeton, NJ [Minutes]
Communication
Reference Genome Mailing list
Conference Calls
Meetings
Electronic jamborees
Annotation Targets
Suggestions for RefG annotation targets
Panther gene lists
From May 2008
Target Gene List (May 2008-)
Target Gene List August 2006-April 2008
- Access requires your email to be added to the system. Email Pascale if you would like to be added.
- This spreadsheet contains links to separate spreadsheets maintained by each of the reference genome groups.
Procedure for selection of target genes
Procedure for filling Genome-Specific spreadsheets
GAFs for trees-based annotations
Standard Operating Procedure for Tree-based propagation of annotations
Annotation_pipeline
By Judy, Suzi, Michael
Annotation Issues and Management
Annotation Progress
Gene Annotation wiki pages
- The purpose of these pages is to allow discussion of annotation and orthology issues related to particular genes. Individual gene pages are created as needed.
Lung Development related Gene Annotation Progress
- November 2009 lung development genes
- December 2009 lung branching morphogenesis genes
- January/February 2010_lung targets lung branching morphogenesis, continued
Graphical views of the annotations:
Selected refG target sets
- PPOD clusters selected since April 2008
- Manually curated target sets selected before April 2008
All PPOD clusters with at least one object from each of the twelve refG organisms
Reference Genomes Metrics | Metrics: Discussion on annotation progress measurements
Orthology determination
List of potentially problematic families for all vs. all BLAST methods of orthology determination
Data used to make orthology calls
reference proteomes files
At the July 2009 Quest for Orthologs meeting, participants agreed to settle on a standard set of genomes and to compile, for each genome, a "complete" set of protein-coding genes with a representative protein sequence for each gene.
New gene2geneproduct file
At the April 2009 Reference Genome meeting it was decided to create a new file to replace the GP2protein file, called 'gene2geneproduct'. Specifications can be found on this page (will be added soon).
GAF file 2.0
The GAF file should contain 17 columns, and the meanings of columns 2, 12 and 17 have been modified. See that page for specifications.
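As a rough illustration of the 17-column requirement, the sketch below checks the column count of a single GAF line. It assumes the usual GAF layout (tab-delimited fields, comment lines beginning with "!"); the function name check_gaf_line is hypothetical and is not part of any GO tool, and the check does not validate the contents of individual columns.

```python
GAF2_COLUMN_COUNT = 17  # number of columns stated for GAF 2.0


def check_gaf_line(line):
    """Return the fields of an annotation line, or None for comment/blank lines.

    Hypothetical helper for illustration only. Assumes tab-delimited fields
    and "!"-prefixed comment lines. Raises ValueError on a wrong column count.
    """
    # Strip only the line ending so that empty trailing columns are preserved.
    stripped = line.rstrip("\r\n")
    if not stripped or stripped.startswith("!"):
        return None
    fields = stripped.split("\t")
    if len(fields) != GAF2_COLUMN_COUNT:
        raise ValueError(
            "expected %d columns, found %d" % (GAF2_COLUMN_COUNT, len(fields))
        )
    return fields
```

A file could be screened by calling the function on each line and collecting the line numbers that raise ValueError; files still in the older, shorter format would be flagged on every annotation line.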
Data used for Running P-POD orthology tool on the reference genomes gene set
By Kara Dolinski at Princeton, November 2007
- This page contains a description of the project and the requirements for providing files for the P-POD analysis.
GFF3 sequence files for reference genome MODs
Reference_Genome_sequence_annotation
Software/database development
The purpose of this page is to discuss features and requirements that would be desirable in a database used to replace the existing Google Spreadsheet system for managing target genes, their annotations and metrics.
Branding Ref.Genome Project
Ideas for publicizing Ref.Genome Annotation Data
Retired Pages
These pages are kept for reference, but the information in them is no longer current.
Tools for orthology determination
A summary of tools available to identify orthologs.
SOP for determining ortholog (by database)
- The purpose of this page was to discuss the method by which each group establishes orthology between reference genome genes and human disease genes.
We now collaborate with PANTHER and P-POD to provide that. (Issues are different)
Reference Genome Web Page Draft
- We now have a real web page!
