Reference Genome Annotation Project Summary
GO Reference Genome Annotation Team Report
26th April 2010
This document outlines the plans for the GO annotation team for Reference Genomes.
Purpose of the Reference Genome Annotation Group
The group's purpose is the complete annotation of “reference genomes”, that is, organisms for which there is primary experimental data. Complete annotation has two aspects: annotating all genes with experimental data (breadth), and annotating each gene product to the maximal feasible quality (depth). There are currently twelve important model organisms for which expert biocurators provide annotations: Arabidopsis thaliana (Taxon ID: 3702), Caenorhabditis elegans (Taxon ID: 6239), Escherichia coli (Taxon ID: 511145), Danio rerio (Taxon ID: 7955), Dictyostelium discoideum (Taxon ID: 44689), Drosophila melanogaster (Taxon ID: 7227), Gallus gallus (Taxon ID: 9031), Homo sapiens (Taxon ID: 9606), Mus musculus (Taxon ID: 10090), Rattus norvegicus (Taxon ID: 10116), Saccharomyces cerevisiae (Taxon ID: 4932), and Schizosaccharomyces pombe (Taxon ID: 4896).
The Reference Genome GO Annotation Team, which includes representatives from each MOD annotation group, will facilitate implementation of GO Consortium annotation standards by providing regular feedback reports to the annotation groups. The team will coordinate annotation by determining priorities and selecting annotation target sets on a regular basis. It will measure the rate of progress toward the goal of broad and deep annotation of the reference genomes, and devise improvements to the annotation strategies that make the overall process more efficient. This group centralizes the review of annotations by MODs within the GO Consortium and provides key input on annotation standards.
Pascale Gaudet and Kara Dolinski
Responsibilities and Deliverables
- Determine annotation priorities and select appropriate annotation target sets for literature curation by the MODs.
- Coordinate the development of annotation topic meetings, taking suggestions from all GOC members and soliciting the participation of external experts.
- Monitor progress towards comprehensive biological coverage and, where necessary, devise operational changes and recommendations to make the end-to-end process more efficient.
- Annotate by inference from literature-based and experimentally based annotations, and deliver valid GAF files to the MODs on a regular basis.
- Provide written recommendations to the annotation managers to improve consistency in literature-curation.
The working group consists of the reference genome managers and GO curators for each of the reference genomes. It also includes the Panther DB group for generation of phylogenetic trees and developers of the PAINT software.
The GO annotation team holds monthly phone conferences on the second Tuesday of each month. Communication also occurs via email and through face-to-face meetings at the GO Annotation Camp and the GO meeting.
Metrics of success
The primary metric of success will be the progress toward the goal of complete annotation of reference genomes. This is measured in terms of number of families annotated and number of gene products annotated.
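As an illustration of the second measure, the number of gene products annotated could be tallied per organism from GAF files roughly as follows. This is a minimal sketch, not the group's actual tooling: it assumes tab-separated GAF lines with at least 15 columns, comment lines starting with "!", the database and object ID in columns 1 and 2, and the taxon in column 13 in the form "taxon:NNNN". The function name and all identifiers in the sample lines are hypothetical placeholders.

```python
from collections import defaultdict


def count_annotated_gene_products(gaf_lines):
    """Return {taxon_id: number of distinct gene products with >= 1 annotation}.

    Assumes GAF-style tab-separated lines (illustrative sketch only).
    """
    products = defaultdict(set)
    for line in gaf_lines:
        # Skip blank lines and "!" comment/header lines.
        if not line.strip() or line.startswith("!"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 15:
            continue  # not a well-formed annotation line
        db, object_id, taxon_field = cols[0], cols[1], cols[12]
        # Column 13 may hold two taxa separated by "|"; the first is the
        # organism that the gene product belongs to.
        taxon = taxon_field.split("|")[0].split(":", 1)[-1]
        products[taxon].add((db, object_id))
    return {taxon: len(ids) for taxon, ids in products.items()}


def make_line(db, object_id, taxon):
    """Build a placeholder 15-column annotation line (made-up values)."""
    cols = [db, object_id, "SYMBOL", "", "GO:0000000", "PMID:0000000", "IDA",
            "", "P", "", "", "gene", "taxon:" + taxon, "20100426", db]
    return "\t".join(cols)


if __name__ == "__main__":
    sample = [
        "!gaf-version: 2.0",
        make_line("DB_A", "ID0001", "4932"),
        make_line("DB_A", "ID0001", "4932"),  # second annotation, same product
        make_line("DB_A", "ID0002", "4932"),
        make_line("DB_B", "ID0003", "9606"),
    ]
    print(count_annotated_gene_products(sample))
```

Counting distinct (database, object ID) pairs rather than lines keeps a gene product with many annotations from being counted more than once. Counting annotated families would additionally require a mapping from gene products to families, which is not sketched here.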
- Ontology: Provide suggestions for content development; provide feedback on the ontology where the ontology structure (terms, definitions or relationships) makes annotations ambiguous.
- Annotation: Provide the Reference Genome group with quality assurance metrics for annotation consistency. Advise on community user requirements specific to reference genome data.
- Infrastructure: Provide feedback on PAINT, database tracking.
- External: Liaise with external experts, Reactome, and UniProtKB.
- Conference call minutes and monthly progress reports are public: Reference_Genome_Annotation_Project#Progress_Reports
- PANTHER version 7: improved phylogenetic trees, orthologs and collaboration with the Gene Ontology Consortium. Mi H, Dong Q, Muruganujan A, Gaudet P, Lewis S, Thomas PD. Nucleic Acids Res. 2010 Jan;38(Database issue):D204-10. Epub 2009 Dec 16. PMID 20015972
- The Gene Ontology's Reference Genome Project: a unified framework for functional annotation across species. Reference Genome Group of the Gene Ontology Consortium. PLoS Comput Biol. 2009 Jul;5(7):e1000431. Epub 2009 Jul 3. PMID 19578431
Posters and presentations
- The Gene Ontology's Reference Genome Project. Poster presented by Pascale Gaudet at the International Dicty Meeting, Estes Park, CO, September 2009
- The Gene Ontology's Reference Genome Project: A Unified Framework for Functional Annotation across Species. Talk presented by Pascale Gaudet at the Quest for Orthologs meeting, Hinxton, UK, July 2009.
- PAINT: Protein Annotation Inferencing Tool. Talk presented by Suzanna Lewis at the Quest for Orthologs meeting, Hinxton, UK, July 2009.
- PAINTING GO ANNOTATIONS ACROSS GENOMES. Poster presented by Mike Livstone at the CSHL Genome Informatics meeting, 2009
- The reference genome project. Talk presented by Pascale Gaudet at the 3rd International Biocuration Conference, Berlin, Germany, April 2009.
- The GO Reference Genome Annotation Project. Poster presented by Susan Tweedie at the 2nd International Biocuration Conference, San Jose, CA, October 2007.