Reference Genome Annotation Project Summary: Difference between revisions

From GO Wiki
 
(14 intermediate revisions by 5 users not shown)
[[Category:Reports]]
GO Reference Genome Annotation Team Report  
25nd April 2006
This report outlines the plans for the GO annotation team for Reference Genomes.
==Purpose==
;What is the group’s purpose?
:The GO consortium has established the complete annotation of nine reference genomes as a priority goal.  These reference genomes are <i>Escherichia coli</i>, <i>Homo sapiens</i>, <i>Drosophila melanogaster</i>, <i>Caenorhabditis elegans</i>, <i>Saccharomyces cerevisiae</i>, <i>Arabidopsis thaliana</i>, <i>Danio rerio</i>, <i>Dictyostelium discoideum</i>, and <i>Mus musculus</i>.  The Reference Genome GO Annotation Team, with representatives from each genome annotation group, will coordinate annotation, facilitate implementation of GO Consortium annotation priorities, and provide metrics to assess progress toward the goal of broad and deep annotation of the reference genomes.
;What makes this group necessary and unique?
:This group will be responsible for coordinating the annotation of the nine reference genomes. It represents the annotation expertise within the GO consortium and provides key liaisons to the model organism databases that have primary responsibility for annotating the reference genomes.
;What is the lifespan?
:The availability of fully annotated genomes is essential for the transfer of GO annotations to other genomes. The reference genome working group will be active  as long as the complete annotation of reference genomes remains a priority for the GO Consortium.


==Group Leader==
26th April 2010
The Coordinator of Reference Genome Annotation will coordinate the activities of the group. Rex Chisholm is currently assigned this role.  Coordination will occur through regular phone conferences and through working group meetings coincident with other Consortium activities (Annotation Camps, GOC meetings). The working group will also interface with the software development working group to develop tracking tools to aid in monitoring progress in Reference Genome annotation.
 
==Activities==
This document outlines the plans for the GO annotation team for Reference Genomes.
;What are the key deliverables of this group?
 
*Perform broad and deep annotation of the nine reference genomes as described in Section A1.
==Purpose of the Reference Genome Annotation Group==
*Establish metrics that enable monitoring progress toward the goal of broad and deep annotation.
The group's purpose is the complete annotation of “reference genomes”, that is, organisms for which there is primary experimental data. Complete annotation has two aspects: annotating all genes that have experimental data (breadth), and annotating each gene product to the maximal feasible quality (depth). There are currently twelve important model organisms for which expert biocurators provide annotations: Arabidopsis thaliana (Taxon ID: 3702), Caenorhabditis elegans (Taxon ID: 6239), Escherichia coli (Taxon ID: 511145), Danio rerio (Taxon ID: 7955), Dictyostelium discoideum (Taxon ID: 44689), Drosophila melanogaster (Taxon ID: 7227), Gallus gallus (Taxon ID: 9031), Homo sapiens (Taxon ID: 9606), Mus musculus (Taxon ID: 10090), Rattus norvegicus (Taxon ID: 10116), Saccharomyces cerevisiae (Taxon ID: 4932), and Schizosaccharomyces pombe (Taxon ID: 4896).
*Collect annotation metrics from reference genome databases.
 
*Produce reports that document progress toward the goal of complete annotation of reference genomes
 
;What criteria are used to set priorities?
The Reference Genome GO Annotation Team, which includes representatives from each MOD annotation group, will facilitate implementation of GO Consortium annotation standards by providing regular feedback reports to the annotation group. They will coordinate annotation by determining priorities and selecting annotation target sets on a regular basis. They will measure the rate of progress toward the goal of broad and deep annotation of the reference genomes, and devise improvements to the annotation strategies needed to improve the efficiency of the overall process. This group centralizes the review of annotations by MODs within the GO consortium and provides key input on annotation standards.
*Currently annotation of human genes involved in disease and their orthologs in reference genomes is the top priority.
 
*A list of human disease related genes that represent annotation targets for the reference genomes will be generated.  
 
==Group Managers==
Pascale Gaudet and Kara Dolinski
 
 
==Responsibilities and Deliverables==
# Determine annotation priorities and select appropriate annotation target sets for literature curation by the MODs.
# Coordinate the development of annotation topic meetings taking suggestions from all GOC members and soliciting the participation of external experts.
# Monitor progress towards comprehensive biological coverage and where necessary devise operational changes and recommendations to make the end-to-end process more efficient.
# Annotate by inference from literature-based and experimentally based annotations, and deliver valid GAF files to the MODs on a regular basis.
# Provide written recommendations to the annotation managers to improve consistency in literature curation.
 
 
==Members==
The working group will consist of the GO annotation coordinator for each of the reference genomes.  Currently this includes Pascale Gaudet (dictyBase), Tanya Berardini (TAIR), David Hill (MGI), Susan Tweedie (FlyBase), Doug Howe (ZFIN), Evelyn Camon (GOA), Michelle Giglio (Microbial Genomes), Karen Christie (SGD), and Rex Chisholm.
The working group consists of the reference genome managers and GO curators for each of the reference genomes.  It also includes the Panther DB group for generation of phylogenetic trees and developers of the PAINT software.
 


==Meeting calendar==
The GO annotation team will establish a regular schedule of phone conferences, most likely monthly.  The frequency of phone conferences will be reviewed regularly to assure that it is meeting the needs of the group. Communication will also occur via email and through face to face meetings, most likely at the GO Annotation Camp and GO meeting.
The GO annotation team has monthly phone conferences on the second Tuesday of each month. Communication also occurs via email and through face-to-face meetings at the GO Annotation Camp and GO meeting.
 
==Metrics of success==
The primary metric of success will be the progress toward the goal of complete annotation of reference genomes.  An important aspect of the activities of this working group is the development of metrics to monitor both breadth and depth of genome annotation. The progress of these metrics will be the major measures of success.
The primary metric of success will be the progress toward the goal of complete annotation of reference genomes.  This is measured by the number of families annotated and the number of gene products annotated.
==Linkages==
 
*Fortnightly GO Annotation team phone conferences
 
*Quarterly reports to GO Directors
==Interactions==
*Coordination with GO outreach working group
# Ontology: Provide suggestions for content development; provide feedback on the ontology where the ontology structure (terms, definitions or relationships) makes annotations ambiguous.
*Coordination with MODs responsible for annotation of reference genomes
# Annotation: Provide the Reference Genome group with quality assurance metrics for annotation consistency. Advise on community user requirements specific to reference genome data.
==Process==
# Infrastructure: Provide feedback on PAINT, database tracking.
The detailed process will be established by consensus of the working group itself. However, two areas will receive attention first.
# External: Liaisons with external experts, Reactome, and UniProtKB.
 
==Reports==
* Conference call minutes and monthly progress reports are public: [[Reference_Genome_Annotation_Project#Progress_Reports]]
 
==Publications==
* PANTHER version 7: improved phylogenetic trees, orthologs and collaboration with the Gene Ontology Consortium. Mi H, Dong Q, Muruganujan A, Gaudet P, Lewis S, Thomas PD. Nucleic Acids Res. 2010 Jan;38(Database issue):D204-10. Epub 2009 Dec 16. PMID 20015972
 
* The Gene Ontology's Reference Genome Project: a unified framework for functional annotation across species. Reference Genome Group of the Gene Ontology Consortium. PLoS Comput Biol. 2009 Jul;5(7):e1000431. Epub 2009 Jul 3. PMID 19578431
 
 
==Posters and presentations==
 
* The Gene Ontology's Reference Genome Project. Poster presented by Pascale Gaudet at the International Dicty Meeting, Estes Park, CO, September 2009
 
* The Gene Ontology's Reference Genome Project: A Unified Framework for Functional Annotation across Species. Talk presented by Pascale Gaudet at the Quest for Orthologs meeting, Hinxton, UK, July 2010.  


The first of these is the establishment of metrics. Starting from the metrics described in the GO grant application, the working group will establish a process to feed the necessary statistics to the GO servers.  In addition, a task force from the working group will work with the software developers/systems team to develop a means of regularly updating these statistics, using a process similar to the one GO currently uses to collect annotation statistics.
* PAINT: Protein Annotation Inferencing Tool. Talk presented by Suzanna Lewis at the Quest for Orthologs meeting, Hinxton, UK, July 2009.  


The second area will be the coordinated annotation of genes important for human disease and their orthologs in the non-human reference genomes. Early discussions with a subset of the group have suggested starting with mouse (MGI), rat (RGD) and GOA (human) annotators producing annotations for papers selected for their relevance to a particular class of human diseases, such as neurological disease.  Once genes have been identified in these papers, they would be passed along to the MODs responsible for the reference genomes, which would be expected to coordinate the curation of the orthologs of these genes in their databases.  Once this process is complete, an additional goal would be to assert that the annotations for these genes are “complete” across all of the reference genomes.  A further metric might be the number of human disease genes that have complete annotations across all reference genomes.
* PAINTING GO ANNOTATIONS ACROSS GENOMES. Poster presented by Mike Livstone at the CSHL Genome Informatics meeting, 2009


By cycling through this process several times it should be possible to improve the quality of the coordination and the process itself.
* The reference genome project. Talk presented by Pascale Gaudet at the 3rd International Biocuration Conference, Berlin, Germany, April 2009.  


Question: are we addressing curator consistency anywhere? Or is that another working group's responsibility? It seems to me, by its position at the interface of the nine genomes, this group might want to address inter-database curation consistency.
* The GO Reference Genome Annotation Project. Poster presented by Susan Tweedie at the 2nd International Biocuration Conference, San Jose, CA, October 2007.




----
[[http://gocwiki.geneontology.org/index.php/Reference_Genome_Annotation_Project Reference Genome Main Page]]

Latest revision as of 11:16, 12 April 2019

GO Reference Genome Annotation Team Report

26th April 2010

This document outlines the plans for the GO annotation team for Reference Genomes.

Purpose of the Reference Genome Annotation Group

The group's purpose is the complete annotation of “reference genomes”, that is, organisms for which there is primary experimental data. Complete annotation has two aspects: annotating all genes that have experimental data (breadth), and annotating each gene product to the maximal feasible quality (depth). There are currently twelve important model organisms for which expert biocurators provide annotations: Arabidopsis thaliana (Taxon ID: 3702), Caenorhabditis elegans (Taxon ID: 6239), Escherichia coli (Taxon ID: 511145), Danio rerio (Taxon ID: 7955), Dictyostelium discoideum (Taxon ID: 44689), Drosophila melanogaster (Taxon ID: 7227), Gallus gallus (Taxon ID: 9031), Homo sapiens (Taxon ID: 9606), Mus musculus (Taxon ID: 10090), Rattus norvegicus (Taxon ID: 10116), Saccharomyces cerevisiae (Taxon ID: 4932), and Schizosaccharomyces pombe (Taxon ID: 4896).


The Reference Genome GO Annotation Team, which includes representatives from each MOD annotation group, will facilitate implementation of GO Consortium annotation standards by providing regular feedback reports to the annotation group. They will coordinate annotation by determining priorities and selecting annotation target sets on a regular basis. They will measure the rate of progress toward the goal of broad and deep annotation of the reference genomes, and devise improvements to the annotation strategies needed to improve the efficiency of the overall process. This group centralizes the review of annotations by MODs within the GO consortium and provides key input on annotation standards.


Group Managers

Pascale Gaudet and Kara Dolinski


Responsibilities and Deliverables

  1. Determine annotation priorities and select appropriate annotation target sets for literature curation by the MODs.
  2. Coordinate the development of annotation topic meetings taking suggestions from all GOC members and soliciting the participation of external experts.
  3. Monitor progress towards comprehensive biological coverage and where necessary devise operational changes and recommendations to make the end-to-end process more efficient.
  4. Annotate by inference from literature-based and experimentally based annotations, and deliver valid GAF files to the MODs on a regular basis.
  5. Provide written recommendations to the annotation managers to improve consistency in literature curation.


Members

The working group consists of the reference genome managers and GO curators for each of the reference genomes. It also includes the Panther DB group for generation of phylogenetic trees and developers of the PAINT software.


Meeting calendar

The GO annotation team has monthly phone conferences on the second Tuesday of each month. Communication also occurs via email and through face-to-face meetings at the GO Annotation Camp and GO meeting.

Metrics of success

The primary metric of success will be the progress toward the goal of complete annotation of reference genomes. This is measured by the number of families annotated and the number of gene products annotated.
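
As a rough illustration of the second measure, the sketch below counts distinct annotated gene products per taxon from GAF-formatted lines. It is not the group's actual tracking tool. It assumes the tab-delimited GAF 2.x column layout (DB in column 1, DB Object ID in column 2, taxon in column 13); the function name and the sample rows (database name, identifiers, reference) are hypothetical and made up for the example. Counting families would need family membership data that is not described here, so it is left out.

```python
from collections import defaultdict


def count_annotated_gene_products(gaf_lines):
    """Return {taxon_id: number of distinct annotated gene products}.

    Assumes GAF 2.x layout: column 1 = DB, column 2 = DB Object ID,
    column 13 = taxon (e.g. "taxon:9606", optionally "taxon:A|taxon:B").
    """
    products_by_taxon = defaultdict(set)
    for line in gaf_lines:
        # Skip blank lines and GAF header/comment lines, which start with "!".
        if not line.strip() or line.startswith("!"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 13:
            continue  # malformed row: too few columns to hold a taxon
        db, object_id, taxon_field = cols[0], cols[1], cols[12]
        # The first taxon listed is the organism of the annotated gene product.
        taxon = taxon_field.split("|")[0].replace("taxon:", "")
        products_by_taxon[taxon].add(f"{db}:{object_id}")
    return {taxon: len(ids) for taxon, ids in products_by_taxon.items()}


def _row(object_id, go_id, taxon):
    # Build one made-up GAF row; "ExampleDB" and "PMID:0" are placeholders.
    return "\t".join([
        "ExampleDB", object_id, "sym", "", go_id, "PMID:0", "IDA", "",
        "C", "", "", "protein", f"taxon:{taxon}", "20100426", "ExampleDB",
    ])


sample = [
    "!gaf-version: 2.0",
    _row("G0001", "GO:0005634", "9606"),
    _row("G0001", "GO:0005737", "9606"),  # same product, second annotation
    _row("G0002", "GO:0005634", "9606"),
    _row("G0003", "GO:0005634", "10090"),
]

counts = count_annotated_gene_products(sample)
```

Several annotations to the same gene product count once, so the figure tracks breadth of coverage rather than the raw number of annotation rows.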


Interactions

  1. Ontology: Provide suggestions for content development; provide feedback on the ontology where the ontology structure (terms, definitions or relationships) makes annotations ambiguous.
  2. Annotation: Provide the Reference Genome group with quality assurance metrics for annotation consistency. Advise on community user requirements specific to reference genome data.
  3. Infrastructure: Provide feedback on PAINT, database tracking.
  4. External: Liaisons with external experts, Reactome, and UniProtKB.

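Reports

  • Conference call minutes and monthly progress reports are public: Reference_Genome_Annotation_Project#Progress_Reports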

Publications

  • PANTHER version 7: improved phylogenetic trees, orthologs and collaboration with the Gene Ontology Consortium. Mi H, Dong Q, Muruganujan A, Gaudet P, Lewis S, Thomas PD. Nucleic Acids Res. 2010 Jan;38(Database issue):D204-10. Epub 2009 Dec 16. PMID 20015972
  • The Gene Ontology's Reference Genome Project: a unified framework for functional annotation across species. Reference Genome Group of the Gene Ontology Consortium. PLoS Comput Biol. 2009 Jul;5(7):e1000431. Epub 2009 Jul 3. PMID 19578431


Posters and presentations

  • The Gene Ontology's Reference Genome Project. Poster presented by Pascale Gaudet at the International Dicty Meeting, Estes Park, CO, September 2009
  • The Gene Ontology's Reference Genome Project: A Unified Framework for Functional Annotation across Species. Talk presented by Pascale Gaudet at the Quest for Orthologs meeting, Hinxton, UK, July 2010.
  • PAINT: Protein Annotation Inferencing Tool. Talk presented by Suzanna Lewis at the Quest for Orthologs meeting, Hinxton, UK, July 2009.
  • PAINTING GO ANNOTATIONS ACROSS GENOMES. Poster presented by Mike Livstone at the CSHL Genome Informatics meeting, 2009
  • The reference genome project. Talk presented by Pascale Gaudet at the 3rd International Biocuration Conference, Berlin, Germany, April 2009.
  • The GO Reference Genome Annotation Project. Poster presented by Susan Tweedie at the 2nd International Biocuration Conference, San Jose, CA, October 2007.



[Reference Genome Main Page]