GO Reference Genome Meeting

From GO Wiki
Revision as of 08:41, 26 September 2007 by Suzi (talk | contribs) (→‎Metrics)

Sept 26-27th, 2007 Princeton, New Jersey

Wednesday, September 26, 2007

  • Meeting Location: Frist Campus Center - Multipurpose Room C
  • Lunch Meal Location: Frist Campus Center - Food Gallery
  • Coffee/Tea/Water service for 30 people during the morning and afternoon sessions
  • Lunch - $8.50/person on meal cards to be used in the Food Gallery
  • Dinner - TBA

Thursday, September 27, 2007

  • Meeting Location: Frist Campus Center - Multipurpose Room B
  • Lunch Meal Location: Frist Campus Center - Food Gallery
  • Coffee/Tea/Water service for 25 people during the morning and afternoon sessions
  • Lunch - $8.50/person on meal cards to be used in the Food Gallery
  • Dinner - TBA

Agenda Details, Discussions and Topics

High Level Topics Identified for Discussion

  • Strategies to identify orthologs
  • How to prioritize disease genes
  • How to assess the progress made towards curation of reference genome genes; strategies for improvement
  • Discussions regarding metrics, including making a plan for how to use metrics
  • Review of progress toward database and tool development
  • Annotation consistency discussion
  • Outreach

Schedule (last updated September 10, 2007)

Morning of the 26th


  * orthology
  * GRIN (Genome Research Informatics Network) conference call, tentatively at 11 am

Orthology

Moderator: Kara Dolinski

Resources:

  1. The Reference Genome groups have provided descriptions of their current methodology for establishing orthologs for their gene sets. These are available on the Orthology discussion page.

Discussion points:

  1. Is it necessary for all groups to use the same methodology in order to create orthology/homology data sets for the reference genome project?
  2. Standardize/describe procedure for identification across MODs
  3. How stringently and consistently do we want to use the terms 'orthology' and 'homology' in our documentation and discussion?
  4. How will we update our orthology sets with new genome builds? Or do we consider the current genomes in the reference genome project to be essentially complete?
  5. What is the impact of closely-related paralogs that have different functions on this project?
  6. Which model organisms are available in which databases? For example, Dicty is not in TreeFam; zebrafish and chicken are not in YOGY.
  7. Use-case examples (Kimberley, WormBase; also Donghui?)
  8. Emily: GOA discussion about inheriting annotations

Papers of Interest:

  1. Alexeyenko, A., Lindberg, J., Perez-Bercoff, A., and Sonnhammer, E.L.L. 2006. Overview and comparison of ortholog databases. Drug Discovery Today: Technologies 3:137-143.
  2. Dolinski, K. and Botstein, D. 2007. Orthology and Functional Conservation in Eukaryotes. Annu. Rev. Genet. 41:463-507.
  3. Hulsen, T., Huynen, M.A., de Vlieg, J., and Groenen, P.M.A. 2006. Benchmarking ortholog identification methods using functional genomics data. Genome Biology 7:R31.
  4. Wapinski, I., Pfeffer, A., Friedman, N., and Regev, A. 2007. Automatic genome-wide reconstruction of phylogenetic gene trees. Bioinformatics 23:i549-i558.

GRIN conference call

Afternoon of the 26th


  * Priorities - Moderators: Rex Chisholm and Pascale Gaudet
  * Methods - Moderator: Suzi Lewis
  * Metrics - Moderator: Ruth Lovering
  * Tools - Moderator: Chris Mungall

Priorities

How to prioritize disease genes

  • Currently (Rex and Pascale): OMIM Morbid Map; occasionally we also find genes not in the Morbid Map that have strong evidence for involvement in a disease
  • There is an effort to cluster genes involved in the same disease, or with the same or related function, to facilitate the curatorial effort
  • Questions: Is there a more systematic way? Should we target some diseases more specifically? What about multigene diseases?

Methods

Moderator: Suzi Lewis

Expounders: Judy Blake and Rex Chisholm

Discussion points:

  1. How to balance curation of experimental literature against ISS (Inferred from Sequence or Structural Similarity) annotation work?
  2. How to balance prioritization of genes by importance to human disease processes and by presence in yeast and smaller organisms?
  3. How to measure 'comprehensiveness' of annotation and to know when sufficient curation of literature has occurred?
  4. How to prioritize new curation for genes already considered 'done', for example when hot new papers report new information about already 'completed' genes (Bmp4 and Cav, for example, are currently very 'hot')?

Metrics

Moderator: Mike Cherry

Ruth provided this starting point: http://gocwiki.geneontology.org/index.php/Metrics:_breath_and_depth_of_annotations

  • Metrics are required to measure our own annotation progress. We will use both functional and structural information in these metrics.
  • Chris: Review of our progress to date by examining what is actually in the database
  • Mike: Discussion of additional metrics and their consistent use

Tools

Moderator: Chris Mungall

Topics

  1. Database: moving out of the Google spreadsheet

Morning of the 27th


  * annotation consistency
  * promotion of resource

Annotation Consistency

Moderator: Pascale Gaudet

Promotion of Resource

Moderator: Susan Tweedie

Discussion Points:

  1. Public view of reference genome project and annotations through GO website and AmiGO.
  2. Publication
  3. Other promotion efforts