GO Reference Genome Meeting

September 26-27, 2007, Princeton, New Jersey

Wednesday, September 26, 2007

  • Meeting Location: Frist Campus Center - Multipurpose Room C
  • Lunch Meal Location: Frist Campus Center - Food Gallery
  • Coffee/Tea/Water service for 30 people during the morning and afternoon sessions
  • Lunch - $8.50/person on meal cards to be used in the Food Gallery
  • Dinner - TBA

Thursday, September 27, 2007

  • Meeting Location: Frist Campus Center - Multipurpose Room B
  • Lunch Meal Location: Frist Campus Center - Food Gallery
  • Coffee/Tea/Water service for 25 people during the morning and afternoon sessions
  • Lunch - $8.50/person on meal cards to be used in the Food Gallery
  • Dinner - TBA


High Level Topics Identified for Discussion

  • Strategies to identify orthologs
  • How to prioritize genes
  • How to assess the progress made towards curation of reference genome genes; strategies for improvement
  • Discussions regarding metrics, including making a plan for how to use metrics
  • Review of progress toward database and tool development
  • Annotation consistency discussion
  • Outreach

Agenda

Morning of the 26th


Orthology

Moderator: Kara Dolinski

Resources:

  1. The Reference Genome groups have provided descriptions of their current methodology for establishing orthologs for their gene sets. These are available on the Orthology discussion page.

Discussion points:

  1. Is it necessary for all groups to use the same methodology in order to create orthology/homology data sets for the reference genome project?
  2. Standardize or describe the procedure for ortholog identification across MODs
  3. How stringently and consistently do we want to use the terms 'orthology' and 'homology' in our documentation and discussion?
  4. How will we update our orthology sets with new genome builds? Or do we consider the current genomes in the reference genome project to be essentially complete?
  5. What is the impact of closely-related paralogs that have different functions on this project?
  6. Which model organisms are available in which databases, e.g. Dicty is not in TreeFam; zebrafish and chicken are not in YOGY
  7. Use-case examples (Kimberley, WormBase; also Donghui?)
  8. Emily: GOA discussion about inheriting annotations

Papers of Interest:

  1. Alexeyenko, A., Lindberg, J., Perez-Bercoff, A., and Sonnhammer, E.L.L. 2006. Overview and comparison of ortholog databases. Drug Discovery Today: Technologies 3:137-143.
  2. Dolinski, K. and Botstein, D. 2007. Orthology and Functional Conservation in Eukaryotes. Annu. Rev. Genet. 41:463-507.
  3. Hulsen, T., Huynen, M.A., de Vlieg, J., and Groenen, P.M.A. 2006. Benchmarking ortholog identification methods using functional genomics data. Genome Biology 7:R31.
  4. Wapinski, I., Pfeffer, A., Friedman, N., and Regev, A. 2007. Automatic genome-wide reconstruction of phylogenetic gene trees. Bioinformatics 23:i549-i558.

GRIN conference call

  * GRIN (Genome Research Informatics Network) conference call, tentatively at 11 am

Afternoon of the 26th


  * Metrics - Moderator: Michael Ashburner
  * Priorities - Moderators: Rex Chisholm and Pascale Gaudet
  * Methods - Moderator: Suzi Lewis
  * Tools - Moderator: Chris Mungall

Metrics

Moderator: Michael Ashburner

Metrics are required to measure our own annotation progress. We will use both functional and structural information in these metrics.

  • Chris: Review of our progress to date by examining what is actually in the database
  • Mike: Discussion of additional metrics and their consistent use

Priorities

How to prioritize genes

  • By Disease
    • (Rex and Pascale): OMIM Morbid Map; occasionally we also find genes not in the Morbid Map that have strong evidence for involvement in a disease
    • There is an effort to cluster genes involved in the same disease, or with the same or a related function, to facilitate the curatorial effort
    • Questions: Is there a more systematic way? Should we target some diseases more specifically? What about multigene diseases?
  • Discuss pathways as an alternative method of prioritizing genes

Methods

Moderator: Suzi Lewis

Expounders: Judy Blake and Rex Chisholm

Discussion points:

  1. How to balance curation of experimental literature with ISS inference annotation work?
  2. How to balance prioritization of genes by importance to human disease processes and by presence in yeast and smaller organisms?
  3. How to measure 'comprehensiveness' of annotation and to know when sufficient curation of literature has occurred?
  4. How to prioritize new curation for already 'done' genes, for example when hot new papers report new information about already 'completed' genes (Bmp4 and Cav, for example, are currently very 'hot')?

Tools

Moderator: Chris Mungall

  • Chris, Sohel and Mary are developing a web-based tool that will replace the current Google spreadsheet
  • Demonstration of the tool (link to a page with the tool coming soon); Sohel's last version of the tool
  • Curator input for further development
  • Database
  • AmiGO

Morning of the 27th


  * Annotation Consistency
  * Promotion of Resource

Annotation Consistency

Moderator: Pascale Gaudet

Promotion of Resource

Moderator: Susan Tweedie

Discussion Points:

  1. Public view of reference genome project and annotations through GO website and AmiGO.
  2. Publication
  3. Other promotion efforts