GO Reference Genome Meeting

Latest revision as of 11:31, 20 August 2009

September 26-27, 2007, Princeton, New Jersey

Wednesday, September 26, 2007

  • Meeting Location: Frist Campus Center - Multipurpose Room C
  • Lunch Meal Location: Frist Campus Center - Food Gallery
  • Coffee/Tea/Water service for 30 people during the morning and afternoon sessions
  • Lunch - $8.50/person on meal cards to be used in the Food Gallery
  • Dinner - TBA

Thursday, September 27, 2007

  • Meeting Location: Frist Campus Center - Multipurpose Room B
  • Lunch Meal Location: Frist Campus Center - Food Gallery
  • Coffee/Tea/Water service for 25 people during the morning and afternoon sessions
  • Lunch - $8.50/person on meal cards to be used in the Food Gallery
  • Dinner - TBA


High Level Topics Identified for Discussion

  • Strategies to identify orthologs
  • How to prioritize genes
  • How to assess the progress made towards curation of reference genome genes; strategies for improvement
  • Discussions regarding metrics, including making a plan for how to use metrics
  • Review of progress toward database and tool development
  • Annotation consistency discussion
  • Outreach

Agenda

Morning of the 26th


Orthology

Moderator: Kara Dolinski

Resources:

  1. The Reference Genome groups have provided descriptions of their current methodology for establishing orthologs for their gene sets. These are available on the Orthology discussion page.

Discussion points:

  1. Is it necessary for all groups to use the same methodology in order to create orthology/homology data sets for the reference genome project?
  2. Standardize/describe procedure for identification across MODs
  3. How stringently and consistently do we want to use the terms 'orthology' and 'homology' in our documentation and discussion?
  4. How will we update our orthology sets with new genome builds, or do we consider the current genomes in the reference genome project to be essentially complete?
  5. What is the impact of closely related paralogs that have different functions on this project?
  6. Which model organisms are available in which databases? For example, Dicty is not in TreeFam; zebrafish and chicken are not in YOGY.
  7. Use-case examples (Kimberley, WormBase; also Donghui?)
  8. Emily: GOA discussion about inheriting annotations

Papers of Interest:

  1. Alexeyenko, A., Lindberg, J., Perez-Bercoff, A., and Sonnhammer E.L.L. 2006. Overview and comparison of ortholog databases. Drug Discovery Today: Technologies 3:137-143.
  2. Dolinski, K. and Botstein, D. 2007. Orthology and Functional Conservation in Eukaryotes. Annu. Rev. Genet. 41:463-507.
  3. Hulsen, T., Huynen, M.A., de Vlieg, J., and Groenen, P.M.A. 2006. Benchmarking ortholog identification methods using functional genomics data. Genome Biology 7:R31.
  4. Wapinski, I., Pfeffer, A., Friedman, N., and Regev, A. 2007. Automatic genome-wide reconstruction of phylogenetic gene trees. Bioinformatics 23:i549-i558.

GRIN conference call

  • GRIN (Genome Research Informatics Network) conference call, tentatively at 11 am

Afternoon of the 26th


  • Metrics - Moderator: Michael Ashburner
  • Priorities - Moderators: Rex Chisholm and Pascale Gaudet
  • Methods - Moderator: Suzi Lewis
  • Tools - Moderator: Chris Mungall

Metrics

Moderator: Michael Ashburner

Metrics are required to measure our own annotation progress. We will use both functional and structural information in these metrics.

  • Chris: Review of our progress to date by examining what is actually in the database
  • Mike: Discussion of additional metrics and their consistent use

Priorities

How to prioritize genes

  • By Disease
    • Currently (Rex and Pascale): OMIM Morbid Map; occasionally we also find genes not in the Morbid Map that have strong evidence for involvement in a disease
    • There is an effort to cluster genes involved in the same disease, or with the same or related function, to facilitate the curatorial effort
    • Questions: Is there a more systematic way? Should we target some diseases more specifically? What about multigene diseases?
  • Discuss pathways as an alternative method of prioritizing genes

Methods

Moderator: Suzi Lewis

Expounders: Judy Blake and Rex Chisholm

Discussion points:

  1. How to balance curation of experimental literature against ISS (Inferred from Sequence or Structural Similarity) annotation work?
  2. How to balance prioritization of genes by importance to human disease processes and by presence in yeast and smaller organisms?
  3. How to measure 'comprehensiveness' of annotation and to know when sufficient curation of literature has occurred?
  4. How to prioritize new curation for already 'done' genes...for example, hot new papers that report new information about already 'completed' genes [Bmp4, Cav, for example, currently very 'hot']

Tools

Moderator: Chris Mungall

  • Chris, Sohel and Mary are developing a web-based tool that will replace the current Google spreadsheet
  • Demonstration of the tool (a link to a page with the tool is coming soon; see Sohel's latest version of the tool)
  • Curator input for further development
  • Database
  • AmiGO

Morning of the 27th


  • Annotation consistency
  • Promotion of resource

Annotation Consistency

Moderator: Pascale Gaudet

Promotion of Resource

Moderator: Susan Tweedie

Discussion Points:

  1. Public view of reference genome project and annotations through GO website and AmiGO.
  2. Publication
  3. Other promotion efforts