GO Reference Genome Meeting
Latest revision as of 11:31, 20 August 2009
September 26-27, 2007, Princeton, New Jersey
Wednesday, September 26, 2007
- Meeting Location: Frist Campus Center - Multipurpose Room C
- Lunch Meal Location: Frist Campus Center - Food Gallery
- Coffee/Tea/Water service for 30 people during the morning and afternoon sessions
- Lunch - $8.50/person on meal cards to be used in the Food Gallery
- Dinner - TBA
Thursday, September 27, 2007
- Meeting Location: Frist Campus Center - Multipurpose Room B
- Lunch Meal Location: Frist Campus Center - Food Gallery
- Coffee/Tea/Water service for 25 people during the morning and afternoon sessions
- Lunch - $8.50/person on meal cards to be used in the Food Gallery
- Dinner - TBA
High Level Topics Identified for Discussion
- Strategies to identify orthologs
- How to prioritize genes
- How to assess the progress made towards curation of reference genome genes; strategies for improvement
- Discussions regarding metrics, including making a plan for how to use metrics
- Review of progress toward database and tool development
- Annotation consistency discussion
- Outreach
Agenda
Morning of the 26th
Orthology
Moderator: Kara Dolinski
Resources:
- The Reference Genome groups have provided descriptions of their current methodology for establishing orthologs for their gene sets. These are available on the Orthology discussion page.
Discussion points:
- Is it necessary for all groups to use the same methodology in order to create orthology/homology data sets for the reference genome project?
- Standardize/describe procedure for identification across MODs
- How stringently and consistently do we want to use the terms 'orthology' and 'homology' in our documentation and discussion?
- How will we update our orthology sets with new genome builds, or do we consider the current genomes in the reference genome project to be essentially complete?
- What is the impact of closely-related paralogs that have different functions on this project?
- Which model organisms are available in which databases? For example, Dicty is not in TreeFam; zebrafish and chicken are not in YOGY
- Use-case examples (Kimberley, WormBase; also Donghui?)
- Emily: GOA discussion about inheriting annotations
Papers of Interest:
- Alexeyenko, A., Lindberg, J., Perez-Bercoff, A., and Sonnhammer E.L.L. 2006. Overview and comparison of ortholog databases. Drug Discovery Today: Technologies 3:137-143.
- Dolinski, K. and Botstein, D. 2007. Orthology and Functional Conservation in Eukaryotes. Annu. Rev. Genet. 41:463-507.
- Hulsen, T., Huynen, M.A., de Vlieg, J., and Groenen, P.M.A. 2006. Benchmarking ortholog identification methods using functional genomics data. Genome Biology 7:R31.
- Wapinski, I., Pfeffer, A., Friedman, N., and Regev, A. 2007. Automatic genome-wide reconstruction of phylogenetic gene trees. Bioinformatics 23:i549-i558.
GRIN conference call
- GRIN (Genome Research Informatics Network) conference call, tentatively at 11 am
Afternoon of the 26th
- Metrics - Moderator: Michael Ashburner
- Priorities - Moderators: Rex Chisholm and Pascale Gaudet
- Methods - Moderator: Suzi Lewis
- Tools - Moderator: Chris Mungall
Metrics
Moderator: Michael Ashburner
Metrics are required to measure our own annotation progress. We will use both functional and structural information in these metrics.
- Karen: Structural sequence annotations, assessed by comparing the GFF3 files provided by the reference genome groups.
  - Each reference genome must provide its sequence as a GFF3 file. View the table of the reference genome MODs' GFF3 files: http://wiki.geneontology.org/index.php/Reference_Genome_sequence_annotation
  - Plan for metric resource: GO_Reference_Genome_Meeting_Metric_Plan
- Chris: Review of our progress to date by examining what is actually in the database
- Mike: Discussion of additional metrics and their consistent use
- Ruth had some thoughts on literature measures Metrics:_breath_and_depth_of_annotations
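As one possible starting point for the structural comparison mentioned above, the sketch below counts features by type in a GFF3 file so that the totals can be compared across reference genomes. It relies only on the general GFF3 layout (nine tab-separated columns, with the feature type in the third). The function name `count_feature_types` and the sample records are illustrative assumptions, not part of any agreed metric plan.

```python
from collections import Counter


def count_feature_types(gff3_lines):
    """Count GFF3 features by type (column 3), skipping comments and directives.

    Hypothetical helper for illustration; not an agreed project metric.
    """
    counts = Counter()
    for line in gff3_lines:
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # embedded sequence follows; no more feature lines
        if not line or line.startswith("#"):
            continue  # blank line, comment, or directive
        columns = line.split("\t")
        if len(columns) != 9:
            continue  # not a well-formed feature line
        counts[columns[2]] += 1
    return counts


# Made-up records, purely to show the call.
example = [
    "##gff-version 3",
    "chrI\texample\tgene\t100\t900\t.\t+\t.\tID=gene1",
    "chrI\texample\tmRNA\t100\t900\t.\t+\t.\tID=mrna1;Parent=gene1",
    "chrI\texample\texon\t100\t400\t.\t+\t.\tParent=mrna1",
    "chrI\texample\texon\t600\t900\t.\t+\t.\tParent=mrna1",
]
counts = count_feature_types(example)
for feature_type, n in sorted(counts.items()):
    print(feature_type, n)
```

For a real file, the same function could be called on an open file handle, since it only iterates over lines.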
Priorities
How to prioritize genes
- By Disease
  - (Rex and Pascale): OMIM Morbid Map; occasionally we also find genes not in the Morbid Map that have strong evidence for involvement in a disease
  - There is an effort to cluster genes involved in the same disease, or with the same or a related function, to facilitate the curatorial effort
  - Questions: Is there a more systematic way? Should we target some diseases more specifically? What about multigene diseases?
- Discuss pathways as an alternative method of prioritizing genes
Methods
Moderator: Suzi Lewis
Expounders: Judy Blake and Rex Chisholm
Discussion points:
- How to balance curation of experimental literature against ISS (inferred from sequence or structural similarity) annotation work?
- How to balance prioritization of genes by importance to human disease processes and by presence in yeast and smaller organisms?
- How to measure 'comprehensiveness' of annotation and to know when sufficient curation of literature has occurred?
- How to prioritize new curation for genes that are already 'done', for example when hot new papers report new information about 'completed' genes (Bmp4 and Cav, for example, are currently very 'hot')
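To make the questions about balance and 'comprehensiveness' more concrete, here is a minimal sketch of one way a target gene list could be summarized by the kind of evidence behind its annotations. It assumes annotations are available as simple (gene, evidence code) pairs. The function name `summarize_coverage`, the category labels, and the sample data are all hypothetical, and the grouping of evidence codes is only one reasonable choice.

```python
# GO evidence codes treated here as experimental (an assumption of this sketch).
EXPERIMENTAL_CODES = {"EXP", "IDA", "IPI", "IMP", "IGI", "IEP"}


def summarize_coverage(target_genes, annotations):
    """Classify each target gene by the strongest kind of evidence it has.

    annotations is an iterable of (gene, evidence_code) pairs; pairs for
    genes outside target_genes are ignored.
    """
    codes_by_gene = {gene: set() for gene in target_genes}
    for gene, code in annotations:
        if gene in codes_by_gene:
            codes_by_gene[gene].add(code)

    summary = {"experimental": 0, "iss_only": 0, "other": 0, "unannotated": 0}
    for codes in codes_by_gene.values():
        if not codes:
            summary["unannotated"] += 1
        elif codes & EXPERIMENTAL_CODES:
            summary["experimental"] += 1
        elif codes == {"ISS"}:
            summary["iss_only"] += 1
        else:
            summary["other"] += 1
    return summary


# Made-up gene names and annotations, purely to show the call.
targets = ["geneA", "geneB", "geneC", "geneD"]
pairs = [
    ("geneA", "IDA"),
    ("geneA", "ISS"),
    ("geneB", "ISS"),
    ("geneC", "IEA"),
    ("geneX", "IMP"),  # not a target gene, so it is ignored
]
summary = summarize_coverage(targets, pairs)
print(summary)
```

A count like this says nothing about whether the literature for a gene has been fully curated; it only shows how much of a target list currently rests on experimental versus inferred evidence.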
Tools
Moderator: Chris Mungall
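- Chris, Sohel and Mary are developing a web-based tool that will replace the current Google spreadsheet (http://dcn.spreadsheets.google.com/ccc?id=o16926456948884040128.4584390909151853752.07000735126025259412.442372083524637957)
- Demonstration of the tool (link to a page with the tool coming soon); Sohel's last version of the tool: http://rails-dev.bioinformatics.northwestern.edu:24000/index.html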
- Curator input for further development
- Database
- AmiGO
Morning of the 27th
- Annotation consistency
- Promotion of resource
Annotation Consistency
Moderator: Pascale Gaudet
Promotion of Resource
Moderator: Susan Tweedie
Discussion Points:
- Public view of reference genome project and annotations through GO website and AmiGO.
- Publication
- Other promotion efforts