2012 Annotation Meeting Stanford

From GO Wiki
Revision as of 10:19, 15 December 2011 by Suzi (talk | contribs) (Proposed Agenda)


  • GO annotations are the primary product of the GO: we need a defined process for how they will be produced efficiently and at high quality
  • The goals of the meeting are:
    • to describe our annotation goals
    • to discuss and describe the components of a process designed to meet these goals
    • to create "SWAT teams" in areas where we need further development to meet these goals

Proposed Agenda

Sunday afternoon: 1pm until 4:30pm

"How do we become an Efficient Annotation Factory?"

Current Status of Annotation Production (Mike Cherry), i.e. where are we now with basic annotation?

  • What is our current rate of new annotation production?
  • What is our current rate of annotation loss (due to the sunset clause, the average % of annotations that can't be loaded, etc.)?
  • Annotation Information Content assessment (how detailed are the existing annotations and what has been the trend over time)
  • Completeness and adherence to standards of "gp2protein" files

Discussion: Goals: the "gold standard" for GO annotations

  • Minimal requirements for submitting GO annotations (for projects and MODs not funded via the GOC)
  • Depth of annotation: What are the components of a "gold standard" GO annotation?

Note (Suzi): interrupted midway through this section; more work is still needed here.

  • Post-meeting action item: practical plan for getting from the current state to the "gold standard" (or as close to it as possible)

Discussion: what should the annotation process be?

  • Process and Lessons learned from previous literature curation efforts. Focus is on how the processes might be generalized, with specific details only as supporting examples.
    • domain-specific curation and ontology development
      • transcription overhaul
      • apoptosis annotation
    • Swiss-Prot GO term curation process
    • Wiki-based annotation in CACAO: proposed improvements and potential generalizations
  • Phylogenetic inference: Synthesis, QA and inference across organisms using PAINT
  • Post-meeting action items: practical plans to implement process, and for how individual curation groups can contribute


Making the GO annotation process of the future

Towards a common annotation framework

* Kimberly to report on the spec for the common annotation framework (CAF).

  • Kimberly is currently talking to all curation groups about individual GO annotation tools, what features they have and what features curators would like.
  • Therefore, by the GO Consortium meeting, Kimberly will be able to present the features that GOC curators feel are most important.
  • Subsequent discussion on:
    • Any other features curators would require in an annotation tool
    • What additional data should be supplied by annotation groups?
    • How best to use text mining in the CAF for prioritizing curation work (e.g. Textpresso)

* Val to report on the Community Annotation Tool (PomBase)

    • This is a PomBase tool that Kim is developing to include GOC requirements, so that it can be made available to community experts who would like to submit small sets of GO annotations to the GO Consortium. These submissions would then need to be reviewed by GOC groups. (Kim and PomBase will keep Kimberly and the CAF working group informed of developments.)
    • Discussion on how best to advertise the tool to the community and how to manage annotation submissions within the Consortium.

Phylogenetic inference process


  • Breadth of annotation: How can MODs achieve full genome coverage?

A focused annotation session for ~10 GO annotators, led by Pascale. The number is limited so that the session stays manageable and productive.

    • Annotators would be selected from two groups:
      • those with previous training in PAINT annotation (e.g. Mike L., Rama, Li Ni, Donghui)
      • those with no training but a strong possibility of using PAINT later on to create GO annotations (e.g. GO NIH-funded curators)

    • Annotations to transfer would be selected from recent annotation work by GO Consortium groups that is now in the GO database, using terms from the ontology that have been reviewed and are likely to remain stable (e.g. from the recent transcription annotation effort)
    • Time required: minimum 5 hours.


Evaluating efficiency

Metrics discussion:

    • How best to measure annotation progress?
    • Possible stats: Count of new terms used in annotation? Count of comprehensively annotated gene products? Count of EXP-evidenced annotations? Count of species with new annotation sets? Count of new checks implemented?
    • What combination of stats would best reflect our curation efforts?
    • How can the selected set of metrics be most effectively created, and what information do groups need to be ready to supply to the GOC?
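
As a starting point for the metrics discussion, several of the candidate stats could be computed directly from submitted annotation files. The following is only a minimal sketch, not an agreed GOC procedure. It assumes annotations arrive as tab-delimited GAF lines with the standard column order (DB, DB Object ID, GO ID in column 5, evidence code in column 7, taxon in column 13), and it assumes that "EXP-evidenced" means the experimental evidence codes EXP, IDA, IPI, IMP, IGI and IEP. The function name `annotation_metrics` and the keys of the returned dictionary are hypothetical.

```python
from collections import Counter

# Assumption: "EXP-evidenced" covers the experimental evidence codes.
EXPERIMENTAL_CODES = {"EXP", "IDA", "IPI", "IMP", "IGI", "IEP"}


def annotation_metrics(gaf_lines):
    """Return simple counts for an iterable of GAF-formatted lines."""
    terms = set()
    gene_products = set()
    taxa = set()
    evidence = Counter()
    total = 0

    for line in gaf_lines:
        # Skip blank lines and GAF header/comment lines, which start with "!".
        if not line.strip() or line.startswith("!"):
            continue
        cols = line.rstrip("\n").split("\t")
        # Skip rows that lack the 15 core GAF columns.
        if len(cols) < 15:
            continue
        total += 1
        gene_products.add((cols[0], cols[1]))  # DB + DB Object ID
        terms.add(cols[4])                     # GO ID
        evidence[cols[6]] += 1                 # evidence code
        taxa.add(cols[12].split("|")[0])       # first taxon ID only

    experimental = sum(
        count for code, count in evidence.items() if code in EXPERIMENTAL_CODES
    )
    return {
        "annotations": total,
        "distinct_terms": len(terms),
        "gene_products": len(gene_products),
        "experimental_annotations": experimental,
        "species": len(taxa),
    }
```

Running this on two successive releases of a group's file and comparing the dictionaries would give rough "new term", "new species" and "EXP-evidenced" deltas. Stats such as "comprehensively annotated gene products" or "new checks implemented" would need information that is not in the annotation file itself.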

Making annotation public

    • How do we enable intelligent consumption of GO and annotations, especially of new functionality/expressivity?

Preparation needed in advance of GO Consortium meeting

1. Develop proposal for annotation process (Rama, Emily, Kimberly), using examples from transcription overhaul (Karen), apoptosis (Paola), Swiss-Prot manual curation (Rolf/Claire?), CACAO (Jim Hu), phylogenetic annotation (Pascale)

2. Develop proposal for gold standard annotations (GO-tops, managers)

3. Develop proposal for aiding in GO consumption (GO-tops, managers)

4. Develop proposal for metrics (GO-tops, managers)