GOC Meeting Minutes September 2009
Weds. AM
-Judy: review of grant status
-2yr project to fund CL development funded!
-focus on immunology and neurons
-Protein ontology development supplemental grant funded to represent complexes.
-PRO uses a very specific representation (mouse proteins in a mouse complex); the GO component represents the class (i.e. not species specific).
-each PRO complex should have an xref to the GO complex.
-complexes should be represented by their function, not their protein members
-David-some complexes are defined by the proteins they contain, some by what they do.
-Judy-function of a complex may be cell type specific
-Chris-GO does complexes, PRO does proteins which are pointed to by GO component?
-Judy-this is not how the process was described in the PRO grant
-Ben-complexes defined by their function may introduce a naming issue: the same complex having different functions in different places or cell types.
-Suzi-GO and PRO need to discuss how best to represent these complexes and how to relate PRO to GO.
Aim 2:
-Funding obtained for protein set specification (PANTHER group)
-Judy-Gene targets may want to be skewed towards genes known to be involved in human disease. Important for the grant and ongoing funding.
-Rex-important to balance with genes of unknown function; emerging new information is important too, so balance is needed in choosing targets.
-Ruth-there are human genes which we know a lot about that are not annotated. People want the data and are surprised not to find it.
-Judy-increasing concern that we are sorely lacking in annotation depth in areas we know a lot about. Need more resources.
-Rex-How do we define relevance to human biology?
-Suzi-original Ref Gen proposal goes for breadth and depth. Need ways to prioritize targets; this does not preclude inclusion of targets outside the target gene set.
-Judy-can look at systems vs. diseases vs. processes
Aim 3 Species to consider outreach to
-Daphnia (Indiana)-has GO annotations (Michael Lynch)
-Xenopus
-Sea Urchin
-Plasmodium
-Brenley-GONUTS can support annotation of alternate species.
-Rama-need an annotation camp; it’s been several years
-need more ongoing training
-Judy-Paul, PAINT, need mechanism of annotation for new genomes, have GO jamboree, bring these new groups in.
-Emily-electronic annotation pipelines lack development and improvement efforts.
-Emily-more effort needed in expanding electronic efforts
-David-automated annotations are too high level; clustering at a high level is not that valuable for new groups.
-Judy-need continuing discussion here to focus on utility of GO, what people want/need. How do we focus effort here for the next year? Need to address this to get continued funding.
-Rex-This was the REASON for the Ref Genome Project. How can its role be expanded? Why isn’t it working for non-ref gen species?
-Michael-EMO: emerging model organism
-Paul-82 species chosen to be close to as many species as possible for ortholog prediction and functional annotation to related species.
-SP/UniProt group increasing manual annotation efforts
-Kara-priorities for next year: outreach to ourselves. It’s not clear there is “MOD buy-in”. How is our own work being incorporated into our OWN databases? Peter Good wants to know “how does this work advance biomedical hypothesis generation?”
-David-how do we get non-NIH funding from sources interested in general biology?
--perhaps there is a misperception of the state of things; PAINT annotations are still under development.
-Susan-should we highlight Ref Genome annotations? Groups with one annotator get behind in curation
-ACTION ITEM: MODs-make an effort to highlight Ref Genome annotation
-David-data in paper highlights progress/effect of Ref Genome effort
-Mike-do MODs announce or advertise their participation in Ref Gen?
-Ruth-following up with ISS annotations is difficult
-Judy-we are poised to make the next step of propagating annotations; this should highlight the annotations generated from the Ref Gen project
-Rex-for the progress report and website we should also provide metrics: number of Ref Gen annotations, and % of databases that have those annotations propagated.
-Real value will come from propagation of annotations to other species
-Judy-if we can provide a set of annotations for one of the non-ref gen species in the next year, this will be a good demonstration of the power of the effort. This may be a good focus point for the next year.
-Easy but not obvious how to get a version of the GO. Needs to be easier if we are asking users to do this.
-Midori-users are not asking for how to find ontology vers. #
-David sees requests for how to find out what version was used in past studies
11AM: community interactions and outreach
-Val-pombe community annotation: Pombe in google->community curation->see papers
-How did Val get 19/20 respondents?
-Val: will send out the form used to elicit responses; estimates 1/3 the time to complete the curation. Form is sent to the last author. Results are processed by Val. Results generate questions etc., so much work follows on from responses, but this does get info direct from authors.
-not traced to who did the annotation though.
-facilitates user understanding of the existing annotations.
see submission form here: http://www.aspb.org/publications/tairsubmission.cfm
Tanya: Plant Phys submission form improved with autocomplete to help submitters pick GO terms. Started with a second journal (Plant Journal) using a template spreadsheet. PJ handles this as a supplemental data submission included with the paper when published.
Jane: recommendations from the GO review
-Judy: what is a function, what is a process...we should define these..what do we mean by these in GO? We need to be clear about what we are doing.
-Web stats: Seth can set up access to Google Analytics if you send him your Gmail account.
-Mike: moved machines at Stanford, usage stats very constant over past 12-24 months..what does that mean?
Jane: GO News site available off geneontology web page
Chris-AmiGO status-IEAs included, but not GOA files...species excluded consequently, but IEAs for MODs included.
Judy-what are the questions that wet bench biologists want to ask of GO...and are we making that possible for them?
Ruth-labs use Ingenuity because it has a large number of annotators, so they get more complete results. They use their own interface, GO + their own ontologies, own annotations. How many groups pay for access to Ingenuity? Is that money well spent?
David-examine Ingenuity: why are users using it? What can we learn from that?
Jen/Varsha-incorrect annotation is off-putting, so is incomplete annotation
What is the issue?
--interface (a la Ingenuity; Ingenuity provides a distinct service from GO)
--Do we need to distinguish GO from Ingenuity for users, and explain when to use which?
--lack of completeness of annotation or ontology?
--incorrect annotation or ontology structure
--user perception or understanding of the GO (dopamine is not always a neurotransmitter, for example)
--why are biologists not using GO? If not using GO, what alternative?
--Users using GO tools may not keep their ontologies up to date, which makes GO seem less useful.
-Mike-we need to be creative about what GO provides.
-Judy-We might want to lower the visibility of GO Tools and increase the visibility of using GO from the GO site, and point MODs to the GO site.
-Paul-if we are too far removed from end users it is hard to value our contribution and hard to get feedback from users directly
Tanya-target surveys to MOD users list, but also meeting registrants who may or may not be MOD users. Ask if they use GO? Provide a web page for survey so all users can fill out the same form no matter which meeting they are at?
ACTION ITEM: Build survey for grant update (Val was nominated with Jane and Jen), determine where to send it. (Mike has subscription to SurveyMonkey)
-Midori-EBI targets biologists...ask them how they do it!
-send reminder emails to survey recipients to try to reduce self-selection of survey responders.
-ACTION ITEM: provide two survey URLs and compare responses from the GO help list with those from a random list of biologists
Pascale: progress report
-David-do we have a gold standard set of genes we use to test annotation metrics? Not currently..but we should find a set for this purpose. Will standardize testing of annotation progress.
Brenley:
-GONUTS should use UniProt IDs
-Using UniProt protein names may not be that useful
-Judy: there is a proposal for Human/Mouse/Rat isoform nomenclature
-Chris: annotate to gene, col. 17 will contain spec. isoform reference
Suzi: PAINT demo
-need Java 1.5 to support drag&drop.
-PAINT uses the GOlite DB, updated ~weekly
-Update GAF files no less frequently than once per month
-transferring annotations can only happen when manual annotation is complete by all groups for a given gene
-Paul-ACTION-locate code for making font size changes in PAINT
-name column in table view should perhaps say “symbol”?
-UniProt IDs should be indicated as “Reviewed” or “Unreviewed” (UniProt vs. TrEMBL)
-Judy-Make lines in table for genes that have manual annotation more prominent
-David-would like to see paralogs-ie filter/highlight by species
-David-We should be sure to have a common mechanism for feeding GAF files back to MODs; it could be used for PAINT annotations, GONUTS, and inferred annotations from function/process cross references.
-Positive annotations should not be propagated to clades that have NOT-qualified annotations to related terms.
-As PAINT annotation proceeds, propagations that can be automated will be done within PAINT to streamline the PAINT curation process
-GAF files from PAINT are available from wiki and Ref Genome page
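The rule noted above, that positive annotations should not be propagated into clades holding NOT annotations to the same or related terms, could be sketched roughly as follows. This is only an illustration and not PAINT's code: the function names, the flat clade-to-genes mapping, the gene names and the GO identifiers are all hypothetical placeholders, and which terms count as "related" is assumed to be supplied by the curator.

```python
def blocked_clades(clades, not_annotations, term, related_terms=()):
    """Return names of clades containing a gene with a NOT annotation
    to `term` or to any of `related_terms`.

    clades: dict mapping clade name -> set of gene ids (hypothetical layout)
    not_annotations: set of (gene id, GO id) pairs carrying the NOT qualifier
    """
    watch = {term} | set(related_terms)
    return {
        name
        for name, genes in clades.items()
        if any((gene, go_id) in not_annotations for gene in genes for go_id in watch)
    }


def propagate(term, target_genes, clades, not_annotations, related_terms=()):
    """Return the sorted target genes that may receive `term`,
    skipping every gene that sits in a blocked clade."""
    skip = set()
    for name in blocked_clades(clades, not_annotations, term, related_terms):
        skip |= clades[name]
    return sorted(gene for gene in target_genes if gene not in skip)


# Placeholder data: two clades, one NOT annotation to a related term.
clades = {
    "vertebrates": {"mouse_geneA", "human_geneA"},
    "fungi": {"yeast_geneA", "pombe_geneA"},
}
not_annotations = {("pombe_geneA", "GO:related_Y")}
targets = set().union(*clades.values())

allowed = propagate("GO:term_X", targets, clades, not_annotations,
                    related_terms=["GO:related_Y"])
```

In this toy case the whole fungal clade is skipped because one of its members has a NOT annotation to a related term. How finely the clades are drawn is a curation decision that this sketch does not address.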
Paul: propagation of annotations depends upon our ability to interpret annotations and the context in which each annotation was made. Some of those details lie in col. 16, like process X occurs in cell CL:#### at stage Y.
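As an illustration of how such col. 16 context might be read by a program, here is a minimal sketch. It assumes the `relation(ID)` syntax with commas joining statements that hold together and pipes separating alternatives, which is how later GAF versions write annotation extensions; the relation names and identifiers in the example are placeholders, not values from the discussion.

```python
import re

# One extension item looks like relation(TARGET), e.g. occurs_in(CL:0000000)
_ITEM = re.compile(r"^\s*([A-Za-z_]+)\(([^()]+)\)\s*$")


def parse_extension(col16):
    """Parse a col. 16 string into a list of alternatives.

    Each alternative is a list of (relation, target) pairs that apply together.
    An empty column yields an empty list.
    """
    alternatives = []
    for alternative in col16.split("|"):
        if not alternative.strip():
            continue
        pairs = []
        for item in alternative.split(","):
            match = _ITEM.match(item)
            if match is None:
                raise ValueError(f"unrecognised extension item: {item!r}")
            pairs.append((match.group(1), match.group(2)))
        alternatives.append(pairs)
    return alternatives


# Placeholder example: a process occurring in a cell type during a stage.
context = parse_extension("occurs_in(CL:0000000),happens_during(STAGE:Y)")
```

Having the context as (relation, target) pairs would let a propagation step check, for instance, whether the target species even has the cell type in question before transferring the annotation.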
Suzi-if PAINT reveals problems with protein/gene records, like a likely gene merge being needed, how can that be fed back to sources to get the problem fixed?
-Consider infrastructure to make more conjunctive statements-function X occurs in component A of cell Y for example.
-Consider choosing target genes based on functional systems
-Rex/Judy: all time taken away from literature curation takes away some experimental annotations needed to support inferential annotations.
Paul: The PAINT inference curation process is not highly scalable. Re-reading of papers is sometimes needed to facilitate proper tree curation...can that be mitigated in some way?
Judy: Efficiency of tree curation is important, propagated annotations will be examined by MOD curators which will then address some of the complexity that Paul was bringing up.
Rex: bottom line...we (all of us) don’t have enough funding to curate all the necessary literature.
Judy: We already set the bar lower from “complete” annotation to “comprehensive” to help address the volume issues.
Judy: we have a lot of named genes that users expect to be annotated. Proper experiments to demonstrate the functions won’t ever be done in mouse, for example, if they have been done in cerevisiae. These annotations need to be transferred; users expect them to be there...basic functionality like spliceosome.
Pascale’s proposed way to stop using the Google spreadsheets for Ref Gen annotation tracking was generally accepted as worth further exploration. Compare relative dates of ISS annotations propagated from PAINT with most recent date for experimental annotation from GAFs.
Rex: how will status of ‘comprehensive’ annotation be captured with Pascale’s proposal?
Pascale: May not need concept of ‘comprehensive’..why not just go with ‘here is what could be propagated as of this date’....
Pascale: in PAINT, experimental annotations should be bolded from ANY source..not just Ref. genomes.