Oregon GO Consortium Meeting

Revision as of 14:48, 31 March 2009

In the Fir Room, Erb Memorial Union.

Return to Consortium_Meetings

Dates

March 30, 31

Venue

University of Oregon, Eugene, Oregon

Erb Memorial Union (EMU), Fir Room.

Map of campus, http://map.uoregon.edu/

Map of the EMU, http://emu.uoregon.edu/map_popup.php?x=-50&y=-50&level=main&arrow=SE

Logistics

2009 Oregon Meeting Logistics

Please add yourself to the list of registrants on the Meeting Logistics Page

Agenda

ALERT: GO Top needs to sign off on agenda prior to these meetings. If there is no action item, or discussion point proposed in advance, the meeting is not the forum for first initiating a discussion.

Action Items

Minutes from the previous meeting (Montreal)


MONDAY March 30

8:30-9:00
(4:30- 5:00pm BST)

Coffee, Introductions

9:00-10:30
(5:00- 6:30pm BST)

OBO-Edit2.0 Tutorial and Demo (Chris, Amina, OBO-Edit Working Group)

  • Working with Cross-products in OBO-Edit (Chris)
    • Finding xp terms, searching, recognizing, constructing and manipulating xps
    • Using the xp panel in the Text Editor
    • Demo using the "internal" xps to be added to the live GO
  • OBO-Edit Working Group Meeting

10:30-10:45
(6:30- 6:45pm BST)

Coffee Break

10:45-12:00
(6:45- 8:00pm BST)

Working sessions: 1) David Hill, Tanya and Amina 2) OBO-Edit Working Group Members

12:00-1:30
(8pm-9.30pm BST)

Lunch - Take a walk...Get food in the EMU or local restaurants


1:30-1:45
(9.30pm-9.45pm BST)

Reference Genomes Update

Brief report of Ref Genome Progress (Pascale)

Watching remote people: Tanya Berardini


1:45-2:00
(9.45pm-10pm BST)

GO Clinics Update (Jane -remote-)

slides

Watching remote people: Tanya Berardini

NOTES:

  1. Clinics will be booked when there are interested parties to deal with SF items
  2. Only SF with parties interested in resolving the SF items go to clinics.
  3. Determine how many SF items have no comments.
  4. Old items that are no longer of interest can be closed.

ACTION ITEMS:

None


2:00-2:30
(10pm-10.30pm BST)

Metrics Metrics (Judy)

Watching remote people: Tanya Berardini

  • Action Item: What is appropriate metric for comprehensive annotation?

We need to be able to define project points for funding purposes. We need to document progress and justify priorities.

We need to keep appropriate metrics of our progress and to define how we set priorities. Some possible measures across the board that we could include in a monthly table are:

  1. Number of Protein Coding Genes
  2. Number of Functional Genes
  3. Total number of Genes
  4. Number of Genes with Comprehensive Annotations
  5. Number of Genes with only IEA in all domains
  6. Number of Genes with Experimental Data for BP
  7. Number of Genes with Experimental Data in MF
  8. Number of Genes with Experimental Data in CC
  9. Date of last update of annotations
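As a rough illustration of how some of these counts could be derived from annotation files, the sketch below tallies genes per category from GAF-style rows. It assumes tab-separated GAF columns (2 = object ID, 7 = evidence code, 9 = aspect). The set of codes treated as experimental and the function name `annotation_metrics` are assumptions made for this example, not an agreed definition.

```python
from collections import defaultdict

# Assumption: the evidence codes counted as "experimental" for this tally.
EXPERIMENTAL = {"EXP", "IDA", "IPI", "IMP", "IGI", "IEP"}


def annotation_metrics(gaf_lines):
    """Tally per-gene annotation counts from GAF-style lines (hypothetical helper)."""
    codes_by_gene = defaultdict(set)           # gene ID -> evidence codes seen
    experimental_by_aspect = defaultdict(set)  # aspect (P/F/C) -> gene IDs
    for line in gaf_lines:
        # Skip blank lines and header/comment lines, which start with "!".
        if not line.strip() or line.startswith("!"):
            continue
        cols = line.rstrip("\n").split("\t")
        gene, evidence, aspect = cols[1], cols[6], cols[8]
        codes_by_gene[gene].add(evidence)
        if evidence in EXPERIMENTAL:
            experimental_by_aspect[aspect].add(gene)
    return {
        "genes_annotated": len(codes_by_gene),
        "genes_only_iea": sum(1 for c in codes_by_gene.values() if c == {"IEA"}),
        "genes_experimental_bp": len(experimental_by_aspect["P"]),
        "genes_experimental_mf": len(experimental_by_aspect["F"]),
        "genes_experimental_cc": len(experimental_by_aspect["C"]),
    }
```

Counts such as the total number of protein-coding genes, or what qualifies as "comprehensive" annotation, would have to come from each group's own gene catalogue and are not covered by this sketch.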


  • Annotation Priorities
  1. Annotation of Reference Genome Genes
  2. Annotation of Genes with uncurated experimental literature
  • Ontology Development
    1. Keeping track of dependencies (need new terms to do annotations, etc)
    2. Knowing where we need to devote ontology development efforts
    3. Measuring progress...when have we saturated a branch, at least for the time being


All of these are for reporting purposes.


David: from a practical point of view, we're done when no one wants new terms. The number of new terms relative to the number of annotations is one way to measure whether we are done.

Judy: Funding sources need us to define what our tasks are, how we will work on them, and how we will know when we have completed the scoped out work.

  • Progress

David: The number of annotations vs. the number of new terms requested. When new term requests go down while new annotations keep being added, ontology development is nearing completion.


NOTES:

  1. Progress reports link in left navigation column on wiki
  2. Sometimes work waits while other work is in progress...need to track work dependencies better.
  3. Pass SF item to responsible person and mark as 'pending' to track dependencies?

ACTION ITEMS:

  1. [all groups] update progress reports at meeting times and grant renewals
  2. [Judy?] template for desired reports can be found on wiki
  3. [Chris] made a plea for better use of wiki categories; perhaps a WebEx session could be used to briefly teach us about that?



2:30-2:45
(10.30pm-10:45pm BST)

Coffee Break


3:00-4:30
(11pm-12.30am BST)

'binding' issues

Watching remote people: Stan Laulederkind

  1. Binding term documentation (from Debby)
  2. using 'binding' for annotation
  3. should it be used in cross product annotations (Jim) (see Annotation_Cross_Products#binding_example)
  4. transfer of 'binding' term annotations via ISS/ISO??

NOTES:

  1. [Debbie] documentation is confusing on the proper use of binding
  2. [Val] need to be able to annotate to 'ATP binding'
  3. [Peter?] Binding of X that results in an allosteric change to the thing doing the binding is different from binding that results in the catalytic change of the bound molecule to a different product: chemically transformative binding vs. non-transformative binding. It is the latter that we should be capturing.
  4. [David] Should GO be capturing binding of enzymatic substrates? Thinks not.
  5. How will limiting 'binding' annotations to non-catalytic interactions affect queries for genes involved in 'ATP binding' for example...researchers might reasonably expect to get back kinases by such a query?
  6. [Mike] Enzymes work in both directions..if a kinase binds ATP, does it also necessarily bind ADP?

ACTION ITEMS:

  1. [Peter (lead), Ruth, Debbie, Jim] Form a workgroup to examine the issues raised in the discussion. Should GO capture catalytic binding?

Column 16 (Chris)

Watching remote people: Amina Abdulla

Can we have a progress report and a target date to work towards? Column 16 will contain cross references to other ontologies that can be used to qualify the particular annotation. Harold's draft user doc for column 16/17 usage: File:Columns 16 and 17.doc

Official documentation: Annotation_Cross_Products

NOTES:

  1. [Chris] If the underlying biology (mechanism or genes involved) is different, then new precomposed terms are preferred. Discussion ensued on how far to take that. What if a process is executed in a different location in one species vs. another but is mechanistically the same? What if the genes involved are different in one organelle vs. another in the same cell?
  2. [Judy] We should have the terms needed to describe the biology.
  3. [David] if there is a need to put 2 IDs in column 16 from the same ontology, then that ontology has a problem that needs to be addressed
  4. [Chris] Use of 'NOT' is currently banned from column 16 to avoid possible further confusion. Can be revisited once in use more.

ACTION ITEMS:

  1. [All] By the end of April send Chris GAF files with column 16 data populated. Use GAF format 2.0 in the header of GAF files with column 16 or 17 data in them. Once Chris has examples a workgroup will assemble to examine the usage and flesh out inconsistencies, misuses, etc.
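Two of the notes above lend themselves to an automated check on submitted files: 'NOT' should not appear in column 16, and two IDs from the same ontology in one column 16 value suggest a gap in that ontology. The sketch below is only an illustration. It assumes column 16 holds pipe-separated entries containing prefixed IDs such as `CL:0000084`, and the function name `check_column_16` is hypothetical.

```python
import re
from collections import Counter

# Assumption: IDs in column 16 look like PREFIX:LOCAL_ID, e.g. CL:0000084.
ID_PATTERN = re.compile(r"([A-Za-z_]+):[A-Za-z0-9_]+")


def check_column_16(value):
    """Return a list of warnings for one column 16 value (hypothetical helper)."""
    warnings = []
    # 'NOT' is currently banned from column 16.
    if re.search(r"\bNOT\b", value):
        warnings.append("NOT is not allowed in column 16")
    # Count how many IDs come from each ontology prefix.
    prefixes = Counter(m.group(1) for m in ID_PATTERN.finditer(value))
    for prefix, count in sorted(prefixes.items()):
        if count > 1:
            warnings.append(
                f"{count} IDs from {prefix}: the ontology may need a new term"
            )
    return warnings
```

A value with a single cross reference produces no warnings; the checks are advisory and would need to be revisited once the workgroup has examined real column 16 data.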

Column 17 (Chris)

Watching remote people: Stan Laulederkind

Can we arrange a date to release this? Column 17 is designed to allow GO annotations to specific isoform variants that may be encoded by a specific gene due to differential splicing or alternative translational starts. Harold's draft user doc for column 16/17 usage: File:Columns 16 and 17.doc Official documentation: GAF_Col17_GeneProducts

NOTES:

  1. Using 'gene_product' rather than 'gene' in column 12 is a radical change (GAF 2.0 format). Could Mike's GAF checking script convert 'gene_product' to 'gene' while we make the transition?

ACTION ITEMS:

  1. [developers] Define a constrained set of strings derived from SO that can be used in GAF column 12
  2. [Mike] Update GAF filtering script to change 'gene_product' to 'gene' in column 12 until final change date is set.
  3. [All] By the end of April, send test GAF data to Chris with column 17 populated. Use GAF format 2.0 in the GAF header for such a file.
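The interim column 12 conversion described in the action items could look roughly like the following. This is a minimal sketch, not Mike's actual filtering script: it assumes tab-separated GAF lines with header lines starting with `!`, and the function name `downgrade_object_type` is hypothetical.

```python
def downgrade_object_type(gaf_lines):
    """Yield GAF lines with 'gene_product' in column 12 rewritten to 'gene'."""
    for line in gaf_lines:
        # Header and comment lines (e.g. "!gaf-version: 2.0") pass through unchanged.
        if line.startswith("!"):
            yield line
            continue
        newline = "\n" if line.endswith("\n") else ""
        cols = line.rstrip("\n").split("\t")
        # Column 12 is index 11; rows that are too short are left untouched.
        if len(cols) >= 12 and cols[11] == "gene_product":
            cols[11] = "gene"
        yield "\t".join(cols) + newline
```

Columns 16 and 17 are passed through as they are, so a GAF 2.0 test file keeps its extension and isoform data after the rewrite.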

SO update

Watching remote people: Donghui Li

Progress report on Sequence Ontology (Karen E) [slides]

4:30-5:00
(12.30am-1am BST)

PAINT demo

NOTES:

  1. Nice progress since Montreal meeting

ACTION ITEMS:

  1. [Suzi, Paul] Remove use of ND from annotation of ancestral nodes

7:00PM

Dinner at Mekalas

TUESDAY March 31

8:30-10:30
(4.30pm-6.30pm BST)

Review Monday's Notes and Action Items

Ontology Content Development

Organization/Management

1.1 GO_Timeline (10 mins, Midori, Ontology development team)

  • started publishing a file called go_ext (GO-extended), which contains the MF-MF intraontology links and the BP-MF interontology links
  • put in place process for conversion of files
  • new pipeline:
go_write --cp--> go_ext  --filter P/F, filter F/F --> obof1.2 --> obo2obo --> obof1.0

In other words, the new file (go_ext) will be filtered to remove the Function/Process (and Function/Function) links, and a file in OBO format 1.0 will be generated.
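The filtering step of this pipeline can be sketched as follows. This is an illustration only, not the production conversion: it assumes every term stanza carries its own `namespace:` tag, treats only `relationship:` lines as the links to be filtered, and does not perform the obo2obo format conversion. The function name `filter_go_ext` is hypothetical.

```python
# Namespace pairs whose relationship links are removed (F/F and P/F).
DROPPED_PAIRS = {
    frozenset({"molecular_function"}),
    frozenset({"molecular_function", "biological_process"}),
}


def filter_go_ext(obo_text):
    """Drop function-function and function-process relationship lines."""
    lines = obo_text.splitlines()

    # Pass 1: map each term ID to its namespace.
    namespace = {}
    current = None
    for line in lines:
        if line.startswith("["):
            current = None
        elif line.startswith("id: "):
            current = line[4:].strip()
        elif line.startswith("namespace: ") and current:
            namespace[current] = line[len("namespace: "):].strip()

    # Pass 2: keep every line except the filtered relationship lines.
    kept = []
    current = None
    for line in lines:
        if line.startswith("["):
            current = None
        elif line.startswith("id: "):
            current = line[4:].strip()
        elif line.startswith("relationship: ") and current:
            parts = line.split()
            target = parts[2] if len(parts) > 2 else None
            pair = frozenset((namespace.get(current), namespace.get(target)))
            if pair in DROPPED_PAIRS:
                continue
        kept.append(line)
    return "\n".join(kept)
```

All is_a lines and process-process relationships are left in place, so tools that only understand the older file layout still see a consistent graph.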

  • Comments:
    • Jane: thought that we'd switch to the new file. Won't people ignore it?
    • Chris: we can't force people to all change their algorithms
    • David: hopefully people will come to us and use the data we provide, and the tools we provide in order for them to use those correctly.

1.2 Release of inter-ontology 'regulates' links

1.3 SF status update (5 mins, Midori) see progress report

Work in progress

Watching remote people: Susan Tweedie

  1. Ongoing quality control using reasoner-generated reports. (10 mins, Chris, David, Tanya) slides
  • We're continuing to use the ontology QC procedures we reported in October. These include:
    • Fixed tissue, organ, to match the definitions of the anatomical dictionaries
  • Consistency checks for new regulation terms and regulates relationships;
    • OBO-Edit 2.0 allows missing links to be fixed automatically (or errors in the ontology to be fixed)
    • David and Tanya looked at what types of errors occurred: they examined 52 errors and found that they fit into 6 categories, such as forgetting to add parents or relationships; in 5 cases the ontology was wrong and was corrected.
    • Note that this error rate is pretty low.
  • A report on terms with multiple part_of parents;
  • Cross-product reports that can be used for QC; also see Progress_2008.

Progress is summarized on the Ontology_QC_Metrics page.
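One of the reports listed above, terms with multiple part_of parents, can be approximated with a short script over an OBO file. The sketch below is not the reasoner-based report itself; it only reads asserted `relationship: part_of` lines, and the function name `multiple_part_of_parents` is hypothetical.

```python
from collections import defaultdict


def multiple_part_of_parents(obo_text):
    """Return {term ID: sorted part_of parents} for terms with more than one."""
    parents = defaultdict(set)
    current = None
    for line in obo_text.splitlines():
        if line.startswith("["):
            # A new stanza starts; forget the previous term ID.
            current = None
        elif line.startswith("id: "):
            current = line[4:].strip()
        elif line.startswith("relationship: part_of ") and current:
            # Third whitespace-separated field is the parent ID.
            parents[current].add(line.split()[2])
    return {term: sorted(p) for term, p in parents.items() if len(p) > 1}
```

Having several part_of parents is not necessarily an error, so the output is best read as a list of terms for a curator to review.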

  1. Lung development (15 mins, David) slides
  • A content meeting on lung development was held in December 2007.
    • Met with experts in Boston, who did not want anatomical terms in the lung development part of the ontology. So David made two files, one with and one without anatomical structures.
    • Until last week, lung development only had one child, and Dimitry (MGI) has been requesting new terms; interestingly those include anatomical terms
    • This also leads to the expansion of the 'morphogenesis of a branching structure' branch of the GO
    • Chris asks whether the lung development is being analyzed from the mouse/mammalian point of view: David says, for lung development, yes, but for branching morphogenesis he is trying to capture different organs and species
    • Alex: makes the point that this is analogous to the mitochondrial/cytoplasmic translation issue: all those branching types of development are different at some level, so we might need to add many more terms to describe this?
    • David: we've asked this question since the beginning of GO
    • Judy: we need to provide the information that researchers need
  1. Biological Signaling (5 mins, Jen -remote-) slides

Watching remote people: Michelle Gwinn-Giglio

  1. Organization and Biogenesis (15 mins, David, Jane -remote-, Midori -remote-, Tanya) slides
  2. Internal cross-product implementation (10 mins, Chris, David, Jane-remote-, Jen-remote-, Midori-remote-, Tanya)
  • David & Tanya are doing regulation xps
  • Jen is doing BP x BP xps
  • Jane is doing regulation of multi-organism process
  • Jane is doing CC x CC xps
  1. Proposal to add process-specific function terms so that part_of links can be made between function and process (30 mins, Chris, David, Tanya) slides

presenter in bold

  • Rationale: each MF is a part of at least some BP (a MF has_part in a BP)
  • Problem is that for a given MF, you cannot be sure which of several BP it is involved in
  • An option is to use column 16
  • Proposal from Chris, Tanya and David: use part of: start with the easy things:
  • kinase activity -> phosphorylation; transporters -> transport
  • a more complicated example is 'argininosuccinate synthase activity', which is involved in the urea cycle and in polyamine biosynthesis
    • Pascale: what if the paper doesn't show the process as well? Tanya: you use the parent
    • Michelle: thought we had discussed this and decided against it, first because of term explosion, but also because of the huge amount of variation in metabolic pathways
    • David: no, that was using has_part, which means that it always has the part. Part_of does not require that
    • (back to Tanya)
    • Example 1: Stan (RGD) requested 'regulation of Neu/ErbB-2 receptor activity' on SF: term = 'coregulator activity involved in epidermal growth factor receptor signaling pathway' (and its regulation child); the term requested by Stan would be a synonym
    • Example 2: caspase activity (which was obsoleted) could be called 'cysteine-type endopeptidase activity involved in apoptosis'
    • Example 3: transcription factor activity: could be 'DNA binding involved in transcriptional regulation'
    • Example 4: chaperone activity: could be 'protein binding involved in protein folding'
  • Advantages of this approach:
    • 'protein binding involved in protein folding' is_a 'protein binding' (F) and part_of 'protein folding' (P): this makes it easier to count annotations and get inferences, and makes it easier for annotators to capture all information
    • Having those more granular functions (involving processes) allows one to test what potential processes a function might be involved in
  • Comments/questions:
    • Rama: what would the user see?
    • Does this mean you don't make the process (or component) annotations specifically?
      • David: no, you don't.
    • Jen: can you still mine those automatically with the other pathway resources?
    • David, Chris: there is a problem with visualization, but that has been the case for a while
    • Michelle: are those terms F or P? Answer: F, because they have an is_a parent in F, not in process
    • Michelle: why would we do column 16 annotations (to two GO terms)? David: to save time; if those are used a lot we can create a new term
    • Emily: similar point about column 16; also wondering about the evidence that could actually provide enough confidence for the annotation to one of those terms
    • Ruth: this will not get rid of the need to IC
    • Alex: if it's not reported in the paper, you don't annotate it
    • Ranjana: what happens to data miners?
    • Chris: if people do not use the graph, then it's too bad (Judy, David, others agree)
    • Paul: you have to wonder if this is a hack. Why don't we use external resources that already have this information (i.e., the different pathway tools)?
    • Chris: not quite sure how Paul's proposal would be implemented in practice; also, this approach makes it simpler to integrate pathways
    • David: if GO wanted to represent pathways, then he would agree it's a hack; but we don't. We're trying to annotate what gene products do.
    • Peter: we need to treat pathways databases and other external resources as we treat publications (ie, a reference for the link)
    • Tanya: To answer Rama and Ranjana: people need to use information from the graph anyway
    • Michelle: The 3 ontologies have been separated for the past 10 years; we need to think about what users know and expect. This is now very different conceptually
    • Judy: WRT Process parentage in MF graph: it's the part_of that goes across graphs. We have always understood that MF and BP were part of the same. We need to provide ways for users to get to all this information, and this is a step in this direction. This is just how it evolves.
    • Chris: agrees with Michelle, we need to educate people
    • Kimberly: likes the links; but one concern, from the point of view of the annotators, is that what seem like errors of omission can be due to confusion about when you have demonstrated function and when process. We need to come up with some guidelines on how to use those new terms
    • Jen: There are users (biologists) who feel that it's weird that we don't have the links.
    • Rama: the issue is not to make the annotation
    • Susan: agrees that users should use the graphs; all our web pages show the 3 aspects. Karen C: agrees; also looks at that information to check that annotations are complete
    • Chris: the Software group can write a tool that will filter out the redundant annotations and add back those Process annotations if required
    • Jen: to address the point of users being ready: we should discuss plans to educate, etc
    • David: WRT annotation: we shouldn't be slowing progress by thinking about what evidence code to use
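The inference benefit mentioned in the proposal, and Chris's suggestion that a tool could add back Process annotations, can be illustrated with a small sketch. It assumes a mapping from function terms to their part_of process parents; the example links use term names from the discussion above, while the gene names and the function name `infer_process_annotations` are hypothetical.

```python
def infer_process_annotations(annotations, part_of):
    """Infer (gene, process) pairs by following part_of links from function terms.

    annotations: iterable of (gene, term) pairs.
    part_of: dict mapping a term to a list of its part_of parents.
    """
    inferred = set()
    for gene, term in annotations:
        stack, seen = [term], {term}
        while stack:
            for parent in part_of.get(stack.pop(), ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
                    inferred.add((gene, parent))
    return inferred


# Illustrative links only; the real ones would come from the ontology file.
EXAMPLE_PART_OF = {
    "kinase activity": ["phosphorylation"],
    "cysteine-type endopeptidase activity involved in apoptosis": ["apoptosis"],
}
```

A curator annotating a gene product to 'kinase activity' would then not need a separate 'phosphorylation' annotation for the gene to be retrieved by a process query that uses the graph.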


10:30-11:00
(6.30pm-7pm BST) Break


11:00-12:00
(7pm-8pm BST)

Ontology development discussion topics

  1. Addition of localization specific process terms? (This was discussed on the GO list after it was raised in the last reference genome jamboree but hasn't been resolved.) (David will introduce topic) Watching remote people: Stan Laulederkind
  2. Chaperones: revisited many years later (Midori will introduce topic, Rama will moderate discussion) - see introductory slides; Chaperones wiki page; SF 2560932 Watching remote people: Lakshmi Pillai
  3. 'Response to stimulus' - does the definition need to be revised? Clarify when these terms should and should not be used for annotation; see introductory slides (also relevant to Reference Genomes) and SF 2094943 Watching remote people: Amina Abdulla
  4. The possibility of linking terms in the OBO file to discussions in the GO wiki. This could be done to ensure increased awareness of ontology/annotation discussion topics for all curators (and interested external users). This item follows on from the 'document communication' discussion at the last GO Consortium meeting [1] (Emily will introduce topic) slides

Watching remote people: Lakshmi Pillai

12:00-12:15
(8pm-8.15pm BST)

Next GOC meeting - Fall 2009

GOC September 23-24
SAB September 25
Both at Jesus College, Cambridge, UK


12:15-1:15
(8.15pm-9.15pm BST)

Group Photo???

Lunch - Catered at meeting room


1:00-2:30
(9pm-10.30pm BST)

Watching remote people: Peter D'Eustachio

OBO-Edit

  1. OBO-Edit2 Updates (Amina)
  2. SourceForge Bugs
  3. SourceForge Feature Requests

Web presence, Outreach and User Advocacy

Outreach and Advocacy

  1. Short report from Emily Dimmer of GOA on the progress in introducing Swiss Institute of Bioinformatics curators to GO annotation. (5 min)
  2. Report from Jane on Help Desk stats, and other user advocacy progress since the last consortium meeting including demo of new news feed by Seth. (20 min) slides

Resource Usage statistics (Mike)

(15 min)

AmiGO

Amigo update and report (25 min)

GONUTS

~20 minute overview of GONUTS (Gene Ontology Normal Usage Tracking System). Some things have changed in the last year, and many improvements have been made to facilitate using GO.

slides (PDF)


2:30-3:00
(10.30pm-11pm BST)

Break

General Annotation Issues

3:00-5:00
(11pm-1am BST)

Annotation and ontologies

  1. substrate binding...annotation or new relationship in ontology?-Pascale spoke with Jane, ontology dev. group was looking into it?
  2. protein binding...not transferred with ISS
  3. Karen: obsolete 'ATP-dependent proteolysis': Not closed yet, why?
We were just waiting out the notification period. Now closed. -Midori (March 13)

Evidence Codes

Watching remote people: Lakshmi Pillai

  1. Should we IC from ISS annotations?
    1. documentation currently says IC from ISS ok
    2. this is not allowed in the requirements for PAINT (but we can change that if needed)

GO Database

User Support

GO Papers, Publications and Presentations

  1. review for Molecular Reproduction & Development about using GO to study development (David, Doug, Kimberly, Tanya)





