2016 Los Angeles GOC Meeting Agenda

Latest revision as of 22:08, 6 November 2016

Minutes

Day 1

9am Welcome, schedule and logistics (Paul)

https://docs.google.com/document/d/1MAnnOfs-e2LY9MnqdCZscalbxbNUDSJ9pbMZ-f2WS9U/edit?usp=sharing

https://drive.google.com/drive/folders/0B8kRPmmvPJU3dFhhcWhTSmlUcDA

Link to folder on Google Drive: bit.ly/geneont-drive

Overview/Plan for Upcoming Five Years

  • GO PIs presentation (Paul T)

Coffee break 10am-ish

Informatics and Infrastructure 5 year plan (Chris, LBNL)

Update on various changes

Proposal

  • Switch to global monthly releases, but still provide daily snapshots

Lunch 11:45 - 1:00pm (on our own, must vacate room for seminar happening then)

Updates

AmiGO and web site action items (Seth)

  • AI: Mechanism to remove redundant TAS/ISS/IEA etc. annotations that are covered by experimental annotations
    • See https://github.com/geneontology/amigo/issues/294
  • AI: GOC to extract this data and display annual stats on web page
    • See http://amigo.geneontology.org/amigo/base_statistics
    • AI: Include stats for % genome annotated
      • https://github.com/geneontology/amigo/issues/348
    • AI: Include GOC extract ontology stats
      • https://github.com/geneontology/go-site/issues/179
    • IEA annotations should be broken out by reference (this is the method used), and converted to a provider, e.g. GO_REF:0000019 is Ensembl Compara.
      • TODO: Define in tracker (Done) https://github.com/geneontology/amigo/issues/399
  • term matrix (Val)

Update on UniProt GCRP sets (Maria)

Update on gpi specifications and uses (Kimberly, Chris, 10-15 min)

  • items?

Ontology Group Update (DavidH)

  • Special Projects
    • Cilia
    • Autophagy
    • Apoptosis
    • Plant Enzymes
    • Synapse (DOS/PDT)
  • GO help report

Alliance of Genome Resources Update (Judy, Paul S)

Coffee break 3pm

Breakout sessions to finalize development of various proposals

The following are concurrent

  • Flowchart guidelines for transcription factor annotations [10 min Rachael/Ruth/Barbara] (presentation). To improve consistency, the UCL team has created an annotation flowchart, which is being circulated to GOC members.
  • PI + UniProt discussion
  • Community annotation web presence (Val, Ruth)

Day 2

Start time 9am

Hear report of conclusions and actions from remaining breakout groups

  1. Suggestions to prioritize Ontology refactoring 3:00 PM to 3:45
  2. Combining ontology design patterns and LEGO templates
  3. Seth's breakout session (that was lovely)
  4. Flowchart guidelines for transcription factor annotations

Annotation Issues - Conventional Annotations

Modified Protein Binding

  • Modified protein binding: GO terms and annotations are very inconsistent. (DavidH to present Paola's proposal)
    • Recent GitHub issues:
      • glycoprotein binding: https://github.com/geneontology/go-ontology/issues/12580#issuecomment-240782020
      • ubiquitinated protein binding: https://github.com/geneontology/go-ontology/issues/12582#issuecomment-240452320

Protein Family Terms in the Ontology

  • Protein families in terms (DavidH)
    • Currently the inclusion of protein family information in term names is leading to inconsistent annotation.
      • For now, the ontology editors have not been adding terms that specifically refer to protein families with the exception of signaling pathways. Should we make this a rule? If so, how will we capture the detail desired by annotators and how will we make this backward compatible?
https://github.com/geneontology/go-ontology/issues/12440

Multiple Evidences to Support an Inference

  • How are people capturing data where both a direct assay AND protein motif/domain/sequence information need to be used by the curator to make the annotation? [15 min Ruth, started by Rebecca] (presentation) A system needs to be in place to enable the more specific annotations to be created for orthologous proteins (which cannot be done across all species with the IC evidence code).
    • e.g. a transmembrane domain is used as evidence to create the annotation 'integral to membrane' with IEA evidence; immunofluorescence localises the protein to 'plasma membrane' (annotated with IDA evidence); the ideal annotation to create would be 'integral to plasma membrane'.
    • 3 obvious options (any others?)
      • A new evidence code IDD, 'inferred by direct assay AND protein domain (sequence/motif?)' (would probably also want IMD, IGD, IED)
        • Note that ECO has a combinatorial evidence code that could possibly be used as the parent for new GO combinatorial codes:
          • combinatorial evidence used in manual assertion - ECO:0000244
      • No new evidence code is required, as this is implied by the 'inferred' aspect of the evidence code as well as 'author intent'.
      • Create a GOC pipeline that combines the IDA-based CC annotation (e.g. plasma membrane) with the IEA information (e.g. integral to membrane) to create the more specific annotation (e.g. integral to plasma membrane).
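
To make the third option concrete for discussion, here is a minimal sketch of what such a pipeline step might look like. It is an illustration only, not an existing GOC tool: the function name combine_cc_annotations, the COMBINATIONS lookup table and the tuple layout of the input are all assumptions, and a real pipeline would work from term IDs and the ontology structure rather than a hand-written table of term names.

```python
from collections import defaultdict

# Hypothetical lookup table (an assumption for this sketch):
# (experimentally observed location, predicted property) -> more specific term.
COMBINATIONS = {
    ("plasma membrane", "integral to membrane"): "integral to plasma membrane",
}

EXPERIMENTAL_CODES = {"IDA"}  # evidence codes treated as direct observation
PREDICTED_CODES = {"IEA"}     # evidence codes treated as sequence-based prediction


def combine_cc_annotations(annotations):
    """Derive more specific CC annotations from pairs of existing ones.

    annotations: iterable of (gene_product, term_name, evidence_code) tuples.
    Returns a list of (gene_product, derived_term, (observed_term, predicted_term)).
    """
    observed = defaultdict(set)
    predicted = defaultdict(set)
    for gene, term, code in annotations:
        if code in EXPERIMENTAL_CODES:
            observed[gene].add(term)
        elif code in PREDICTED_CODES:
            predicted[gene].add(term)

    derived = []
    for gene in sorted(observed):
        for obs_term in sorted(observed[gene]):
            for pred_term in sorted(predicted.get(gene, ())):
                specific = COMBINATIONS.get((obs_term, pred_term))
                if specific is not None:
                    # Keep the supporting pair so provenance is not lost.
                    derived.append((gene, specific, (obs_term, pred_term)))
    return derived
```

The sketch deliberately leaves open which evidence code the derived annotation should carry, since that is the question the three options above are meant to settle.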


Consistent Classification of Signaling Pathway Terms

  • Conventions for signalling pathway terms
    • Currently you can request signalling pathway terms along multiple axes of classification, including:
      • signalling module (MAPK cascade, GTPase etc.)
      • process regulated
      • target TFs
      • ligand/pheromone activating pathway
      • condition activating pathway (in response to hydrogen peroxide and other oxidants for oxidative stress pathway)

This results in an almost infinite number of ways to describe some pathways.

https://github.com/geneontology/go-ontology/issues/12701

Generating conventional annotations from Noctua models

  • GAF/GPAD inference from LEGO models (Jim and David OS, Chris, Kimberly, David Hill, ...)
    • Introduction to inferring annotations from LEGO: Extended Gene Product to GO term relations; Reasoning across causal chains.
      • Jim Balhoff: Inference using Blazegraph & RDFox
      • DOS: Templates, design patterns and inference.
  • MGI's experience roundtripping with Noctua Models (DavidH)
  • Are we going to allow Noctua to generate conventional annotations to the root nodes of the ontology?
    • This would be useful for contextual annotations that would otherwise be made to root nodes.
    • However, some groups block these kinds of annotations because, in the past, these annotations were used to keep track of genes about which we had no information.
    • Note that the evidence code for a root node annotation in Noctua would/could be different in that the curator might assert that a gene product has some molecular function due to the observation that, when mutated, there is a phenotypic outcome, e.g. apoptosis execution fails.
    • This is a different statement from no biological data (ND) in which there is no information at all to assert a role in any biological process.
  • Are some conventional annotation rules inappropriate for Noctua annotation?
    • For a molecular function occurring in a cellular location, isn't IEP a more appropriate evidence code? IDA would mean that the function was assayed in situ. https://github.com/geneontology/go-annotation/issues/1395
    • Since binding is a part of many molecular functions, should we allow evidence codes other than IPI for binding (e.g. TAS)?

Annotations from High Throughput Experiments

  • Annotations from high-throughput experiments (Pascale, Ruth, David Hill, Kimberly)
  • AI: discuss proposal for defining HTP data
    • How do we decide when to make annotations from high-throughput experiments?
    • If we decide that annotations from high-throughput experiments should be removed, what are the procedures (all annotations, some annotations)?
    • Do we want new evidence codes to indicate that the annotation was inferred from a high-throughput experiment?

Qualifiers that describe relationship of gene product (activity) to a biological process

DOS; Pascale.

As a result of work on LEGO in conversion to GPAD/GAF, we now have a wider set of relations linking gene products to biological processes (GAF qualifiers/GPAD column 2). These are derived from the broader set of causal relations developed for LEGO. We need to discuss how these should be applied to conventional annotations. This builds on a proposal, agreed in DC but not yet implemented, to distinguish annotations where it is clear that a gene product activity is part of a process from cases where it is not clear whether the activity is part of the process or causally upstream of it.

Day 3

Start at 8:30am (note clocks moved back at 2am)

Annotation Issues - LEGO Annotations

Regulation relations - proposal to improve definitions (DOS)

Regulation and causal relations are central to LEGO annotation and to inference based on LEGO models, but definitions and guidelines still need work to ensure consistency and clarity.

Aligning Conventional and LEGO Annotations

  • A proposal to make Conventional Annotation align better with LEGO modelling (F-P linking) (Val)
https://github.com/geneontology/go-ontology/issues/12739#issuecomment-254623691

Evidence codes in Noctua (David H)

  • How are we going to handle ECO codes in Noctua? Currently there are only a limited number of codes that fall under 'used in manual assertion'. If we use codes that are not specific to the manual assertion part of the ontology, then they map to EXP. Are we going to request the entire set of codes that we think we might want to use, or are we going to have an automated way to map to the correct code?

Example: http://noctua.berkeleybop.org/editor/graph/gomodel:5745387b00001874
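
As a starting point for the 'automated way to map' option, the sketch below walks up from an ECO class to the nearest ancestor that has a GO evidence code. It is a toy under stated assumptions: the four ECO-to-GO pairs follow the commonly published mapping but should be checked against the current ECO mapping file, the TOY: identifiers and the PARENTS hierarchy are invented for illustration, and the function name go_code_for is hypothetical. It does not describe how Noctua currently behaves.

```python
# GO evidence codes for a few 'used in manual assertion' ECO classes.
# Verify against the current ECO-to-GO mapping before relying on these.
ECO_TO_GO = {
    "ECO:0000269": "EXP",  # experimental evidence used in manual assertion
    "ECO:0000314": "IDA",  # direct assay evidence used in manual assertion
    "ECO:0000353": "IPI",  # physical interaction evidence used in manual assertion
    "ECO:0000315": "IMP",  # mutant phenotype evidence used in manual assertion
}

# Toy is_a links (child -> parent). The TOY: classes are invented for this
# sketch; a real implementation would read the links from the ECO ontology.
PARENTS = {
    "TOY:specific_assay": "ECO:0000314",
    "TOY:other_experiment": "ECO:0000269",
}


def go_code_for(eco_id):
    """Return the GO code of the nearest mapped ancestor, or None if there is none."""
    seen = set()
    current = eco_id
    while current is not None and current not in seen:
        if current in ECO_TO_GO:
            return ECO_TO_GO[current]
        seen.add(current)  # guard against accidental cycles in the toy data
        current = PARENTS.get(current)
    return None
```

With this approach a code only falls back to EXP when no more specific mapped ancestor exists, and a code with no mapped ancestor at all returns None so that it can be flagged for a term request instead.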

Conference Calls and Communication

Estimated time: 30 minutes

  • Discuss different options for reducing the number of conference calls, while still facilitating effective communication between the different GO groups, e.g. annotators, ontology editors, software team
    • Consolidate all annotation calls (Monday LEGO, Tuesday Annotation, Tuesday PAINT) into one Tuesday annotation call, frequency TBD
    • Consolidate LEGO, Annotation, PAINT, and Ontology Development calls into one weekly GO call
  • Discussion on the design of new SOPs for mechanisms of communication
    • What is the best mechanism to alert annotation groups of changes to the ontology that will affect annotations? We have started a table of contacts, but is this how annotation groups would like to proceed?
    • Review of github repositories, what to record where, who is processing/clearing tickets, etc.
  • Discussion on what it means to be a member of the Gene Ontology Consortium, not just the NHGRI grant.
    • Agreed to standards, which ones?
  • Decide where and when to hold next GOC meetings
    • Spring 2017
    • Fall 2017 (possibly with SAB meeting)

Wrap up, action items

  1. Final Summary of Initiatives we will be working on to prepare for next meeting
    • Paul T

Time permitting, or post-meeting breakout

Annotation Metrics

Estimated time: 1 hour

  • What are the optimal metrics to assess progress in GO annotation?
    • Number of annotations
    • Number of references
      • Recall ZFIN's 'paper complexity' measure as a way of normalizing for different paper content (Doug mentioned in Geneva)
    • Revised annotations, e.g. updating to a new term
    • Removing annotations, e.g. improving knowledge about how a gene product affects a downstream process
    • Adding appropriate contextual information to existing annotations
    • Percentage of genome annotated vs percentage of genome with annotatable information?
  • How does LEGO modeling change our assessment of a curator's contributions?
  • Multiple funding bodies (Ruth)
  • Distinguishing annotations that are created automatically, e.g. inference pipelines (Tony)
  • Recognizing individual curators' contributions via ORCID iDs.
    • Determine at what level attribution occurs
    • Inclusion of funding source
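
For the simpler count-based metrics listed above (number of annotations, number of references, percentage of genome annotated), a rough sketch of the arithmetic is given below. The function name annotation_metrics, the tuple layout and the genome_size parameter are assumptions made for illustration; the sketch does not address the harder questions of revised or removed annotations, paper complexity, or attribution.

```python
def annotation_metrics(annotations, genome_size):
    """Compute simple count-based metrics.

    annotations: iterable of (gene_product, go_term, reference) tuples.
    genome_size: number of gene products considered annotatable (an assumption
                 the caller has to supply).
    """
    rows = list(annotations)
    genes = {gene for gene, _, _ in rows}
    references = {ref for _, _, ref in rows}
    percent = 100.0 * len(genes) / genome_size if genome_size else 0.0
    return {
        "annotations": len(rows),
        "references": len(references),
        "percent_genome_annotated": percent,
    }
```

The choice of denominator is the open question: using the whole genome and using only genes with annotatable information give different percentages, which is the distinction raised in the list above.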