2018 Montreal GOC Meeting Agenda: Difference between revisions

No edit summary
 
(27 intermediate revisions by 4 users not shown)
Line 1: Line 1:
=Montreal meeting Topics=
=Agenda moved to https://docs.google.com/document/d/16KMJRn-Vqf4yRsXAR1KYlwYvPob1tRTQbq2cEs6IJNY/edit#heading=h.uxqb7a8gufke=


=Intro=
 
 
 
 
-----
==This below is out of date==
 
=Wednesday Morning=
Paul
Paul
Project status, main accomplishments and discussion topics for the meeting
Project status, main accomplishments and discussion topics for the meeting
Line 10: Line 17:
* https://github.com/orgs/geneontology/projects/15
* https://github.com/orgs/geneontology/projects/15
* http://wiki.geneontology.org/index.php/Release_Pipeline
* http://wiki.geneontology.org/index.php/Release_Pipeline
* include PAINT GAF file generation QC (Dustin) https://github.com/orgs/geneontology/projects/23
* Report monthly PAINT updates and GAF file generation QC (Huaiyu/Dustin) https://github.com/orgs/geneontology/projects/23
* Plan for migrating to GPAD (Kimberly)
* Plan for migrating to GPAD (Kimberly)
** Reports from each group
** Reports from each group
Line 20: Line 27:


* Fate of external2go mappings (Chris): https://github.com/geneontology/go-docs/pull/27#issuecomment-426311327
* Fate of external2go mappings (Chris): https://github.com/geneontology/go-docs/pull/27#issuecomment-426311327
==Differences between old releases and new releases==
Seth ?
* Now releasing as a bundle
* Implications for various groups- including other distributors of GO data (QuickGO, NCBI)
* Improves (or just enables!) reproducibility


==GO rules update and error reports==
==GO rules update and error reports==
Line 58: Line 71:
* future plans
* future plans


==GO subsets update==
==Annotation==
Pascale:
===Gene Product to terms relations===
* Deprecated a number of unused/unmaintained subsets
https://github.com/orgs/geneontology/projects/13 (Kimberley, Dustin)
* Show subset yaml files and how they are used
* Each subset needs a maintainer
 
==PAINT tickets==
Marc & Pascale
https://github.com/geneontology/go-annotation/labels/PAINT%20annotation


==Annotation==
* Gene Product to terms relations: https://github.com/orgs/geneontology/projects/13 (Kimberley, Dustin)
Suggestions:  
Suggestions:  
* Resources for Annotation Review? Is reviewing a priority compared to creating new annotations?
* Resources for Annotation Review? Is reviewing a priority compared to creating new annotations?
Line 76: Line 81:
* Transcription reviews
* Transcription reviews
* Transcription guidelines: https://docs.google.com/presentation/d/1hOBMnpHbGyKsOJi-Ah3xhAavzOg2RgM4AXRdGbF8A6M/edit#slide=id.p1
* Transcription guidelines: https://docs.google.com/presentation/d/1hOBMnpHbGyKsOJi-Ah3xhAavzOg2RgM4AXRdGbF8A6M/edit#slide=id.p1
===PAINT tickets===
Marc & Pascale
https://github.com/geneontology/go-annotation/labels/PAINT%20annotation


===QC group report===
===QC group report===
* Add topics
* Add topics (report on progress)
 
====How to blacklist papers====
====How to blacklist papers====
Pascale  
Pascale  
* List/form template of excluded papers or combinations of PMID/Gene/GO/comment, including retracted papers
* List/form template of excluded papers or combinations of PMID/Gene/GO/comment, including retracted papers
* Centralized paper tracking system
* Centralized paper tracking system


==Ontology==
==Ontology==
Line 91: Line 99:
* ECM update
* ECM update
* Stop creating and start merging precomposed terms: x involved in y, X protein binding, protein complex x binding etc
* Stop creating and start merging precomposed terms: x involved in y, X protein binding, protein complex x binding etc
** protein complex x binding: how does that impact 'colocalizes with protein complex x'?
* Pathways2GO: https://github.com/orgs/geneontology/projects/24
* Pathways2GO: https://github.com/orgs/geneontology/projects/24
* Ontology maintenance: https://github.com/orgs/geneontology/projects/14
* Ontology maintenance: https://github.com/orgs/geneontology/projects/14
Line 102: Line 111:
* Proposal to convert all protein binding terms from IPI and a target in the 'with' field to IDA and a has_input.
* Proposal to convert all protein binding terms from IPI and a target in the 'with' field to IDA and a has_input.


==Handling redundant information==
 
===GO subsets update===
Pascale:
* Deprecated a number of unused/unmaintained subsets
* Show subset yaml files and how they are used
* Each subset needs a maintainer
 
Val: proposed new generic slim (15 minutes)
 
==Defining redundant information==
(Chris)
(Chris)
* Define redundant information:  
* Define redundant information:  
Line 110: Line 128:
* '''AI: Rules for flagging redundant annotations need to be documented - Chris'''   
* '''AI: Rules for flagging redundant annotations need to be documented - Chris'''   


Discussion points: Is redundant, non-experimental annotation ever useful?
Use cases for 'redundant', non-experimental annotation
* Are there any use cases where people have used these annotations for some type of analysis?
* Are there any use cases where people have used these annotations for some type of analysis?
* Some pipelines (InterPro2GO, SPKW, PAINT, F-P links), sometimes provide data that is already captured experimentally, and some groups would like the redundancy reduced.  
* Some pipelines (InterPro2GO, SPKW, PAINT, F-P links), sometimes provide data that is already captured experimentally, and some groups would like the redundancy reduced.  
Line 118: Line 136:
*[https://github.com/geneontology/go-annotation/issues/2042#issuecomment-412964090 Doubled up IBA+EXP annotations (from Karen Christie)]
*[https://github.com/geneontology/go-annotation/issues/2042#issuecomment-412964090 Doubled up IBA+EXP annotations (from Karen Christie)]
* [https://github.com/geneontology/go-site/issues/767 Issue with GOC inference file (i) incorrect aspect reported]
* [https://github.com/geneontology/go-site/issues/767 Issue with GOC inference file (i) incorrect aspect reported]
What data to present to which users
Examples (different types of redundancy):
* TAS http://amigo.geneontology.org/amigo/gene_product/UniProtKB:P05177 (13 annotations to the same activity and 10 to the same component by TAS)
* Exact duplicates by GOC function-process pipeline
* Identical annotation by different groups (rare)
*'''Proposal:'''
*'''Proposal:'''


Line 142: Line 168:
How do changes in the reference proteome in-between releases affect GO? i.e  What happens to new or revised IDs if they are used in GO annotations, but are not represented in the reference proteome?
How do changes in the reference proteome in-between releases affect GO? i.e  What happens to new or revised IDs if they are used in GO annotations, but are not represented in the reference proteome?


* https://github.com/geneontology/helpdesk/issues/161
{{strike| https://github.com/geneontology/helpdesk/issues/161}} closed moved to https://github.com/geneontology/go-site/issues/819 (this ticket is just a general question about which dataset we use)
* https://github.com/geneontology/go-site/issues/756
* https://github.com/geneontology/go-site/issues/756
* https://github.com/geneontology/amigo/issues/475
* https://github.com/geneontology/amigo/issues/475 (vw: I don't understand the issues in this ticket any more, so not included in summary)
* https://github.com/geneontology/helpdesk/issues/144
* https://github.com/geneontology/helpdesk/issues/144 CHTF8, included as example
* https://github.com/geneontology/go-annotation/issues/2082
* https://github.com/geneontology/go-annotation/issues/2082 SLC35A4, MOCS2 included as examples
* https://github.com/geneontology/go-annotation/issues/2081
* https://github.com/geneontology/go-annotation/issues/2081 ignore this one
* https://github.com/geneontology/go-site/issues/816
* https://github.com/geneontology/go-site/issues/816


Line 174: Line 200:


=Breakout sessions topics=
=Breakout sessions topics=
#Specifications for next version of GPAD/GPI
# Guidelines for submitting annotations to GO - for example Ivan Erill also had an idea to ask the organizers of the Phage Meeting to provide an option for abstract submissions to include author-generated GO annotations. What would our guidelines be ?  
# Guidelines for submitting annotations to GO - for example Ivan Erill also had an idea to ask the organizers of the Phage Meeting to provide an option for abstract submissions to include author-generated GO annotations. What would our guidelines be ?  
# GO Slims - review Alliance slim with latest stats from Mary Dolan.  Does the goslim_agr need any updates?
# GO Slims - review Alliance slim with latest stats from Mary Dolan.  Does the goslim_agr need any updates?
Line 182: Line 209:
=Product owners/tech leads discussion - lunch Thursday=
=Product owners/tech leads discussion - lunch Thursday=
* Small debrief session: Lessons learned, suggestions for improvements, next face-2-face meeting
* Small debrief session: Lessons learned, suggestions for improvements, next face-2-face meeting
= Meeting Photo & Slideshow =
[https://drive.google.com/drive/folders/1iahjZ1h8CqA-O1U-OQGC9fIcvkdkpIV9?usp=sharing Photos]
[https://drive.google.com/file/d/1qPeAxxq11SZTiibvvQrUxdoZRm5nc4ii/view?usp=sharing Slideshow]


[[Category: GO Consortium Meetings]]
[[Category: GO Consortium Meetings]]

Latest revision as of 14:18, 19 October 2018

Agenda moved to https://docs.google.com/document/d/16KMJRn-Vqf4yRsXAR1KYlwYvPob1tRTQbq2cEs6IJNY/edit#heading=h.uxqb7a8gufke


This below is out of date

Wednesday Morning

Paul: Project status, main accomplishments and discussion topics for the meeting

New Pipeline & new pipeline documentation

Kimberly: Must be completed (as much as possible) and announced

Information points:

Differences between old releases and new releases

Seth ?

  • Now releasing as a bundle
  • Implications for various groups- including other distributors of GO data (QuickGO, NCBI)
  • Improves (or just enables!) reproducibility

GO rules update and error reports

Eric & Pascale

GO-CAM and Noctua

https://github.com/orgs/geneontology/projects/19

Noctua 1.1 release

Kimberly

  • Updates on progress towards the 1.1 release
    • Bug fixes
    • New features
    • Groundwork for future releases

Annotation review form

Tremayne

GO-CAM integration with other pathway resources

Ben

  • Reactome

Roadmap for next Noctua release

Kimberly

  • Integrating all (more) genomes in GOC

AmiGO update

Seth: https://github.com/orgs/geneontology/projects/21

GO website migration

Laurent-Philippe/Suzi A: https://github.com/orgs/geneontology/projects/22

Ontology and Annotation documentation update

David, Kimberly and Pascale. Random thoughts:

  • Ontology: mention the creation of 'projects' in GH where we moved old projects, so that it's easier to find old discussions
  • future plans

Annotation

Gene Product to terms relations

https://github.com/orgs/geneontology/projects/13 (Kimberley, Dustin)

Suggestions:

PAINT tickets

Marc & Pascale: https://github.com/geneontology/go-annotation/labels/PAINT%20annotation

QC group report

  • Add topics (report on progress)

How to blacklist papers

Pascale

  • List/form template of excluded papers or combinations of PMID/Gene/GO/comment, including retracted papers
  • Centralized paper tracking system

Ontology

Suggestions:

Binding and Protein complexes

Kimberly

  • Should these be separate branches of GO?

See (among other tickets) https://github.com/geneontology/go-annotation/issues/1940

  • Proposal to convert all protein binding terms from IPI and a target in the 'with' field to IDA and a has_input.
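
As a discussion aid, the conversion in this proposal could look roughly like the sketch below. It is only an illustration under stated assumptions: the dictionary field names, the function name, the default term set containing only GO:0005515 ('protein binding'), and the choice to keep pipe-separated targets as pipe-separated alternatives are all hypothetical, and any existing extension is assumed to be empty.

```python
# Hypothetical sketch: field names and the single-term default are assumptions.
PROTEIN_BINDING_TERMS = {"GO:0005515"}  # GO:0005515 is 'protein binding'


def convert_binding_annotation(annotation, binding_terms=PROTEIN_BINDING_TERMS):
    """Return a converted copy of an IPI protein binding annotation.

    `annotation` is a dict with keys 'term', 'evidence', 'with', 'extension'.
    Annotations that do not match the proposal are returned as unchanged copies.
    """
    converted = dict(annotation)
    is_candidate = (
        annotation["term"] in binding_terms
        and annotation["evidence"] == "IPI"
        and annotation["with"]
    )
    if not is_candidate:
        return converted
    # Each target from the 'with' field becomes a has_input(...) extension.
    targets = annotation["with"].split("|")
    converted["evidence"] = "IDA"
    converted["extension"] = "|".join(f"has_input({t})" for t in targets)
    converted["with"] = ""  # assumption: the target now lives only in the extension
    return converted
```

The function returns a copy rather than editing in place, so the original IPI annotation stays available for comparison while the proposal is being evaluated.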


GO subsets update

Pascale:

  • Deprecated a number of unused/unmaintained subsets
  • Show subset yaml files and how they are used
  • Each subset needs a maintainer

Val: proposed new generic slim (15 minutes)

Defining redundant information

(Chris)

  • Define redundant information:

In AmiGO, we should be able to improve the display by removing redundant information. That information may be useful for certain purposes, so we should provide it in files. We could also provide the 'core set' ('stringent set') in some version of the files. These tickets explain the strategy that will be taken: see https://github.com/geneontology/amigo/issues/43 and https://github.com/geneontology/amigo/issues/440

  • AI: Rules for flagging redundant annotations need to be documented - Chris

Use cases for 'redundant', non-experimental annotation

  • Are there any use cases where people have used these annotations for some type of analysis?
  • Some pipelines (InterPro2GO, SPKW, PAINT, F-P links), sometimes provide data that is already captured experimentally, and some groups would like the redundancy reduced.
  • Should all GOC members be handling redundancy in the same way?
  • If redundant, non-experimental annotations are present and are going to be removed, at what point in the pipeline should they be filtered, e.g. annotation file production by GOC, annotation file processing by MODs, website display?
  • If we filter annotations files, should we then also provide two annotation files for users, one complete and one filtered?
  • Doubled up IBA+EXP annotations (from Karen Christie)
  • Issue with GOC inference file (i) incorrect aspect reported
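
To make the filtering question concrete, here is a minimal sketch of one possible reading of "redundant": a non-experimental annotation is flagged when an experimental annotation already exists for the same gene product and GO term. The rule itself is an assumption (the actual rules still need to be documented, per the action item above), the tuple layout and function name are hypothetical, and the evidence-code set lists only the six core experimental codes.

```python
# Hypothetical sketch: the redundancy rule below is an assumption, not a GOC rule.
# Core experimental evidence codes (high-throughput codes are left out here).
EXPERIMENTAL = {"EXP", "IDA", "IPI", "IMP", "IGI", "IEP"}


def split_redundant(annotations):
    """Split annotations into (filtered, redundant).

    `annotations` is a list of (gene_id, go_id, evidence_code) tuples.
    A non-experimental annotation is treated as redundant when the same
    (gene_id, go_id) pair also has an experimental annotation.
    """
    experimental_pairs = {
        (gene, term) for gene, term, code in annotations if code in EXPERIMENTAL
    }
    filtered, redundant = [], []
    for annotation in annotations:
        gene, term, code = annotation
        if code not in EXPERIMENTAL and (gene, term) in experimental_pairs:
            redundant.append(annotation)
        else:
            filtered.append(annotation)
    return filtered, redundant
```

Returning both lists matches the option of offering two files: the complete set is the input, and the filtered list is the reduced version, with the flagged annotations kept for review rather than discarded.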

What data to present to which users

Examples (different types of redundancy):

  • Proposal:

New topics

Representing complete proteomes in GO (added by Val, as suggested by Chris)

Datasets (obtaining/maintaining complete datasets with unique identifiers)

Overview. Nobody seems really sure what happens. I'll document what I think happens here and then run it by others to confirm

1. GO uses https://www.ebi.ac.uk/reference_proteomes to define the set of human IDs uniquely. This is also used by Panther. This reference proteome set represents each HGNC ID uniquely. This causes issues when 2 proteins are encoded by a locus described by a single HGNC name.

2. UniProt has other versions of reference proteome (I asked UniProt helpdesk about this).

3. What to do about readthroughs from Ensembl (MED17 as an example)?

Questions

How do changes in the reference proteome in-between releases affect GO? i.e., what happens to new or revised IDs if they are used in GO annotations, but are not represented in the reference proteome?

https://github.com/geneontology/helpdesk/issues/161 (struck through): closed, moved to https://github.com/geneontology/go-site/issues/819 (this ticket is just a general question about which dataset we use)

Annotation

  • Which organisms other than cerevisiae and pombe have looked at all protein coding genes for the availability/possibility of GO annotation? Establish the difference between:
  • (1) "not in the GO database" (not found);
  • (2) "unknown" (ND),
  • (3) "unannotated" (no ND, and no annotation in Aspect of interest)

(difference can be established using the complete known protein ID set for your organism and GO term mapper https://go.princeton.edu/cgi-bin/GOTermMapper)
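
The three categories above can be sketched in code. This is a minimal illustration, assuming annotations are available as (gene_id, aspect, evidence_code) tuples with aspect given as 'F', 'P' or 'C'; the function name, the tuple layout and the status labels are hypothetical and are not how GO Term Mapper itself works.

```python
# Hypothetical sketch: input layout and status labels are assumptions.
def classify_genes(all_gene_ids, annotations, aspect):
    """Classify every gene in the complete ID set for one GO aspect.

    `annotations` is a list of (gene_id, aspect, evidence_code) tuples.
    Returns a dict mapping gene_id to one of:
      'not found'   - the gene has no annotation of any kind
      'unknown'     - the only annotations in this aspect use evidence code ND
      'unannotated' - annotated elsewhere, but nothing (not even ND) in this aspect
      'annotated'   - at least one non-ND annotation in this aspect
    """
    genes_with_any_annotation = {gene for gene, _, _ in annotations}
    status = {}
    for gene in all_gene_ids:
        if gene not in genes_with_any_annotation:
            status[gene] = "not found"
            continue
        codes = {
            code for g, a, code in annotations if g == gene and a == aspect
        }
        if not codes:
            status[gene] = "unannotated"
        elif codes == {"ND"}:
            status[gene] = "unknown"
        else:
            status[gene] = "annotated"
    return status
```

The complete protein ID set drives the loop, which is what lets "not found" be distinguished from "unannotated"; starting from the annotation file alone would never surface genes that are missing entirely.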

ChEBI ontology (as suggested by Gareth Owen)

1) Would the GOC be interested in simplifying the ChEBI Ontology?

Over the last year or so, we have been working on allowing individual users to 'personalise' the ChEBI Ontology, by choosing terms in the ontology that they don't wish to see (in effect, reducing the number of levels between the name of a molecule at one end and "chemical entity" at the other). This has to be done term by term, but we have also been working with the Ontology Lookup Service to enable this type of process to be automated.

2) There are a few GO-submitted issues in the ChEBI issue tracker on GitHub that could benefit from discussion (though I suspect that a discussion of a few people during a break may be more useful than involving everybody in a separate part of the agenda?)

3) From a ChEBI point of view, a lot of effort goes into building and maintaining the ChEBI ontology, but we would like to understand how the ChEBI ontology is then actually used (perhaps as part of GO) to tackle real problems. While we know that a number of ontologies (including GO!) make use of lots of ChEBI terms, we would be very interested in how the ontologies are actually being used in tools, applications, etc. in the 'real' world.

4) Give a broad perspective of ChEBI, how it links to other resources e.g. via UniChem and key applications e.g. in metabolomics.

Breakout sessions topics

  1. Specifications for next version of GPAD/GPI
  2. Guidelines for submitting annotations to GO - for example Ivan Erill also had an idea to ask the organizers of the Phage Meeting to provide an option for abstract submissions to include author-generated GO annotations. What would our guidelines be?
  3. GO Slims - review Alliance slim with latest stats from Mary Dolan. Does the goslim_agr need any updates?
  4. Modeling transcription in GO-CAMs
  5. 'Response to' workshop (similar to the signaling WS)
  6. Use cases: Should we add this to the agenda? What would be a productive way of discussing this topic? https://docs.google.com/document/d/104m4jUNjPH9pCpskg8E29Zm2pLHhPFmKyIFLa9l_EOQ/edit

Product owners/tech leads discussion - lunch Thursday

  • Small debrief session: Lessons learned, suggestions for improvements, next face-2-face meeting

Meeting Photo & Slideshow

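Photos: https://drive.google.com/drive/folders/1iahjZ1h8CqA-O1U-OQGC9fIcvkdkpIV9?usp=sharing

Slideshow: https://drive.google.com/file/d/1qPeAxxq11SZTiibvvQrUxdoZRm5nc4ii/view?usp=sharing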