2017 Cambridge GOC Meeting Agenda: Difference between revisions

=GOC Meeting, Cambridge, October 2-4, 2017=


==Agenda==
=To all speakers=
Please add your slides to the google drive:
https://drive.google.com/drive/folders/0B7bEr6HANSlGSHY3c2JqNEs2ZUk
=Minutes=
https://docs.google.com/document/d/1Y9_Mvqes3op36TPHgfaS7K5FHnGZLgUApghIFYyUKR8/edit#heading=h.638rbn3dlj1z
=Monday 2nd October 9:00 a.m.=
==Welcome, overview, vision, introductions==
GO PIs
==GO handbook presentation==
===The Gene Ontology and the meaning of biological function (Paul T)===




===Monday 2nd October===
===Translating research data into Gene Ontology annotations (Pascale)===  
(includes a discussion of the issue: background knowledge & protein domains
https://github.com/geneontology/go-annotation/issues/1621)


===Gene Ontology - annotation extensions (Ruth)===


====Working group meetings====
=Coffee break 10:30=
* QC team brainstorming (Kimberly, Sylvain, Val, Pascale)
* Noctua (Kimberly, Seth, Chris, David H, Paul, Karen, ...)


==Ontology Issues and Updates==
===Update on MF refactoring (Pascale and Paul T)===


===Qualifiers/Relation issues (Kimberly and Chris)===


====Refresher on Qualifiers/Relations (Chris)====


====Annotation issues====
https://docs.google.com/presentation/d/1aSr1kguPM9im_SHWGVVDe-7T0PNyvJUnXhMZk54on6Y/edit?usp=sharing
* '''Report from signaling workshop''': David /Kimberly
* Signaling: First attempt at Annotation consistency 2.0 - Kimberly to report on the approach and the outcome.
Discussion points: Limited participation (self-selected to participate). One recommendation may be to ask all active curators to participate, even at some low level
https://github.com/orgs/geneontology/projects/8


=====Annotation guidance presentation=====
====Biological Process Relations: Use of Qualifiers in Legacy Annotations====
Pascale/Sylvain
+ move slides to Google drive
 
=====Representing biologists’ view of biology=====
Pascale /Ruth
author intent & protein domains; IDA v IC v New evidence code https://github.com/geneontology/go-annotation/issues/1621
30 minutes discussion about issues; hopefully action items can come out of the discussion
 
 
=====CC component annotation guidelines=====
Kimberly
** CC component annotation: what does it mean? Kimberly to do 1 proposal (out of 3 alternatives)
1. where the protein is active
2. two different meanings: enables or the right RO (part_of, i.e. just found there)
3. part_of (low information value!)
 
=====Contributes_to guidelines=====
Kimberly
 
=====HTP guidelines=====
Helen - 15 minutes
[https://docs.google.com/document/d/1ScIeclAzUXMe-tU6n0lVfsSwMHpOeNb7uK8On9-iKXc/edit?usp=sharing HTP Guidelines draft]
Action Items Corvallis 2017/0
* add to guidance: contact authors if going to add - I think this should be done for all papers (Ruth)
* add to guidance: mark a paper that is not annotatable in Protein2GO (or your curation tool). Also consider contacting the authors/journal to inform them that it is not going to be annotated. But if the authors have fully explained the problems, then it is not appropriate to contact the authors/journal, as they know the limits of the approach (Ruth).
* add to guidance: HTP IPI data should not be submitted to GO, instead send this data to IntAct. (Note:Done)
* change evidence codes for legacy HTP annotations to HDA, HMP etc.
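The last action item (recoding legacy HTP annotations) could be sketched roughly as follows. This is only an illustration: the annotation tuple layout, the function name <code>recode_htp</code>, and the idea of flagging HTP papers by a set of references are all assumptions, not an existing GOC pipeline. The code pairs used (IDA/HDA, IMP/HMP, IGI/HGI, IEP/HEP) are the standard experimental codes and their high-throughput counterparts.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Classic experimental evidence code -> high-throughput counterpart.
HTP_CODE: Dict[str, str] = {
    "IDA": "HDA",
    "IMP": "HMP",
    "IGI": "HGI",
    "IEP": "HEP",
}

# Hypothetical annotation layout: (gene, GO term, evidence code, reference).
Annotation = Tuple[str, str, str, str]


def recode_htp(annotations: Iterable[Annotation],
               htp_references: Set[str]) -> List[Annotation]:
    """Swap in the HTP evidence code for annotations from flagged HTP papers.

    Codes without an HTP counterpart in HTP_CODE (for example IPI) are
    left unchanged, as are annotations from papers not flagged as HTP.
    """
    recoded = []
    for gene, term, code, ref in annotations:
        if ref in htp_references:
            code = HTP_CODE.get(code, code)
        recoded.append((gene, term, code, ref))
    return recoded


# Placeholder identifiers, for illustration only.
legacy = [
    ("gene1", "GO:term_a", "IDA", "PMID:htp_paper"),
    ("gene2", "GO:term_b", "IDA", "PMID:small_scale_paper"),
    ("gene3", "GO:term_c", "IPI", "PMID:htp_paper"),
]
updated = recode_htp(legacy, {"PMID:htp_paper"})
```

Only the first annotation changes code here; deciding which papers count as HTP remains a curation decision outside this sketch.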
 
=====Transcription annotations decision tree=====
Ruth
Action Items Corvallis 2017/06
* David OS to create transcription regulator activity (proposed by Paul T)
* ''Proposed changes to decision tree in Corvallis:
** ''Simplified from previous version. Essentially a choice between ‘regulating transcription by RNA polymerase II’ or ‘regulating gene expression’
** ''Annotation 5 = contributes_to sequence-specific DNA binding
''David: when people do enrichment, they don’t drop contributes_to (ie., pay attention to qualifiers), so all those proteins will come down as ‘DNA binding’
Action item: replace ‘annotation 5’ with ISS annotation (if DNA binding domain) or contributes to annotation 5 with ISS annotation if no DNA binding domain and domains to suggest coactivator
** ''Annotation 3 = nuclear chromatin
''Not everyone comfortable with this (ex., Stacia, David)
''Shouldn’t it be ‘colocalizes_with’?
''It was previously agreed that the definition for nuclear chromatin: The ordered and organized complex of DNA, protein, and sometimes RNA, that forms the chromosome in the nucleus. Source:PMID:20404130 was to be applied. The proteins here include all associated proteins, not limited to just histones. With this statement the TFs are then contributing to chromatin, rather than binding to chromatin or colocalizing with it.
** ''Decision tree to be put up on GOC website.
** ''Should annotation 4 exist? (Ruth)''
 
====Use of Qualifiers in Legacy Annotations====
Pascale
Proposal: We will apply the general qualifier to all legacy annotations. Each group can provide more specific qualifiers if they have a mechanism to distinguish.  
Action Items Corvallis 2017/06
====Regulates relations (Kimberly)====
* Working group to decide what relations should be available in Protein2GO or other tools for manual annotations.  Also decide when there are different ways of expressing the same thing, what way we will choose.  What should the default gene/gene product relation be for legacy annotations? 
Adding new qualifiers for the relation between a gene/gene product and a GO term.
* Chris to work on reports that may help curators make decisions about what annotations can get more expressive qualifiers.
* '''Corvallis discussion:''' What should the default relation be? How will we handle regulation? Use a relation, involved in regulation of, or use the precomposed regulation term?
 
Ruth: There are now 3 ways to say the same thing: - involved_in_regulation_of X - involved_in X regulation - involved_in BP regulates(X)
====Multiple qualifiers for an annotation====
Huaiyu
 
====Expanded list of relations====
Kimberly
 
       
For annotation purposes and for our users we want one.
DOS: Good point. These are semantically identical, but I agree we need to find a way to only have one: by convention for classic GO annotation and by filtering the output of inference for Noctua output. Proposal:
* If a named regulation class exists: involved_in X regulation
* ...if not: involved_in {some BP} regulates(X)
NOTE: If there was an annotation in Noctua such as ‘regulation of’ ‘very specific term’ and there was no term ‘regulation of very specific term’, the GOC pipeline would create the annotation to the parent term: ‘regulation of less specific term’.
'''Q: Is this implemented?'''
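The proposed convention can be sketched as a small lookup. This is a minimal illustration, not the GOC pipeline: the dictionaries <code>NAMED_REGULATION</code> and <code>PARENT</code>, the function name, and the <code>fallback_to_parent</code> switch are all hypothetical, and a real implementation would query the ontology instead of hard-coded tables.

```python
from typing import Dict, Tuple

# Hypothetical table: regulated process -> named (precomposed) regulation class.
NAMED_REGULATION: Dict[str, str] = {
    "gene expression": "regulation of gene expression",
    "less specific term": "regulation of less specific term",
}

# Hypothetical single-parent hierarchy used only for the NOTE's fallback.
PARENT: Dict[str, str] = {
    "very specific term": "less specific term",
}


def regulation_annotation(process: str,
                          fallback_to_parent: bool = False) -> Tuple[str, str]:
    """Return (relation, target) following the proposed convention.

    1. If a named regulation class exists, use involved_in with that class.
    2. Optionally (the behaviour described in the NOTE), walk up to the
       closest parent that has a named regulation class.
    3. Otherwise fall back to involved_in {some BP} regulates(X).
    """
    named = NAMED_REGULATION.get(process)
    if named is not None:
        return ("involved_in", named)
    if fallback_to_parent:
        current = PARENT.get(process)
        while current is not None:
            named = NAMED_REGULATION.get(current)
            if named is not None:
                return ("involved_in", named)
            current = PARENT.get(current)
    return ("involved_in", f"{{some BP}} regulates({process})")
```

With these example tables, "gene expression" resolves to its named class, while "very specific term" yields either the regulates(X) expression or, with the fallback enabled, "regulation of less specific term".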


=Lunch 12:30=
 
====Proteoforms====
Establish a working group for when and how to use proteoforms in annotation. (Kimberly with Harold and Karen)
 
 
* Evening: Poster session


===Tuesday 3rd October===
====Centralization of InterPro2GO annotations====
Proposal (follow-up from Geneva 2016): (Paul T)
*GO database pulls directly from InterPro2GO for UniProt Reference Proteomes
**MOD identifier is used as primary gene identifier
*Annotations are given "contributed by" InterPro
*MODs pull from GO database, no need to maintain separate InterPro pipelines


====GO_Slims====
Philosophies and methodologies (Mary, Suzi)
* Should GO support one or many? Creating a biologically useful slim (complete coverage by aspect; biologically useful terms, i.e. sufficient granularity; avoiding single-step process terms, i.e. functions; different slims for different purposes): Judy, Mary D (+ Suzi, Val, etc.)
* GO ribbon
Presentation by Mary + 30 minutes discussion


===Wednesday 4th October===
AM Workshops


===Molecular Function Relations===
====Multiple qualifiers for an annotation (Huaiyu)====


===Cellular Component Relations===
====CC component annotation guidelines (Kimberly)====
*CC component annotation: what does it mean? Kimberly to do 1 proposal (out of 3 alternatives)
1. where the protein is active
2. two different meanings: enables or the right RO (part_of, i.e. just found there)
3. part_of (low information value!)
*colocalizes with


==Annotation guidelines and issues (part one)==
===Signaling (Kimberly and David)===
*Report from signaling workshop: David /Kimberly
*Signaling: First attempt at Annotation consistency 2.0 - Kimberly to report on the approach and the outcome.
Discussion points: Limited participation (self-selected to participate). One recommendation may be to ask all active curators to participate, even at some low level https://github.com/orgs/geneontology/projects/


*Priorities:*
==Ontology WG update: David==
https://docs.google.com/spreadsheets/d/1iX64t71c_xzjcv2OGoaExvz2Ls1IyldJynOp3MQTUI0/edit#gid=0


=Coffee break 15:30=
===Overview of GO annotations/Noctua (Kimberly & Chris)===
====Getting Noctua ready for production (Kimberly & Seth)====
*Kimberly & Seth Blocking issues list:
** <strike>When importing an annotation from AmiGO using a companion / buddy, source providedBy should be retained <br /> https://github.com/geneontology/noctua/issues/459 </strike>
** GAF exports
*** providedBy should be used in GPAD export <br /> https://github.com/geneontology/noctua/issues/502
*** ...
*Action Item Corvallis 2017/06:
**Provide ways for users to recover and digest GO-CAM units (Gene Ontology-based Causal Activity Model). Ideas include rule-based generation of text statements from model, cytoscape view of network described, etc.<br /><small>I believe that this is a more exploratory and open project. We now offer an RDF endpoint, in addition to the APIs we already have, as well as the proposed feeding from AmiGO. If this is still open in the future, it may be very good Hackathon material. Needs more discussion.</small>
**ECO codes available for use in Noctua should show how they map up to a classic GO code, and there should be an alert for curators when they are using a code that does NOT map up to a classic code <br /><small>There is no ticket for such functionality. This would likely have to be written into Minerva or as a pre/post check on (some) server.</small>
**PRO IDs for use in Noctua <br />https://github.com/geneontology/noctua/issues/429 <br />https://github.com/geneontology/noctua/issues/122
**Fix GPAD export from Noctua <br />https://github.com/geneontology/noctua/issues/418
**Add a SPARTA workbench <br />https://github.com/geneontology/noctua/issues/465
**Working group discussion of evidence on complex Noctua models (Kimberly?)
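
The ECO "map up" check requested in the action items above could look roughly like the sketch below. No such ticket or implementation exists according to the notes, so everything here is an assumption: the function names, the single-parent simplification (ECO terms can have several parents), and the placeholder child IDs. The three ECO-to-GO-code pairs shown are the standard manual-assertion mappings for IDA, IMP and IPI.

```python
from typing import Dict, Optional

# ECO terms that correspond directly to a classic GO evidence code.
ECO_TO_GO_CODE: Dict[str, str] = {
    "ECO:0000314": "IDA",
    "ECO:0000315": "IMP",
    "ECO:0000353": "IPI",
}

# Hypothetical is_a parents; the child IDs are placeholders, not real ECO terms.
ECO_PARENT: Dict[str, str] = {
    "ECO:example_child_of_IDA": "ECO:0000314",
    "ECO:example_unmapped": "ECO:example_root",
}


def classic_code(eco_id: str) -> Optional[str]:
    """Walk up the (simplified) hierarchy until a classic GO code is found."""
    seen = set()
    current: Optional[str] = eco_id
    while current is not None and current not in seen:
        seen.add(current)  # guard against cycles in malformed input
        code = ECO_TO_GO_CODE.get(current)
        if code is not None:
            return code
        current = ECO_PARENT.get(current)
    return None


def check_evidence(eco_id: str) -> str:
    """Return a message for the curator, alerting when nothing maps up."""
    code = classic_code(eco_id)
    if code is None:
        return f"ALERT: {eco_id} does not map up to a classic GO evidence code"
    return f"{eco_id} maps up to {code}"
```

As the notes suggest, a check like this could run inside Minerva or as a pre/post check on a server; where it belongs is left open here.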


====Noctua table view demo (Chris)====


====Hands-on/Workshop Suggestions====
==Project updates ==
====Report on transcription work/Noctua templates: Astrid GREEKC consortium====
====Report of Reactome-GO connection: David H/Peter D====


* Displaying GO annotations (non redundantly, in MODs, in AmiGO), general data presentation.


* Good annotation practice: phenotype vs process, for eg.
=Evening: Poster session 17:00=


* Feedback for QuickGO beta (small group session with Sangya, 1h)
=Tuesday 3rd October 9:00 a.m.=


* Making tickets/revising annotations/overflow discussions from the signalling workshop session
==Project updates - continued ==
====AGR - report to GOC: PIs====
====SynGO meeting report: Paul T====
===Enrichment (Paul T & Suzi)===
*Web page
*Paul Pavlidis…
*Data Commons work


==GOC meeting Discussion topics==


* Sets of relevant genes for human diseases: (Judy - analysis report)
===Documentation and GitHub overview (Kimberly/Seth/David/Pascale)===
Kimberly Action Items Corvallis 2017/06
*Continue with the consolidation of all documentation for ontology editing, and remove all old documents.
*Review annotation documentation and add to github and readthedocs.
*Make sure to mark obsolete pages/doc as such and add a link to the new relevant doc
*Solicit annotation documentation from participating groups for consolidation.
*Make sure to mark docs with ‘Date last reviewed’ so it’s easy for users to know when the documentation was last touched.
*Pascale: additional information associated with GO terms, see GitHub ticket?
*Add and follow action items at https://github.com/orgs/geneontology/projects/3


* Funding opportunities for targeted annotation...
** human disease specific
** via DoE or NSF for microbes?
** metagenomics?
** getting <i>B. subtilis</i> from Peter Karp?


'''How GO now uses GitHub for project management and guidelines for contributors'''
*Where to find information relevant to everyone's needs
*Groups (Groups.yaml)
*Members
*Etc




==Project updates==
==Annotation guidelines and issues (part two)==


* AGR - report to GOC: PIs
===Centralization of InterPro2GO annotations (Paul T)===
 
Proposal (follow-up from Geneva 2016):  
* SynGO meeting report: Paul T
*GO database pulls directly from InterPro2GO for UniProt Reference Proteomes
 
*MOD identifier is used as primary gene identifier
* Report of Reactome-GO connection: David H/Peter D
*Annotations are given "contributed by" InterPro
*MODs pull from GO database, no need to maintain separate InterPro pipelines


* Report on transcription work/Noctua templates: Astrid GREEKC consortium
===HTP guidelines (Helen)===
Helen - 15 minutes Report on progress from HTP working group
https://docs.google.com/presentation/d/1oLkKWRSNQ-wWB_CCDxcb3a84hXsfkjl4Jp-kAEkhQUQ/edit#slide=id.g26c97bd257_1_57
*Draft Guidelines https://docs.google.com/document/d/1ScIeclAzUXMe-tU6n0lVfsSwMHpOeNb7uK8On9-iKXc/edit?usp=sharing


====AmiGO====
*Provision of new evidence code
Follow up with Seth
*Implementation & adding to guidelines
* In base_statistics, plotly graph for "Experimental annotation publications by assigner" is confusing https://github.com/geneontology/amigo/issues/429
(Pascale Q: My understanding is that there are essentially no resources for AmiGO right now so any AmiGO issue is low priority. Is this correct?) (Seth: Pretty much--all current work at this point is basically "hobby time")


==Transcription annotations decision tree==
Ruth Action Items Corvallis 2017/06
*David OS to create transcription regulator activity (proposed by Paul T)
*Proposed changes to decision tree in Corvallis:
**Simplified from previous version. Essentially a choice between ‘regulating transcription by RNA polymerase II’ or ‘regulating gene expression’
**Annotation 5 = contributes_to sequence-specific DNA binding
David: when people do enrichment, they don’t drop contributes_to (ie., pay attention to qualifiers), so all those proteins will come down as ‘DNA binding’ Action item: replace ‘annotation 5’ with ISS annotation (if DNA binding domain) or contributes to annotation 5 with ISS annotation if no DNA binding domain and domains to suggest coactivator
*
**Annotation 3 = nuclear chromatin
Not everyone comfortable with this (ex., Stacia, David) Shouldn’t it be ‘colocalizes_with’? It was previously agreed that the definition for nuclear chromatin: The ordered and organized complex of DNA, protein, and sometimes RNA, that forms the chromosome in the nucleus. Source:PMID:20404130 was to be applied. The proteins here include all associated proteins, not limited to just histones. With this statement the TFs are then contributing to chromatin, rather than binding to chromatin or colocalizing with it.
*
**Decision tree to be put up on GOC website.
**Should annotation 4 exist? (Ruth)


====PAINT====
==PAINT update (Huaiyu)==
Huaiyu Action Items Corvallis 2017/06:
*Encourage discussion between PAINT curators and other annotators about terms not used for propagation
*Report how many annotations per species are used for annotation propagation; could even supply this number for propagation specifically to human genes
*Ruth and Huaiyu (others?) will discuss making use of groups that have already annotated specific gene lists to annotate the corresponding PAINT families.
*Smooth out the challenge mechanism to make it easier to make and resolve challenges; identify terms that may be problematic and would benefit from consistency exercises and discussion.
*Get a list of families where terms have not been propagated (?) - Please check this one for clarity.
*(added to github project board) Develop mechanism to trigger review of annotated PAINT families.
 
 


==Action items & tickets from previous period==


===Noctua===
==Community Support==
===Citing GO (Paul T)===
*Web page
*Tool providers: Action Item Corvallis 2017/06: someone to represent GOC and contact GO tool providers to display version information and state this information needs to be included in any subsequent publication. Plus list the GOC paper to reference, as well as the tool provider reference. See: https://github.com/geneontology/go-site/issues/359 and https://github.com/geneontology/go-site/issues/360


===GO_Slims (Val, Suzi, Mary)===
====WHY creating slims - Intro to slim philosophy====
* Suzi: For genome overviews, for profiling an individual gene, for particular taxa, for a particular area of biology. Criteria for evaluating a slim
* Val: [https://docs.google.com/presentation/d/1oXzxaJpkm46irdMA0fbKohUgcx5Qm2xkbraZ83ttPXw/edit?usp=sharing_eil&ts=59cd0de3 Slimming tips]
Includes slim uses and creating a biologically useful slim (complete coverage by aspect; biologically useful terms, i.e. sufficient granularity; avoiding single-step process terms, i.e. functions; different slims for different purposes) (Val: 20 mins)
Action Item Corvallis 2017/06:


====Use and maintenance of slims (Mary)====
Mary: algorithm - 15 minutes


====slim metadata(Chris)====
Yaml format for creating slims/types of slims/target organisms


https://docs.google.com/presentation/d/1NdOSZa3Y1h229dvX20pkGlbEnjAeVPmeo5NVZcyd10s/edit?usp=sharing


* https://github.com/geneontology/go-ontology/issues/12780
* https://github.com/geneontology/go-ontology/issues/14028
* https://github.com/geneontology/go-ontology/issues/12554


====GO presentation of slims====
Suzi: GO ribbon
Group discussion: how should we present slims in GO: Multiple or one ?
How many slims will be available? How will we represent multiple slims?
(Pascale asks: what are our resources for this? what is the priority? which slims is GO responsible for?)


===Ontology===
=Wednesday 4th October 9:00 a.m.=


====GitHub tutorial====
(Seth & Pascale): how GO now uses GitHub for project management and guidelines for contributors


==Brainstorming session==
See https://docs.google.com/presentation/d/1AYS7IaeyrdzQcl42huyooh2y00kAj6gUtaVS5v0czjg/edit#slide=id.g25a0d9ffc9_0_0


=====Update on MF refactoring (Pascale or PaulT)=====
==Review action items==




Action Items Corvallis 2017/06
==Parking tickets==
===AmiGO===
*In general, while several of the AmiGO issues are high-priority, there has been limited bandwidth to tackle them in the context of continuing work on Noctua and the replacement pipeline. Several fixes are queued up and will be available before the upcoming SAB meeting.
**In response to: in base_statistics, plotly graph for "Experimental annotation publications by assigner" is confusing https://github.com/geneontology/amigo/issues/429
**From Pascale's question: My understanding is that there are essentially no resources for AmiGO right now so any AmiGO issue is low priority. Is this correct?


===The Fate of Simple Processes===
*Analysis of gene products annotated to phosphorylation but not annotated to a kinase MF term. What did these annotations actually mean? Curators would need to review. Timeline? Deadline? Working group? One possibility would be to use part_of/involved_in relation to phosphorylation for genes also annotated to kinase MF, and causally_upstream_of_or_within for others until curators can re-evaluate.
*Other considerations:
**Helen: are there many single-step processes?
**Paul: user-oriented approach - is it a useful grouping for our users?
**Ruth: we need to think about the meaning beyond just “phosphorylation”
**Ruth: what are the consequences of removing “phosphorylation” and what happens to all its children?
**Pascale Q: provided that we remove the processes, would it affect the term enrichment analysis?
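
The interim rule floated above for phosphorylation annotations can be written as a tiny decision function. This is a sketch of one possibility under discussion, not an agreed procedure; the function name and the plain-string term labels are hypothetical, and "involved_in" stands in for the "part_of/involved_in" option mentioned in the notes.

```python
from typing import Set


def phosphorylation_relation(gene_mf_terms: Set[str],
                             kinase_mf_terms: Set[str]) -> str:
    """Pick an interim relation to 'phosphorylation' for one gene product.

    Genes that also carry a kinase MF annotation get involved_in; all
    others get causally_upstream_of_or_within until curators re-evaluate.
    """
    if gene_mf_terms & kinase_mf_terms:
        return "involved_in"
    return "causally_upstream_of_or_within"


# Placeholder term labels, for illustration only.
kinase_terms = {"kinase activity", "protein kinase activity"}
relation_for_kinase = phosphorylation_relation({"protein kinase activity"}, kinase_terms)
relation_for_other = phosphorylation_relation({"DNA binding"}, kinase_terms)
```

In practice the set of kinase MF terms would presumably be taken from the ontology (kinase activity and its descendants) and not listed by hand.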


===BP refactoring===
*Defining “Cellular Process” and “Multi-Organism Process” terms
*Action item: the comments/examples/notes should be captured, working group to discuss this.


===Proteoforms===
Establish a working group for when and how to use proteoforms in annotation. (Kimberly with Harold and Li)
 
 
=====Viral processes=====
Pascale
Action Items Corvallis 2017/06
Check whether the incorporated changes could affect the host proteins. Document the use cases.
 
===Usability issues===
*What do users want?
**truth (summary/story/model)?
**fishing expeditions?
**api access?
** appropriate licensing
*Who are our users?
**geneticists?
**clinicians?
**computational biologists?
***aggregators
*Use cases: https://github.com/geneontology/go-ontology/issues/13606
*Perhaps discussion of examples of how the GOC members are engaging with the community and how we can do better in the future (suggested by Helen).
* Val: What happened to the user survey? PomBase collected 70 responses from our users but I never heard a summary of the findings at a subsequent meeting. Could somebody give an overview of the results?
 
==Additional items - If time allows==
Chris: Where should documentation on term usage go in the ontology? We'd like to create a new type of annotation in Protégé, but the details of the implementation need to be worked out.  


=Lower priority=


==Viral processes update (Pascale)==
Action Items Corvallis 2017/06: Check whether the incorporated changes could affect the host proteins. Document the use cases. '''Nothing to report - depends on where documentation will be hosted'''


=Postponed to next GOC meeting=
==Action Item Review==


====Quality control====
=Wednesday noon-ish Fin=
Action Item Corvallis 2017/06:
* Explore implementing periodic reanalysis of published gene sets and making this available to users to illustrate how the dynamic nature of GO (should) improve analysis.  P. Pavlidis’ http://gotrack.msl.ubc.ca/ may be particularly useful here.
https://github.com/geneontology/go-site/issues/361


[[Category: GO Consortium Meetings]]

Latest revision as of 02:43, 4 October 2017

GOC Meeting, Cambridge, October 2-4, 2017

To all speakers

Please add your slides to the google drive: https://drive.google.com/drive/folders/0B7bEr6HANSlGSHY3c2JqNEs2ZUk

Minutes

https://docs.google.com/document/d/1Y9_Mvqes3op36TPHgfaS7K5FHnGZLgUApghIFYyUKR8/edit#heading=h.638rbn3dlj1z

Monday 2nd October 9:00 a.m.

Welcome, overview, vision, introductions

GO PIs

GO handbook presentation

The Gene Ontology and the meaning of biological function (Paul T)

Translating research data into Gene Ontology annotations (Pascale)

(includes a discussion of the issue: background knowledge & protein domains https://github.com/geneontology/go-annotation/issues/1621)

Gene Ontology - annotation extensions (Ruth)

Coffee break 10:30

Ontology Issues and Updates

Update on MF refactoring (Pascale and Paul T)

Qualifiers/Relation issues (Kimberly and Chris)

Refresher on Qualifiers/Relations (Chris)

https://docs.google.com/presentation/d/1aSr1kguPM9im_SHWGVVDe-7T0PNyvJUnXhMZk54on6Y/edit?usp=sharing

Biological Process Relations: Use of Qualifiers in Legacy Annotations

Proposal: We will apply the general qualifier to all legacy annotations. Each group can provide more specific qualifiers if they have a mechanism to distinguish.

Regulates relations (Kimberly)

Adding new qualifiers for the relation between a gene/gene product and a GO term.

  • Corvallis discussion: What should the default relation be? How will we handle regulation? Use a relation, involved in regulation of, or use the precomposed regulation term?

Ruth: There are now 3 ways to say the same thing:

  • involved_in_regulation_of X
  • involved_in X regulation
  • involved_in BP regulates(X)

For annotation purposes and for our users we want one.

DOS: Good point. These are semantically identical, but I agree we need to find a way to only have one: by convention for classic GO annotation and by filtering the output of inference for Noctua output. Proposal:

  • If a named regulation class exists: involved_in X regulation
  • ...if not: involved_in {some BP} regulates(X)

NOTE: If there was an annotation in Noctua such as ‘regulation of’ ‘very specific term’ and there was no term ‘regulation of very specific term’, the GOC pipeline would create the annotation to the parent term: ‘regulation of less specific term’.

Q: Is this implemented?

Lunch 12:30

Molecular Function Relations

Multiple qualifiers for an annotation (Huaiyu)

Cellular Component Relations

CC component annotation guidelines (Kimberly)

  • CC component annotation: what does it mean? Kimberly to do 1 proposal (out of 3 alternatives)

1. where the protein is active
2. two different meanings: enables or the right RO (part_of, i.e. just found there)
3. part_of (low information value!)

  • colocalizes with

Annotation guidelines and issues (part one)

Signaling (Kimberly and David)

  • Report from signaling workshop: David /Kimberly
  • Signaling: First attempt at Annotation consistency 2.0 - Kimberly to report on the approach and the outcome.

Discussion points: Limited participation (self-selected to participate). One recommendation may be to ask all active curators to participate, even at some low level https://github.com/orgs/geneontology/projects/

Ontology WG update: David

Coffee break 15:30

Overview of GO annotations/Noctua (Kimberly & Chris)

Getting Noctua ready for production (Kimberly & Seth)

  • Kimberly & Seth Blocking issues list:
  • Action Item Corvallis 2017/06:
    • Provide ways for users to recover and digest GO-CAM units (Gene Ontology-based Causal Activity Model). Ideas include rule-based generations of text statements from model, cytoscape view of network described, etc.
      I believe that this is a more exploratory and open project. We now offer an RDF endpoint, in addition to the APIs we already have, as well as the proposed feeding from AmiGO. If this is still open in the future, it may be very good Hackathon material. Needs more discussion.
    • ECO codes available for use in Noctua should show how they map up to a classic GO code, and there should be an alert for curators when they are using a code that does NOT map up to a classic code
      There is no ticket for such functionality. This would likely have to be written into Minerva or as a pre/post check on (some) server.
    • PRO IDs for use in Noctua
      https://github.com/geneontology/noctua/issues/429
      https://github.com/geneontology/noctua/issues/122
    • Fix GPAD export from Noctua
      https://github.com/geneontology/noctua/issues/418
    • Add a SPARTA workbench
      https://github.com/geneontology/noctua/issues/465
    • Working group discussion of evidence on complex Noctua models (Kimberly?)
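As a starting point for the ECO action item above, the check could look roughly like the sketch below. It is an illustration under assumptions, not a description of Minerva or of any existing server check: the function names are invented, the parent table uses made-up placeholder IDs, and the real ECO hierarchy allows multiple parents. The two ECO-to-classic-code pairs shown are believed to match the published mapping but should be verified against the current ECO release.

```python
from typing import Dict, Optional

# ECO classes that correspond directly to a classic GO evidence code.
CLASSIC_CODE: Dict[str, str] = {
    "ECO:0000314": "IDA",
    "ECO:0000315": "IMP",
}

# Hypothetical placeholder hierarchy (child -> parent); not real ECO IDs.
ECO_PARENT: Dict[str, str] = {
    "ECO:EXAMPLE_CHILD": "ECO:0000314",
    "ECO:EXAMPLE_ORPHAN": "ECO:EXAMPLE_ROOT",
}


def classic_code_for(eco_id: str) -> Optional[str]:
    """Walk up the parents until a class with a classic GO code is found."""
    current: Optional[str] = eco_id
    seen = set()
    while current is not None and current not in seen:
        if current in CLASSIC_CODE:
            return CLASSIC_CODE[current]
        seen.add(current)
        current = ECO_PARENT.get(current)
    return None


def check_evidence(eco_id: str) -> str:
    """Return a message for the curator, with an alert when nothing maps up."""
    code = classic_code_for(eco_id)
    if code is None:
        return f"ALERT: {eco_id} does NOT map up to a classic GO evidence code"
    return f"{eco_id} maps up to {code}"
```

The same function could run as a pre- or post-check on the server side, or feed a tooltip in the evidence picker.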

Noctua table view demo (Chris)

Project updates

Report on transcription work/Noctua templates: Astrid GREEKC consortium

Report of Reactome-GO connection: David H/Peter D

Evening: Poster session 17:00

Tuesday 3rd October 9:00 a.m.

Project updates - continued

AGR - report to GOC: PIs

SynGO meeting report: Paul T

Enrichment (Paul T & Suzi)

  • Web page
  • Paul Pavlidis…
  • Data Commons work


Documentation and GitHub overview (Kimberly/Seth/David/Pascale)

Kimberly Action Items Corvallis 2017/06

  • Continue with the consolidation of all documentation for ontology editing, and remove all old documents.
  • Review annotation documentation and add to github and readthedocs.
  • Make sure to mark obsolete pages/doc as such and add a link to the new relevant doc
  • Solicit annotation documentation from participating groups for consolidation.
  • Make sure to mark docs with ‘Date last reviewed’ so it’s easy for users to know when the documentation was last touched.
  • Pascale: additional information associated with GO terms, see GitHub ticket?
  • Add and follow action items at https://github.com/orgs/geneontology/projects/3


How GO now uses GitHub for project management and guidelines for contributors

  • Where to find information relevant to everyone's needs
  • Groups (Groups.yaml)
  • Members
  • Etc


Annotation guidelines and issues (part two)

Centralization of InterPro2GO annotations (Paul T)

Proposal (follow-up from Geneva 2016):

  • GO database pulls directly from InterPro2GO for UniProt Reference Proteomes
  • MOD identifier is used as primary gene identifier
  • Annotations are given "contributed by" InterPro
  • MODs pull from GO database, no need to maintain separate InterPro pipelines
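The proposed flow can be sketched as follows, purely as an illustration. The protein and gene identifiers, the field names and both function names are assumptions made for the example; the input line follows the style of the interpro2go mapping file, and the real pipeline would read InterPro matches for the UniProt Reference Proteomes rather than a hard-coded list.

```python
import re
from typing import Dict, Iterable, List, Tuple

# Matches lines such as:
# InterPro:IPR000003 Retinoid X receptor/HNF4 > GO:DNA binding ; GO:0003677
LINE_RE = re.compile(r"^InterPro:(IPR\d+)\s.*>\s*GO:.*;\s*(GO:\d{7})\s*$")


def parse_interpro2go(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Build InterPro entry -> list of GO IDs, skipping '!' comment lines."""
    mapping: Dict[str, List[str]] = {}
    for line in lines:
        if line.startswith("!"):
            continue
        match = LINE_RE.match(line.strip())
        if match:
            mapping.setdefault(match.group(1), []).append(match.group(2))
    return mapping


def build_annotations(
    matches: Iterable[Tuple[str, str]],  # (UniProt accession, InterPro entry)
    ipr2go: Dict[str, List[str]],
    uniprot_to_mod: Dict[str, str],      # hypothetical UniProt -> MOD ID table
) -> List[Dict[str, str]]:
    """Emit one IEA annotation per (gene, GO term), keyed on the MOD identifier."""
    rows: List[Dict[str, str]] = []
    for uniprot_id, ipr_id in matches:
        mod_id = uniprot_to_mod.get(uniprot_id)
        if mod_id is None:
            continue  # no MOD gene for this protein; left out of the sketch
        for go_id in ipr2go.get(ipr_id, []):
            rows.append({
                "gene": mod_id,
                "go": go_id,
                "evidence": "IEA",
                "with": f"InterPro:{ipr_id}",
                "contributed_by": "InterPro",
            })
    return rows
```

A MOD would then only need to filter these rows by its own identifier prefix instead of running a separate InterPro pipeline.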

HTP guidelines (Helen)

Helen - 15 minutes

Report on progress from HTP working group: https://docs.google.com/presentation/d/1oLkKWRSNQ-wWB_CCDxcb3a84hXsfkjl4Jp-kAEkhQUQ/edit#slide=id.g26c97bd257_1_57

  • Provision of new evidence code
  • Implementation & adding to guidelines

Transcription annotations decision tree

Ruth Action Items Corvallis 2017/06

  • David OS to create transcription regulator activity (proposed by Paul T)
  • Proposed changes to decision tree in Corvallis:
    • Simplified from previous version. Essentially a choice between ‘regulating transcription by RNA polymerase II’ or ‘regulating gene expression’
    • Annotation 5 = contributes_to sequence-specific DNA binding

David: when people do enrichment, they don’t drop contributes_to annotations (i.e., they don’t pay attention to qualifiers), so all those proteins will come down as ‘DNA binding’.

Action item: replace ‘annotation 5’ with an ISS annotation (if there is a DNA binding domain), or use contributes_to for annotation 5 with an ISS annotation if there is no DNA binding domain and the domains suggest a coactivator.

    • Annotation 3 = nuclear chromatin

Not everyone is comfortable with this (e.g., Stacia, David). Shouldn’t it be ‘colocalizes_with’? It was previously agreed that the definition for nuclear chromatin was to be applied: "The ordered and organized complex of DNA, protein, and sometimes RNA, that forms the chromosome in the nucleus." (Source: PMID:20404130). The proteins here include all associated proteins, not limited to just histones. With this statement the TFs are then contributing to chromatin, rather than binding to chromatin or colocalizing with it.

    • Decision tree to be put up on GOC website.
    • Should annotation 4 exist? (Ruth)

PAINT update (Huaiyu)

Huaiyu Action Items Corvallis 2017/06:

  • Encourage discussion between PAINT curators and other annotators about terms not used for propagation
  • Report how many annotations per species are used for annotation propagation; could even supply this number for propagation specifically to human genes
  • Ruth and Huaiyu (others?) will discuss making use of groups that have already annotated specific gene lists to annotate the corresponding PAINT families.
  • Smooth out the challenge mechanism to make it easier to make and resolve challenges, and identify terms that may be problematic and would benefit from consistency exercises and discussion.

  • Get a list of families where terms have not been propagated (?) - Please check this one for clarity.

  • Develop mechanism to trigger review of annotated PAINT families (added to GitHub project board).


Community Support

Citing GO (Paul T)


GO_Slims (Val, Suzi, Mary)

Why create slims - Intro to slim philosophy

  • Suzi: For genome overviews, for profiling an individual gene, for particular taxa, for a particular area of biology. Criteria for evaluating a slim
  • Val: Slimming tips

Includes slim uses and creating a biologically useful slim (complete coverage by aspect; biologically useful terms, i.e. sufficient granularity; avoiding single-step process terms, i.e. functions; different slims for different purposes) (Val: 20 mins)

Use and maintenance of slims (Mary)

Mary: algorithm - 15 minutes

Slim metadata (Chris)

YAML format for creating slims/types of slims/target organisms

https://docs.google.com/presentation/d/1NdOSZa3Y1h229dvX20pkGlbEnjAeVPmeo5NVZcyd10s/edit?usp=sharing

  • https://github.com/geneontology/go-ontology/issues/12780
  • https://github.com/geneontology/go-ontology/issues/14028
  • https://github.com/geneontology/go-ontology/issues/12554

GO presentation of slims

Suzi: GO ribbon

Group discussion: how should we present slims in GO: multiple or one? How many slims will be available? How will we represent multiple slims? (Pascale asks: what are our resources for this? What is the priority? Which slims is GO responsible for?)

Wednesday 4th October 9:00 a.m.

Brainstorming session

See https://docs.google.com/presentation/d/1AYS7IaeyrdzQcl42huyooh2y00kAj6gUtaVS5v0czjg/edit#slide=id.g25a0d9ffc9_0_0

Review action items

Parking tickets

AmiGO

  • In general, while several of the AmiGO issues are high-priority, there has been limited bandwidth to tackle them in the context of continuing work on Noctua and the replacement pipeline. Several fixes are queued up and will be available before the upcoming SAB meeting.
    • In response to: in base_statistics, plotly graph for "Experimental annotation publications by assigner" is confusing https://github.com/geneontology/amigo/issues/429
    • From Pascale's question: My understanding is that there are essentially no resources for AmiGO right now so any AmiGO issue is low priority. Is this correct?

The Fate of Simple Processes

  • Analysis of gene products annotated to phosphorylation but not annotated to a kinase MF term. What did these annotations actually mean? Curators would need to review. Timeline? Deadline? Working group? One possibility would be to use the part_of/involved_in relation to phosphorylation for genes also annotated to a kinase MF term, and causally_upstream_of_or_within for the others until curators can re-evaluate.
  • Other considerations:
    • Helen: are there many single-step processes?
    • Paul: user-oriented approach - is it a useful grouping for our users?
    • Ruth: we need to think about the meaning beyond just “phosphorylation”
    • Ruth: what are the consequences of removing “phosphorylation” and what happens to all its children?
    • Pascale Q: provided that we remove the processes, would it affect the term enrichment analysis?
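The interim possibility raised for phosphorylation can be sketched as a simple rule, offered here only as an aid to discussion. The function name is hypothetical, and the kinase set contains only GO:0016301 (kinase activity); a real check would also need its descendant terms.

```python
from typing import Set

# Kinase MF terms to test against; only the root term is listed in this sketch.
KINASE_MF_TERMS: Set[str] = {"GO:0016301"}


def interim_relation(gene_mf_terms: Set[str]) -> str:
    """Pick the interim relation to phosphorylation for one gene product.

    Genes that also carry a kinase MF annotation keep involved_in; the rest
    get causally_upstream_of_or_within until curators can re-evaluate them.
    """
    if gene_mf_terms & KINASE_MF_TERMS:
        return "involved_in"
    return "causally_upstream_of_or_within"
```

Running this over all gene products annotated to phosphorylation would also give the size of the set that curators need to review.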

BP refactoring

  • Defining “Cellular Process” and “Multi-Organism Process” terms
  • Action item: the comments/examples/notes should be captured, working group to discuss this.

Proteoforms

Establish a working group for when and how to use proteoforms in annotation. (Kimberly with Harold and Li)

Usability issues

  • What do users want?
    • truth (summary/story/model)?
    • fishing expeditions?
    • api access?
    • appropriate licensing
  • Who are our users?
    • geneticists?
    • clinicians?
    • computational biologists?
      • aggregators
  • Use cases: https://github.com/geneontology/go-ontology/issues/13606
  • Perhaps discussion of examples of how the GOC members are engaging with the community and how we can do better in the future (suggested by Helen).
  • Val: What happened to the user survey? PomBase collected 70 responses from our users but I never heard a summary of the findings at a subsequent meeting. Could somebody give an overview of the results?

Additional items - If time allows

Chris: Where should documentation on term usage go in the ontology? We'd like to create a new type of annotation in Protégé, but the details of the implementation need to be worked out.


Viral processes update (Pascale)

Action Items Corvallis 2017/06:

  • Check whether the incorporated changes could affect the host proteins.
  • Document the use cases.

Nothing to report - depends on where documentation will be hosted

Action Item Review

Wednesday noon-ish Fin