Manager Call 2020-08-12: Difference between revisions

From GO Wiki
Latest revision as of 11:56, 12 August 2020

Agenda

  • Agenda: David
  • Minutes: Kimberly
  • Present: Chris, David, Huaiyu, Judy, Kimberly, Laurent-Philippe, Pascale, Paul, Seth, Suzi

Long-term release for GO and Panther?

Huaiyu got feedback from users that they want to be able to reproduce their results, either for double-checking or for article review. For some users and use cases, GO may be evolving too fast. This is linked to our discussions on Versioned Enrichment, and there are multiple ways to address it, e.g.:

  • have a long-term release (LTR, common in software development) updated maybe only every 6 or 12 months. This would run in parallel with the monthly release, so a user could access both, but the LTR would be more stable for citation/reproducibility over a longer period of time
  • always keep the last 2-3 versions of the release online, so a user could refer to them (not the most cost-efficient option, but easy to achieve)
  • have a system to automatically spin up any version of GO + Panther on demand. This is possible and could be cost-efficient (e.g. accumulate requests over the day and launch them as a batch within a short time window), but would require some development work
    • Panther has a monthly release, but this can be confusing to users if they get different analysis results from one month to the next
    • We want GO and Panther to be in sync as much as possible, so this should be a shared decision
    • Is a monthly release too frequent for users, i.e. too many changes?
    • Do we have any examples of analyses that changed enough to warrant communication with us?
    • Are less frequent releases easier to archive?
    • Can we have parallel release pipelines, e.g. monthly to reflect activity vs less frequent releases to facilitate reproducibility of analyses?
      • If we had parallel releases, we'd need to think through the behavior of tools and users to minimize potential confusion
    • Another point: coordinating releases with other groups, like the Alliance
    • Could provide an R package or a web interface to allow people to re-run their analyses
    • No immediate resolution, but we agree that this would be a good issue to raise with the GO SAB at the October meeting
      • We need to supply the SAB with the pros and cons of the different approaches
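
To help compare the first two options above when preparing the pros and cons, here is a minimal Python sketch. It is only an illustration: the function names, the list of release dates, and the 6-month cadence are assumptions made for the example, not an agreed design or an existing GO or Panther tool.

```python
from datetime import date


def months_between(a, b):
    """Whole calendar months from date a to date b."""
    return (b.year - a.year) * 12 + (b.month - a.month)


def retained_releases(releases, keep_last=3):
    """Option 2: keep only the most recent `keep_last` releases online."""
    return sorted(releases)[-keep_last:]


def long_term_releases(releases, every_months=6):
    """Option 1: pick the first release, then the next release that is
    at least `every_months` months after the previous LTR, and so on."""
    ltr = []
    for release in sorted(releases):
        if not ltr or months_between(ltr[-1], release) >= every_months:
            ltr.append(release)
    return ltr


# Hypothetical monthly release dates, January to August 2020.
monthly = [date(2020, month, 1) for month in range(1, 9)]

print("Kept online:", retained_releases(monthly))
print("Long-term releases:", long_term_releases(monthly))
```

With these hypothetical dates, the retention policy keeps the June, July and August releases, while a 6-month LTR cadence would designate the January and July releases.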

GO Baltimore

Basic page created, with a Google Doc to start filling in the agenda: http://wiki.geneontology.org/index.php/October_2020_Remote_GOC_Meeting_Logistics

Meeting folders in GO's Google drive

User meeting, 3 half days

From the last meeting, user preferences were:

- enrichment analysis

- data access and integration

- hands-on, jupyter notebooks

- ontology

- curation

If it is up to a general audience, then we can customize it however we want. What do we want to communicate to a broader audience? Laurent-Philippe will contact Michelle.

GO-CAM Jamboree

  • Proposal to have a GO-CAM curation jamboree sometime this fall
  • How critical will it be to have Noctua on a more secure production footing by this time?

Not yet sure whether this will be before or after the GO meeting. Possible concern with the fire season.

Spreadsheet for collecting suggested BPs to model

Kimberly will format the spreadsheet a bit more and then send out to GOC members to add their suggestions (by early September).

Goal will be to pick 4-5 related processes to model during the jamboree that will provide a good cross-section of biology.

Prior to jamboree:

  1. Create a documented demo model for each type of process included in the jamboree
  2. Shore up Noctua and relations documentation

Name: GO-CAMboree ?

GPAD/GPI

There have been a few meetings about GPI content since last week

Form a working group to discuss interoperability of entity file formats (Alliance + others, e.g. PRO, UniProt)?

Legacy data

Full SVN release translation: https://geneontology-tmp.s3.amazonaws.com/index.html#releases-full/

Working on archive.geneontology.org (1.2Tb) and looking at CVS (no backup at the moment)

  • Laurent-Philippe working on this, can add archived data (annotations and ontology) from early-mid 2000s
  • There is a github repo for this

Moving groups to Noctua (#3)

Kimberly has a GPAD file with full history and is looking into loading it.

David/Lori have posted the MGI GPI 2.0 file on our test site for the GOC group to pick up.

  • David will review the latest MGI GPI file containing protein-containing complexes.
  • Lori would like feedback on the file from GOC developers when their parsers are ready.

ART

Specs here to discuss: https://docs.google.com/document/d/1iaqE1xEhvgT0OxfF8YUN36Gii962mQQ-ouTlP8OppGM/edit#heading=h.gx5r3bwu5gee

Converging on something stable with general approval

  • Development is on track, still have a few issues to discuss, but are anticipating a working prototype by October GOC meeting

GO pipeline solidified / Infrastructure / DevOps (#4)

Discussion on taxon constraints? Anything else to report?

Jim has created a plugin for Protege that allows editors to see the inferred taxon constraints.

  • Pascale will talk with Jim about status of this and possibly implementing more taxon constraints.
  • Jim's plugin helps ontology editors visualize taxon constraints more easily.
  • We do want to show taxon constraints in AmiGO when we have resources to do this.

GO release

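Some issues:

  • https://github.com/geneontology/go-site/issues/1541
  • https://github.com/geneontology/amigo/issues/602
  • These issues are addressed; August release has been approved.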