Manager Call 2020-08-12
- Agenda: David
- Minutes: Kimberly
- Present: Chris, David, Huaiyu, Judy, Kimberly, Laurent-Philippe, Pascale, Paul, Seth, Suzi
Long-term release for GO and Panther?
Huaiyu got feedback from users that they want to be able to reproduce their results, either for double-checking or for an article review. For some users and use cases, GO may be evolving too fast. This is linked to our discussions on Versioned Enrichment, and there are multiple ways to solve it, e.g.:
- have a long-term release (LTR, common in software development) updated maybe only every 6 or 12 months. This would run in parallel with the monthly release, so a user could access both, but the LTR would be more stable for citation/reproducibility over a longer period of time
- always keep the last 2-3 versions of the release online, so a user could refer to them (not the most cost-efficient, but easy to achieve)
- have a system to automatically spin up any version of GO + Panther on demand. Possible, and could be cost-efficient (e.g. accumulate requests over the day and launch them as a batch within a short period of time), but would require some development work
- Panther has a monthly release, but this can be confusing to users if they get different analysis results from one month to the next
- We want GO and Panther to be in sync as much as possible, so this should be a shared decision
- Is a monthly release too frequent for users, i.e. too many changes?
- Do we have any examples of analyses that changed enough to warrant communication with us?
- Are less frequent releases easier to archive?
- Can we have parallel release pipelines, e.g. monthly to reflect activity vs less frequent releases to facilitate reproducibility of analyses?
- If we had parallel releases, we'd need to think through the behavior of tools and users to minimize potential confusion
- Another point: coordinating releases with other groups, like the Alliance
- Could provide an R package or a web interface to allow people to re-run their analyses
- No immediate resolution, but we agree that this would be a good issue to raise with the GO SAB at the October meeting
- We need to supply the SAB with the pros and cons of the different approaches
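As a rough illustration of the on-demand option above, the batching idea could be sketched as follows. This is only a minimal sketch under assumptions: the function name, the request format, and the release dates are hypothetical and do not describe any existing GO or Panther service.

```python
from collections import defaultdict


def batch_by_release(requests):
    """Group (user, release) requests so each release is spun up only once.

    `requests` is an iterable of (user, release) pairs accumulated over the day.
    Both the pair format and the release identifiers are hypothetical.
    """
    batches = defaultdict(list)
    for user, release in requests:
        batches[release].append(user)
    return dict(batches)


# Hypothetical requests collected during one day
requests = [
    ("alice", "2020-06-01"),
    ("bob", "2020-07-16"),
    ("carol", "2020-06-01"),
]

batches = batch_by_release(requests)
for release, users in batches.items():
    # A real system would launch the archived release here and run all queued analyses
    print(f"launch release {release} once for {len(users)} queued request(s)")
```

Grouping by release means the cost of spinning up an archived version is paid once per batch rather than once per request.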
Basic page created, with a Google Doc to start filling in the agenda: http://wiki.geneontology.org/index.php/October_2020_Remote_GOC_Meeting_Logistics
Meeting folders are in GO's Google Drive
User meeting, 3 half days
From the last meeting, users' preferences were:
- enrichment analysis
- data access and integration
- hands-on, Jupyter notebooks
If it is open to a general audience, then we can customize it however we want. What do we want to communicate to a broader audience? Laurent-Philippe will contact Michelle.
- Proposal to have a GO-CAM curation jamboree sometime this fall
- How critical will it be to have Noctua on a more secure production footing by this time?
Not sure yet whether this will be before or after the GO meeting. Possible concern with the fire season.
Kimberly will format the spreadsheet a bit more and then send it out to GOC members to add their suggestions (by early September).
Goal will be to pick 4-5 related processes to model during the jamboree that will provide a good cross-section of biology.
Prior to jamboree:
- Create a documented demo model for each type of process included in the jamboree
- Shore up Noctua and relations documentation
Name: GO-CAMboree?
There have been a few meetings about GPI content since last week
Form a working group to discuss interoperability of entity file formats (Alliance + others, e.g. PRO, UniProt)?
Full SVN release translation: https://geneontology-tmp.s3.amazonaws.com/index.html#releases-full/
Working on archive.geneontology.org (1.2 TB) and looking at CVS (no backup at the moment)
- Laurent-Philippe is working on this and can add archived data (annotations and ontology) from the early to mid 2000s
- There is a GitHub repo for this
Moving groups to Noctua (#3)
Kimberly has a GPAD file with full history and is looking into loading it.
David/Lori have posted the MGI GPI 2.0 file on our test site for the GOC group to pick up.
- David will review the latest MGI GPI file containing protein-containing complexes.
- Lori would like feedback on the file from GOC developers when their parsers are ready.
Specs here to discuss: https://docs.google.com/document/d/1iaqE1xEhvgT0OxfF8YUN36Gii962mQQ-ouTlP8OppGM/edit#heading=h.gx5r3bwu5gee
Converging on something stable with general approval
- Development is on track; there are still a few issues to discuss, but we are anticipating a working prototype by the October GOC meeting
GO pipeline solidified / Infrastructure / DevOps (#4)
Discussion on taxon constraints? Anything else to report?
Jim has created a plugin for Protege that allows editors to see the inferred taxon constraints.
- Pascale will talk with Jim about status of this and possibly implementing more taxon constraints.
- Jim's plugin helps ontology editors visualize taxon constraints more easily.
- We do want to show taxon constraints in AmiGO when we have resources to do this.
- These issues are addressed; the August release has been approved.