Projects stand up meeting 2021-06-16
- Present: David, Kimberly, Pascale, Seth, Chris, Huaiyu
- Managers: David, Huaiyu, Kimberly, Pascale, Seth, Suzi, Paul, Chris
GOC meeting dates: Oct 13-15 (Wed to Friday) -> is that our final decision?
User stories / use cases
A user requests annotations for Chlorocebus sabaeus (vervet, or green monkey) in gzipped .gaf format (Helpdesk ticket: https://github.com/geneontology/helpdesk/issues/319). This is certainly something GO users expect us to deliver.
- We don't have IEAs, and in any case our plan was to fully load annotations from 142 species and none from other species. Do we have any guidelines for this specific use case? Do we direct users to GOA? Can we set up a special API call to QuickGO?
- Alternatively, could we generate files (even if only once a year) for many species, outside the 'main pipeline', so that people could still get data for their analyses?
- This might include TreeGrafter data, as asking users to run TreeGrafter themselves seems non-trivial.
Some delays are caused by upstream sources, and groups cannot always give a new file quickly. Is there a date in the month at which we should decide to go ahead with the release, and use the last correct file for that upstream?
- Decision: we give groups 5 working days to fix any issues we find. If the issue is 'blocking', we may decide to use the previous version of the upstream's data. If they expect the fix will take more than 5 days, we go ahead with the release.
- Documented in http://wiki.geneontology.org/index.php/Release_Pipeline#Manual_QC_step_during_GO_data_release_process and https://docs.google.com/document/d/1xzEwyEON6LqgMFe_Sjb1Fa-B-gYBVXGIfnEjP2656mo/edit#
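The 5-working-day window in the decision above can be sketched as a small date calculation. This is only an illustration, not part of the release pipeline: the function names are hypothetical, and it assumes working days are Monday to Friday with no holiday calendar.

```python
from datetime import date, timedelta


def add_working_days(start: date, days: int) -> date:
    """Return the date `days` working days after `start`.

    Assumption: working days are Monday to Friday; holidays are ignored.
    """
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday to Friday
            remaining -= 1
    return current


def wait_for_fix(expected_fix_working_days: int, grace_days: int = 5) -> bool:
    """Return True if the release should wait for the upstream fix.

    Per the decision: if the fix is expected to take more than the
    grace period, the release goes ahead without waiting.
    """
    return expected_fix_working_days <= grace_days
```

For example, an issue reported on Wednesday 2021-06-16 would give the upstream group until Wednesday 2021-06-23 under these assumptions.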
Data load frequency in Noctua
- Updates are infrequent in Noctua. More frequent data loads, especially for GO terms, would be very useful for curators.
- This is a long-standing issue that has been discussed in a Noctua ticket, as well as in meetings, jamborees, etc.: https://github.com/geneontology/noctua/issues/710
- For now this is a manual process that can only be done by Seth. We would need to allocate resources to automate it.
- Decision: We (Seth) will load the ontology in the 2nd and 4th weeks of the month, on Thursdays.
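To make the schedule concrete, the load dates for a given month can be computed as below. This is a sketch that reads the decision as "the 2nd and 4th Thursday of each month", which is an interpretation; the function name is hypothetical.

```python
import calendar
from datetime import date


def ontology_load_dates(year: int, month: int) -> list[date]:
    """Return the 2nd and 4th Thursdays of the given month.

    Assumption: "2nd and 4th week, on Thursdays" means the 2nd and
    4th Thursday that fall within the calendar month.
    """
    thursdays = [
        date(year, month, week[calendar.THURSDAY])
        for week in calendar.monthcalendar(year, month)
        if week[calendar.THURSDAY] != 0  # 0 marks a day outside the month
    ]
    return [thursdays[1], thursdays[3]]
```

Under this reading, the loads for June 2021 would fall on 2021-06-10 and 2021-06-24.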
Question from data contributor
Thomas Depuydt and Klaas Vandepoele contacted TAIR and Paul to ask how to annotate a study in which they predicted the function of plant genes of unknown function. Title: "Multi-omics network-based functional annotation of unknown Arabidopsis genes" (link to draft manuscript here - please keep confidential). From the manuscript: "we developed a novel automated function prediction (AFP) algorithm (...) Next, we predicted biological process annotations for known and unknown genes in A. thaliana and validated our predictions using experimental protein-DNA and protein-protein interaction data (...)"
- Proposal by Paul and Pascale was to use RCA. Klaas and Thomas also wanted to include which protein interactor supported the data, which is not allowed in RCA.
- Decision: Tanya and Leonore will attend next week's managers call to discuss this.
- Import projects are progressing such that we will need a dedicated software developer to help with the GPAD output.
- Sierra has raised her hand to help with this.
- Decision: David, Kimberly and Li will review current GPAD outputs and create tickets for any problem identified.
- Strategy: David and Kimberly will review existing tickets in minerva and triage them, as many likely pertain to the more complex models and will not be relevant to the MOD imports. Then we will allocate resources to complete this subproject: https://github.com/orgs/geneontology/projects/52
- http vs https for ORCIDs
- We should standardize this in Noctua and the users.yaml; we're starting to see issues crop up with the imports
- Decision: YES - Seth created a ticket - https://github.com/geneontology/go-site/issues/1708
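A minimal sketch of what standardizing could look like: extract the 16-character ORCID iD and rebuild the URI with one scheme. The function name is hypothetical, and the choice of https as the target is an assumption (the notes do not record which scheme was chosen), so it is left as a parameter. The iD in the usage note is the sample iD from ORCID's own documentation.

```python
import re

# An ORCID iD is four groups of four characters; the final character
# is a checksum that may be a digit or "X".
ORCID_ID = re.compile(r"(\d{4}-\d{4}-\d{4}-\d{3}[\dX])$")


def normalize_orcid(value: str, scheme: str = "https") -> str:
    """Rewrite an ORCID URI (or bare iD) to a single canonical scheme.

    Assumption: `scheme` defaults to https; pass "http" if that is the
    form the group settles on.
    """
    match = ORCID_ID.search(value.strip().rstrip("/"))
    if match is None:
        raise ValueError(f"not a recognizable ORCID: {value!r}")
    return f"{scheme}://orcid.org/{match.group(1)}"
```

With this, `http://orcid.org/0000-0002-1825-0097`, `https://orcid.org/0000-0002-1825-0097` and the bare iD all map to the same string, so entries in Noctua and users.yaml would compare equal after normalization.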
- Improve downloads
- Discussion: needs review by Chris
GOC meeting Action Points
- Are any of these items high priority?