Projects stand up meeting 2021-06-16

From GO Wiki

Attendees

  • Present: David, Kimberly, Pascale, Seth, Chris, Huaiyu
  • Regrets:
  • Managers: David, Huaiyu, Kimberly, Pascale, Seth, Suzi, Paul, Chris

Announcements

GOC meeting dates: Oct 13-15 (Wednesday to Friday). Is that our final decision?

Discussion points

User stories / use cases

A user requested annotations for Chlorocebus sabaeus (vervet, or green monkey) in gzipped .gaf format (Helpdesk ticket: https://github.com/geneontology/helpdesk/issues/319). This is certainly something GO users expect us to deliver.

  • Issues:
    • We don't have IEAs, and in any case our plan was to fully load annotations from 142 species and none from other species. Do we have any guidelines for this specific use case? Do we direct users to GOA? Can we set up a special API call to QuickGO?
    • Alternatively, could we generate files (even if only once a year) for many species, outside the 'main pipeline', so that people could still get data for their analyses?
    • This might include TreeGrafter data, since asking users to run TreeGrafter themselves seems non-trivial.
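
As a rough illustration of what a "special API call to QuickGO" could look like, here is a minimal sketch. It assumes QuickGO's annotation download endpoint (`/QuickGO/services/annotation/downloadSearch`), its `taxonId` query parameter, and the `text/gaf` Accept header; these should be checked against the current QuickGO API documentation before relying on them. The function name and the taxon ID used in the example (60711, believed to be the NCBI Taxonomy ID for Chlorocebus sabaeus) are also assumptions, not something agreed in the meeting. The sketch only builds the request and does not send it.

```python
import urllib.parse
import urllib.request

# Assumed QuickGO download endpoint; verify against the QuickGO API docs.
QUICKGO_DOWNLOAD_URL = (
    "https://www.ebi.ac.uk/QuickGO/services/annotation/downloadSearch"
)


def build_gaf_request(taxon_id):
    """Build (but do not send) a request for all annotations of one taxon.

    Hypothetical helper: the parameter name `taxonId` and the `text/gaf`
    media type are assumptions about the QuickGO API.
    """
    query = urllib.parse.urlencode({"taxonId": taxon_id})
    request = urllib.request.Request(QUICKGO_DOWNLOAD_URL + "?" + query)
    # Ask for GAF output rather than the default format.
    request.add_header("Accept", "text/gaf")
    return request


# 60711 is believed to be the NCBI taxon ID for Chlorocebus sabaeus;
# confirm it before use.
vervet_request = build_gaf_request(60711)
print(vervet_request.full_url)
```

Sending the request (for example with `urllib.request.urlopen`) and gzipping the response would give the user a .gaf.gz file, subject to whatever download limits QuickGO applies.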

Release delays

Some delays are caused by upstream sources, and groups cannot always give a new file quickly. Is there a date in the month at which we should decide to go ahead with the release, and use the last correct file for that upstream?

Data load frequency in Noctua

  • Updates are infrequent in Noctua. More frequent data loads, especially for GO terms, would be very useful for curators.
  • This is a long-standing issue that has been discussed in a Noctua ticket, as well as in meetings, jamborees, etc.: https://github.com/geneontology/noctua/issues/710
  • For now this is a manual process that can only be done by Seth. We would need to allocate resources to automate it.
  • Decision: We (Seth) will load the ontology in the 2nd and 4th weeks of the month, on Thursdays.
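
For planning purposes, the load dates implied by this decision can be computed ahead of time. The sketch below assumes the decision means the 2nd and 4th Thursdays of each calendar month (the minutes could also be read as "the Thursday falling in the 2nd and 4th calendar weeks"); the function name is illustrative only.

```python
import datetime


def ontology_load_dates(year, month):
    """Return the 2nd and 4th Thursdays of the given month.

    Assumes "2nd and 4th week, on Thursdays" means the 2nd and 4th
    Thursdays of the calendar month.
    """
    first_day = datetime.date(year, month, 1)
    # date.weekday(): Monday is 0, so Thursday is 3.
    days_until_thursday = (3 - first_day.weekday()) % 7
    first_thursday = first_day + datetime.timedelta(days=days_until_thursday)
    return [
        first_thursday + datetime.timedelta(weeks=1),
        first_thursday + datetime.timedelta(weeks=3),
    ]


for load_date in ontology_load_dates(2021, 7):
    print(load_date.isoformat())
```

Under this reading, the July 2021 loads would fall on 2021-07-08 and 2021-07-22.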

Question from data contributor

Thomas Depuydt and Klaas Vandepoele contacted TAIR and Paul to ask how to annotate a study in which they predicted the function of plant genes of unknown function. Title: "Multi-omics network-based functional annotation of unknown Arabidopsis genes" (link to draft manuscript here - please keep confidential). Excerpt: "we developed a novel automated function prediction (AFP) algorithm (...) Next, we predicted biological process annotations for known and unknown genes in A. thaliana and validated our predictions using experimental protein-DNA and protein-protein interaction data (...)"

  • Proposal by Paul and Pascale was to use RCA. Klaas and Thomas also wanted to record which protein interactor supported the data, which is not allowed in RCA.
  • Decision: Tanya and Leonore will attend next week's managers call to discuss this.


Ongoing projects

MOD imports

General Issues

  • Import projects have progressed to the point where we will need a dedicated software developer to help with the GPAD output.
    • Sierra has raised her hand to help with this.
    • Decision: David, Kimberly and Li will review current GPAD outputs and create tickets for any problems identified.
    • Strategy: David and Kimberly will review and triage existing tickets in Minerva, as many likely pertain to the more complex models and will not be relevant to the MOD imports. Then we will allocate resources to complete this subproject: https://github.com/orgs/geneontology/projects/52.


Pipeline

Priorities

https://docs.google.com/spreadsheets/d/19-i43w0mGQdFYmKXqskVIgrDO-pVTWo1YgjaWa5PI1o/edit#gid=68094648

GOC meeting Action Points

action items

  • Are any of these items high priority?