Manager Call 2020-11-04

From GO Wiki
  • Present: Chris, David, Huaiyu, Laurent-Philippe, Kimberly, Pascale, Seth, Suzi, Paul
  • Regrets: Judy
  • Managers: Chris, David, Huaiyu, Laurent-Philippe, Kimberly, Pascale, Seth, Suzi, Judy, Paul


Agenda

Follow up on ACTION ITEMS

  • ncRNA annotations, id space, etc. Suggestion is to involve RNAcentral in this. RNAcentral has their consortium meeting on Nov. 20th
    • Pascale/Paul contact RNAcentral (Anton Petrov): to understand/document their process and see whether we should attend the consortium meeting
    • Not yet done
  • Deadline for GAF 2.2: Kimberly to generate test files and send the announcement to go-friends, the go-announcement repo, biostars, and others (also need to have a 2.2 -> 2.1 converter)
  • Retrofitting gp2term relations: Kimberly: this is pretty urgent, should be done before the next grant: Kimberly/David to propose deliverables and try to estimate resources so we can prioritize this
    • Discussion:
      • Paul suggests that we do the retrofitting in the next grant. The goal was to get groups to start thinking about their process
      • Chris wonders whether there is a partial path, for example using IBAs to repair existing annotations. David is not comfortable with this
      • Paul says it's OK that IBA or ISS use a more specific relation because that second annotation process is a new annotation and may be using a higher confidence annotation relation
      • Discussion was that the IBAs will anyway provide the annotations with the more specific gp2term relation. The annotations should then not be considered redundant
      • Kimberly summarizes that we will not right now try to harmonize the gp2term relation across orthologs (or even for a single gene).
      • -> Right. We will put this in the next grant.
        • AI: Pascale: add 'retrofitting' to the grant ideas DONE
        • AI: Kimberly: adjust announcement as appropriate


  • Rhea2GO mappings: Pascale to contact Alex with help if needed from Alan, Jim, Anne, Harold, Peter, others DONE in progress
  • syngo: uniprot and MOD harmonization
    • Problem is that there is more than 1 ID space for a species due to SynGO annotating UniProt entries for MODs. We need to map to MOD IDs to make it consistent with other data.
    • Paul talked with SynGO. Outcome: they have used some UniProt identifiers (maybe even TrEMBL identifiers) to describe isoforms. Paul needs to check again why they didn't use specific UniProt isoform identifiers (xxxxx-n). The best dataflow would be for all SynGO data to go through P2GO so that it gets exported with MOD IDs.
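
To illustrate the ID-space problem above, here is a minimal sketch of recognizing a UniProt isoform identifier (accession followed by -n) and looking up a MOD ID for its canonical accession. This is not the P2GO dataflow; the function names and the `uniprot_to_mod` lookup table are hypothetical, and any real mapping would come from the MODs or UniProt.

```python
import re
from typing import Dict, Optional, Tuple

# UniProt accession pattern, optionally followed by an isoform suffix "-n".
ISOFORM_RE = re.compile(
    r"^(?P<acc>[OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2})"
    r"(?:-(?P<iso>[0-9]+))?$"
)


def split_isoform(accession: str) -> Tuple[str, Optional[int]]:
    """Split 'P12345-2' into ('P12345', 2); isoform is None if absent."""
    match = ISOFORM_RE.match(accession)
    if match is None:
        raise ValueError(f"not a UniProt accession: {accession!r}")
    iso = match.group("iso")
    return match.group("acc"), int(iso) if iso else None


def to_mod_id(accession: str, uniprot_to_mod: Dict[str, str]) -> Optional[str]:
    """Look up the MOD ID for the canonical accession (hypothetical table)."""
    canonical, _ = split_isoform(accession)
    return uniprot_to_mod.get(canonical)
```

Entries for which `to_mod_id` returns None are the ones that would keep a second ID space alive for that species, so they are the cases worth reviewing by hand.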

Project updates

Blocking issues/delays/changes in project scope

Does anyone have blocking issues, delays, or changes in project scope to report to managers/PIs?


Review outstanding action items

GAF 2.2 (#4h) and GPAD/GPI 2.0 (#4i) (Kimberly)

  • We will need to dedicate more resources to getting GAF 2.2 out the door (smoothly). Who should this be? Kimberly, Pascale, Seth, Eric???

https://github.com/geneontology/go-site/issues/1558

  • Urgent: we need to indicate what relations to use with asserted root annotations to BP and CC.
    • DECISION: Most consistent with our data model is that all genes enable some MF and all MFs are part of some BP => involved_in for BP root annotations, and is_active_in for CC root annotations
  • Action Item Seth: Create a rule to repair the annotations that would be submitted as GAF2.1
  • Action Item Seth: Mention we'll provide a GAF 2.2 -> 2.1 converter & provide the converter
  • Action Item Seth: Look at OWL tools to see if they can handle extra relations
    • Not expected to be an issue, we'll see when we load in snapshot
  • Action Item Laurent-Philippe: Check that there would be no impact on the API
    • Needs to be checked
  • Action Item Kimberly: Create example GAF2.2 (~10 lines) - cover all examples/combinations of qualifiers and relations
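
The repair rule and the 2.2 -> 2.1 converter mentioned in the action items can be sketched as follows. This is only an illustration under stated assumptions, not the rule or converter that will be delivered: it assumes tab-separated GAF lines with the qualifier in column 4 and the GO ID in column 5, applies the relations from the decision above to root annotations only, and treats `enables` for the MF root as following from "all genes enable some MF". The function names are hypothetical.

```python
# Root term of each aspect mapped to the relation agreed on the call.
ROOT_RELATIONS = {
    "GO:0003674": "enables",       # molecular_function root (assumed default)
    "GO:0008150": "involved_in",   # biological_process root
    "GO:0005575": "is_active_in",  # cellular_component root
}

# Qualifiers that GAF 2.1 already allows in column 4.
GAF21_QUALIFIERS = {"NOT", "contributes_to", "colocalizes_with"}

QUALIFIER_COL = 3  # column 4, zero-based
GO_ID_COL = 4      # column 5, zero-based


def repair_root_annotation(line: str) -> str:
    """Add the agreed relation to a root annotation submitted without one."""
    line = line.rstrip("\n")
    if line.startswith("!"):
        return line  # header/comment lines are passed through
    cols = line.split("\t")
    relation = ROOT_RELATIONS.get(cols[GO_ID_COL])
    if relation is None:
        return line  # non-root annotations are out of scope for this sketch
    qualifiers = [q for q in cols[QUALIFIER_COL].split("|") if q]
    if not any(q != "NOT" for q in qualifiers):
        qualifiers.append(relation)  # keep a leading NOT, then the relation
    cols[QUALIFIER_COL] = "|".join(qualifiers)
    return "\t".join(cols)


def downgrade_to_gaf21(line: str) -> str:
    """Drop relations that GAF 2.1 does not know and fix the version header."""
    line = line.rstrip("\n")
    if line.startswith("!gaf-version:"):
        return "!gaf-version: 2.1"
    if line.startswith("!"):
        return line
    cols = line.split("\t")
    kept = [q for q in cols[QUALIFIER_COL].split("|") if q in GAF21_QUALIFIERS]
    cols[QUALIFIER_COL] = "|".join(kept)
    return "\t".join(cols)
```

Running `downgrade_to_gaf21` on the output of `repair_root_annotation` gives back the original 2.1 line, which is a cheap round-trip check to run against the example GAF 2.2 file.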

Projects updates

2020 Priorities: https://docs.google.com/document/d/1URLkEQBewlzD-NVAjbPLWD5Gu-5xq0RysTrm1GMbuiM/edit#

Creating a static page for SPARQL endpoint

Mentioned in the NAR paper - this needs to be prioritized and resourced: https://github.com/geneontology/go-site/issues/1549

  • Laurent-Philippe says he's not sure it works for programmatic access, needs to be checked
  • Chris says it's important to do this since we mention it in the paper
  • Need to make the static landing page

Legacy data (#6)

Update Laurent-Philippe https://geneontology-test.s3.amazonaws.com/index.html


Discussion Nov 4:

  • Still need to copy data to the geneontology archive
  • Need to agree on file explorer page
  • update website
  • update documentation
  • roll into the pipeline


Deliverables

  • Ontology and annotations for as many prior releases as possible, available in /releases and on the GO archive in zenodo
    • GO archive: 2002-2011 (CVS) ; 2011-2018 (SVN)
      • OBO files: gene_ontology.obo (1.0 start 2004-02), gene_ontology_ext.obo = current go.obo (start 2009-03) ; discard gene_ontology-edit, gene_ontology-write, gene_ontology-1.2.obo
      • Slims: obo slims from mid 2004, discard "archived slims" which started in 2003
      • Adding all mysql dumps from archive.geneontology.org in mysql_dumps/
    • Note: a lot of jcvi GAFs from 2004-04 to 2011-03 (CVS) ; SVN: 2011-04 = single file to now; release.geneontology: 2018 to 2019-11 (not present after)
  • For these prior releases, up to when current release started, simple ontology files, and annotation files for top organisms (12 “reference genome” species) available for easy download from GO website (has not been prioritized yet)


GO reference species

Data set is twice as large as what we currently load, which slows down the pipeline significantly. Will look into optimization.

END OF CALL

Noctua Imports (#3a)

Action items: Kimberly and David:

  • finalize annotation properties specs GPAD2.0 (metadata/history)
  • produce examples of issues with GPAD output
  • draft an agenda for a working meeting (~2 days) to define the requirements for GPAD output
  • prioritize those issues so that we have different milestones and a clearer view of when we could work with them (e.g. when we could have an 80% or 90% solution)

Next steps:

  1. New MGI/WB files to be QC'ed (David & Kimberly)
  2. Finalize history/annotation properties GPAD2.0 specs
  3. Sp


Noctua (#3e)

  • ui implementations
    • bug fixes
      • Ongoing; Tremayne is submitting some PRs to noctua-dev for testing
      • Adding: connect to existing nodes; re-use of fields in a given model
    • ART
      • Tremayne has a prototype that David and Kimberly can test. The prototype has much of the desired functionality but does not yet use the necessary functionality from Minerva, which will be developed later when Ben is available to work on it.

https://github.com/geneontology/noctua-annotation-review/issues/21

  • testing and release framework

Minerva support for noctua


Pipeline (ongoing) (#4)

  • 4a. Dustin is primary point of contact for pipeline failures, by December (Seth)

-> how is this going ?

  • 4b. Analysis of failure points in the pipeline and a plan to address them
  • 4c. Detection of large changes from last release, to find upstream issues earlier (Laurent-Philippe) snapshot -> prioritize ?

  • 4d. Drop uniprot_noiea, add GO-reference-species
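
For item 4c, one possible shape for the check is a comparison of per-group annotation counts between the last release and the current snapshot. The sketch below is an assumption about how this could work, not the pipeline's implementation: the function name, the input dictionaries, and the 25% threshold are all hypothetical.

```python
from typing import Dict


def flag_large_changes(
    previous: Dict[str, int],
    current: Dict[str, int],
    threshold: float = 0.25,  # hypothetical cutoff: 25% relative change
) -> Dict[str, float]:
    """Return groups whose annotation count changed by more than `threshold`.

    Values are relative changes, e.g. -0.4 for a 40% drop. A group that is
    new in `current` is reported with a change of infinity.
    """
    flagged = {}
    for group in sorted(set(previous) | set(current)):
        before = previous.get(group, 0)
        after = current.get(group, 0)
        if before == 0:
            change = float("inf") if after else 0.0
        else:
            change = (after - before) / before
        if abs(change) > threshold:
            flagged[group] = change
    return flagged
```

Running a check like this on the snapshot rather than at release time is what would surface upstream issues earlier; a group that disappears entirely shows up with a change of -1.0.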


Pathways2GO paper (#1bi)