Manager Call 2020-03-18
Agenda
- Agenda: Kimberly
- Minutes: Huaiyu
- Present: Kimberly, Huaiyu, David, Chris, Pascale, Judy, Suzi, Paul T, Laurent-Phillip, Seth, Nomi
- Regrets:
Discussion points
GO Release
Alliance is ready to make next release (1.5 weeks behind)
- Needs new GO release to get correct GAF (affects auto descriptions)
- GO is waiting on WB(?) ETA?
- For WB, the fixes went in in January, but due to our release cycle, they won't make it to WB production until April
- We could perhaps do a one-time pull from the WB development site?
Proposed solutions:
- Ask Alliance to use old GO release (20191202 was last "good" for SGD)
- Alliance pushes forward with known bad data
- Alliance waits for GO release
- Other?
GO meeting Paris
- What do we need to do to prepare for a virtual meeting?
- Do we try a "node" organization?
- What tech to use: Zoom, GoToMeeting?
- How to make those meetings more interactive / engaging?
- Mural as a shared dashboard? (https://mural.co)
Proposed File Format Changes
- GPAD/GPI 2.0 specs released
- Issue to resolve: same protein sequence maps to different transcripts and genes - one entry in gpi or multiple?
- Propose one line for clarity (kmva)
- Examples from current WB and UniProtKB gpi files:
Database | Identifier | Symbol | Name | Synonyms | Type | Taxon | Parent Object ID | DB xrefs | Properties |
---|---|---|---|---|---|---|---|---|---|
WB | CE01192 | MSD-1 | Major Sperm protein Domain containing | CELE_F44D12.3 | protein | taxon:6239 | WB:F44D12.3\|WB:F44D12.5\|WB:F44D12.7\|WB:C35D10.11 | UniProtKB:G5EEJ7 | |
UniProtKB | G5EEJ7 | msd-2 | MSP domain-containing protein | msd-2\|msd-1\|msd-3\|msd-4\|C35D10.11\|CELE_C35D10.11\|CELE_F44D12.3\|CELE_F44D12.5\|CELE_F44D12.7\|F44D12.3\|F44D12.5\|F44D12.7 | protein | taxon:6239 | WB:WBGene00009680\|WB:WBGene00009682\|WB:WBGene00009684\|WB:WBGene00016447 | | db_subset=TrEMBL |
- Also, we proposed last week to make an incremental update to GAF to allow for all gp2term relations.
- Create specs for this and share internally first for comments?
- How soon do we need to begin preparing for downstream effects on GOC products, tools?
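To make the "one entry or multiple?" question for the gpi concrete, here is a minimal sketch of what the one-line proposal amounts to, using the WB example values from the table above. The multi-line input, the four-field tuple layout, and the function name `collapse_gpi_entries` are assumptions made for illustration only; they are not part of the GPAD/GPI 2.0 spec.

```python
def collapse_gpi_entries(entries):
    """Merge entries sharing (database, identifier) into one entry.

    Each entry is a hypothetical (database, identifier, symbol, parent) tuple.
    Parents are pipe-joined in first-seen order; the first symbol seen is kept.
    """
    merged = {}
    for db, identifier, symbol, parent in entries:
        key = (db, identifier)
        if key not in merged:
            merged[key] = {"symbol": symbol, "parents": []}
        # Skip parents that were already recorded for this protein.
        if parent not in merged[key]["parents"]:
            merged[key]["parents"].append(parent)
    return [
        (db, identifier, rec["symbol"], "|".join(rec["parents"]))
        for (db, identifier), rec in merged.items()
    ]


# Hypothetical "multiple entries" form: one entry per parent object.
multi_line = [
    ("WB", "CE01192", "MSD-1", "WB:F44D12.3"),
    ("WB", "CE01192", "MSD-1", "WB:F44D12.5"),
    ("WB", "CE01192", "MSD-1", "WB:F44D12.7"),
    ("WB", "CE01192", "MSD-1", "WB:C35D10.11"),
]

one_line = collapse_gpi_entries(multi_line)
for entry in one_line:
    print("\t".join(entry))
```

With one line per protein, the identifier stays unique in the file, at the cost of consumers having to split the parent column on the pipe character.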
MGI and WB Imports
- Status updates
- WB
- Alex will work on preparing a test GPAD file with annotation history for WB - hopes to have it on the GOA ftp site some time next week
- MGI
- Have a periodic follow-up call to review progress?
- WB
Pathway Genome Databases and the Alliance
See Chris' email
- Explore bringing in other existing curated pathways in a similar way that we did for Reactome
- Follow-up to Alliance call last week with Peter K.?
Publications
- Val's matrix paper is available as a draft
- Authors have reviewed and made comments
- Do the GO PIs (besides Chris who is last author) also need to read?
Proposed data flow from Alex (GOA)
- Do we still need to discuss anything here?
- Based on decision at PI retreat to only show in AmiGO the 142 species also included in PANTHER, i.e. fewer species but all available annotations for those species.
- This would be a new pipeline to supplant the current UniProtGOA-all pipeline, i.e. the MODs will still be the source of their species annotations
1. All annotation sources/files are pulled to GO based on yaml files. (already implemented)
2. GOA pulls them from GO based on the yaml file data (partially implemented)
3. GOA performs all currently known sanity checks and constraints (already implemented)
4. GOA sends out reports to go-quality and the group responsible for the annotations (partially implemented)
5. GOA generates gpa/gpi (maybe gaf if you ask nicely) for the 142 taxa, covering all annotations that passed step 3 (to be implemented)
6. GO uses the 142-taxon file for all its pipeline needs; there is no need to run its own set of checks, and the file can be indexed directly into AmiGO and other tools. (to be implemented)
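As a discussion aid, steps 3 to 5 above could be sketched roughly as follows. This is not the GOA implementation: the check names, the annotation dict layout, the identifiers, and the two-entry taxon set (a stand-in for the 142-species list) are all made up for illustration.

```python
def failed_checks(annotation, checks):
    """Return the names of all checks the annotation fails (step 3)."""
    return [name for name, check in checks if not check(annotation)]


def process_annotations(annotations, allowed_taxa, checks):
    """Split annotations into an export list (step 5) and a report (step 4)."""
    export, report = [], []
    for ann in annotations:
        failures = failed_checks(ann, checks)
        if failures:
            # Step 4: failures go into the report for go-quality / the source group.
            report.append((ann["id"], ann["source"], failures))
        elif ann["taxon"] in allowed_taxa:
            # Step 5: only passing annotations for the selected taxa are exported.
            export.append(ann)
    return export, report


# Hypothetical checks; the real sanity checks are not listed in these notes.
CHECKS = [
    ("has_reference", lambda a: bool(a.get("reference"))),
    ("has_go_term", lambda a: a.get("term", "").startswith("GO:")),
]

# Stand-in for the 142-species list.
ALLOWED_TAXA = {"taxon:6239", "taxon:10090"}

# Made-up annotations, for illustration only.
annotations = [
    {"id": "EX:1", "source": "WB", "taxon": "taxon:6239", "term": "GO:0000001", "reference": "EX_REF:1"},
    {"id": "EX:2", "source": "WB", "taxon": "taxon:6239", "term": "GO:0000002", "reference": ""},
    {"id": "EX:3", "source": "OTHER", "taxon": "taxon:0", "term": "GO:0000003", "reference": "EX_REF:3"},
]

export, report = process_annotations(annotations, ALLOWED_TAXA, CHECKS)
print([a["id"] for a in export])
print(report)
```

In this sketch the checks run on every annotation before the taxon filter, so groups outside the 142 species would still receive reports; whether that is the intended behaviour is an open question for the proposal.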