7 Jan 2020 PAINT Conference Call


Present

Pascale, Huaiyu, Dustin, Anushya

Discussion points

Data issues

I think Dustin has some data/forward tracking issues he’d like to discuss. Dustin, would you please list those or send links to tickets?

ID spaces

I would like to discuss the ID space that we use. For AspGD, for example, we are using UniProt, so Aspergillus has some annotations under AspGD IDs and its IBAs under UniProt IDs. We should be using a single ID space for each species. See for example:

Discussion:

  • Huaiyu: Consortium needs to provide guidelines for name spaces for each species.
  • Pascale: The dataset yaml files describe all sources that provide their own namespaces: https://github.com/geneontology/go-site/tree/master/metadata/datasets
  • We noted that the two links above point to different taxa: Aspergillus fumigatus Af293 (Taxid:330879) for the IBA annotations and Aspergillus fumigatus (Taxid:746128) for the other
  • The 2018-04 RefProt README file (the version used for v.14.1), which we use to map 5-letter codes to taxon IDs, maps ASPFU to 330879:

UP000002530 330879 ASPFU 9647 0 9652 Neosartorya fumigata (strain ATCC MYA-4609 / Af293 / CBS 101355 / FGSC A1100) (Aspergillus fumigatus)
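The code-to-taxon mapping read from a README row like the one above can be sketched as follows. This is only an illustration: the column order (proteome ID, taxon ID, 5-letter code, three numeric counts, species name) is inferred from this single row, and the function names `parse_readme_row` and `build_code_to_taxon` are hypothetical, not part of any PAINT or PANTHER code.

```python
import re

# Assumed layout, inferred from the single row quoted in these minutes:
# proteome ID, taxon ID, 5-letter code, three numeric counts, species name.
README_ROW = re.compile(
    r"^(UP\d+)\s+(\d+)\s+([A-Z0-9]+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(.+)$"
)


def parse_readme_row(line):
    """Return the fields of one README row, or None if the line does not match."""
    match = README_ROW.match(line.strip())
    if match is None:
        return None
    proteome_id, taxon_id, code, _n1, _n2, _n3, name = match.groups()
    return {
        "proteome": proteome_id,
        "taxon_id": int(taxon_id),
        "code": code,
        "name": name,
    }


def build_code_to_taxon(lines):
    """Map each 5-letter species code to its taxon ID, skipping non-data lines."""
    mapping = {}
    for line in lines:
        row = parse_readme_row(line)
        if row is not None:
            mapping[row["code"]] = row["taxon_id"]
    return mapping


row = (
    "UP000002530 330879 ASPFU 9647 0 9652 Neosartorya fumigata "
    "(strain ATCC MYA-4609 / Af293 / CBS 101355 / FGSC A1100) (Aspergillus fumigatus)"
)
code_to_taxon = build_code_to_taxon([row])
print(code_to_taxon)  # {'ASPFU': 330879}
```

With this mapping, ASPFU resolves to the strain-level taxon 330879 rather than the species-level 746128, which is the mismatch noted in the discussion.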


Pascale emailed Marek on 7 Jan 2020

Taxon constraints in PAINT not working?

See PTHR31570: 'GO:0005813 centrosome' should not be propagated to plants and fungi, but when you propagate 'centrosome' to the root, there is no error message and the term is propagated.


Discussion

  • PAINT taxon constraints matrix: http://data.pantherdb.org/TaxonConstraints/TaxonConstraintsLookup.txt
    • 1=allowed; 0=not allowed
  • The taxon constraint file is correct.
  • When we propagate, PAINT only checks the node you are annotating to, not its descendants, which is why this case was missed.
  • We need to have some code that will look at descendant nodes and taxon constraints (as well as when there are changes to the tree topology)
  • PLAN:
    1. We need to create a taxon constraint NOT so that the term cannot be propagated
    2. We need a new evidence code to capture that (action item for Pascale: request one from ECO); for now use 'TCV'
    3. PAINT code needs to be fixed to look at children and give error messages/prevent propagation
    4. Some code needs to be written to deal with updates and changes in tree topology
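
Items 3 and 4 of the plan can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the PAINT implementation: the `Node` class, the function names, and the in-memory constraint matrix are all hypothetical, the species codes and 1/0 values are made up for the example rather than read from TaxonConstraintsLookup.txt, and treating a species missing from the matrix as allowed is an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class Node:
    """Hypothetical tree node: leaves carry a species code, internal nodes do not."""
    name: str
    species: Optional[str] = None
    children: List["Node"] = field(default_factory=list)


def leaf_species(node: Node) -> Set[str]:
    """Collect the species codes of all leaves at or below this node."""
    if not node.children:
        return {node.species} if node.species else set()
    found: Set[str] = set()
    for child in node.children:
        found |= leaf_species(child)
    return found


def disallowed_species(node: Node, term: str,
                       constraints: Dict[str, Dict[str, int]]) -> List[str]:
    """List descendant species for which the term is marked 0 (not allowed)."""
    row = constraints.get(term, {})
    # Assumption: a species absent from the matrix row is treated as allowed.
    return sorted(s for s in leaf_species(node) if row.get(s, 1) == 0)


def can_propagate(node: Node, term: str,
                  constraints: Dict[str, Dict[str, int]]) -> bool:
    """True only if no descendant species violates the taxon constraint."""
    return not disallowed_species(node, term, constraints)


def recheck(annotations: List[Tuple[Node, str]],
            constraints: Dict[str, Dict[str, int]]) -> Dict[Tuple[str, str], List[str]]:
    """Re-validate existing annotations, e.g. after a tree topology change."""
    problems = {}
    for node, term in annotations:
        bad = disallowed_species(node, term, constraints)
        if bad:
            problems[(node.name, term)] = bad
    return problems


# Illustrative matrix only (1=allowed; 0=not allowed); values are not from the real file.
constraints = {"GO:0005813": {"HUMAN": 1, "MOUSE": 1, "ARATH": 0, "YEAST": 0}}

animals = Node("animals", children=[Node("h1", "HUMAN"), Node("m1", "MOUSE")])
root = Node("root", children=[animals, Node("a1", "ARATH"), Node("y1", "YEAST")])

print(disallowed_species(root, "GO:0005813", constraints))  # ['ARATH', 'YEAST']
print(can_propagate(animals, "GO:0005813", constraints))    # True
```

Checking every descendant leaf, rather than only the annotated node, is what would catch the 'centrosome' case at the root; running `recheck` over stored annotations after each tree update would cover changes in topology.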