7 May 2019 PAINT Conference Call
Following up on last call's action items
- Huaiyu: get stats for the NOT annotations before May 7, 2018 and after May 8, 2018, to make sure that we didn't lose NOT annotations.
- Huaiyu looked into the problem:
The problem raised by Val is related to the IC evidence code created by a very early version of Suzi’s PAINT tool (on the ancestor node). It affects 10 families, with a total of 28 ancestor node annotations, all of which are NOT annotations. Our system doesn’t expect an IC code and doesn’t handle it: the PAINT tool only handles IBD, IBA, IKR and IRD, which is why you don’t see these annotations in the tree. Our pipeline treats IC as a true annotation (with NOT) instead of a STOP, which is why they are labeled as NOT in the GAF file and in AmiGO. It is good that Val found the problem, but it is not widespread. Basically I need to go to these families and manually change the annotations to use the IKR or IRD evidence code. -> Should be OK by the next release
- Huaiyu, Dustin, Pascale: Documentation of pipeline: https://github.com/pantherdb/fullgo_paint_update/issues/18
- Anushya: Taxon constraint bug for unrecognized species
- Anushya will change the logic so that unrecognized species will not generate notifications.
- Anushya: Record every action (every annotation, NOT, etc.) that a curator takes when curating a tree in the Curator Notes
- Dustin: Figure out which trackers to use for PAINT feature requests and Panther family review requests
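For the IC evidence code problem discussed above, a check along the following lines could flag ancestor node annotations whose evidence code the tool does not handle. This is only a minimal sketch: the record layout (dicts with `family`, `node`, `term`, `evidence`, `is_not` keys), the function name `find_unsupported` and the example records are all assumptions made for illustration, not the actual PAINT data model or pipeline code. Only the list of handled codes (IBD, IBA, IKR, IRD) comes from the notes.

```python
# Evidence codes the PAINT tool handles, as listed in the notes above.
SUPPORTED_CODES = {"IBD", "IBA", "IKR", "IRD"}


def find_unsupported(annotations):
    """Return the annotations whose evidence code is not in SUPPORTED_CODES.

    Each annotation is a hypothetical dict with the keys
    'family', 'node', 'term', 'evidence' and 'is_not'.
    """
    return [a for a in annotations if a["evidence"] not in SUPPORTED_CODES]


# Made-up example records (placeholder family, node and term names).
example = [
    {"family": "FAM_A", "node": "AN1", "term": "TERM_1", "evidence": "IBD", "is_not": False},
    {"family": "FAM_A", "node": "AN2", "term": "TERM_1", "evidence": "IC", "is_not": True},
    {"family": "FAM_B", "node": "AN5", "term": "TERM_2", "evidence": "IKR", "is_not": True},
]

flagged = find_unsupported(example)
for a in flagged:
    # Each flagged record would need manual review (e.g. switching to IKR or IRD).
    print(a["family"], a["node"], a["evidence"])
```

Running such a check before each release would surface any remaining legacy codes as a short list for manual correction.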
A large helicase family used to have annotations, but all of them seem to have been lost during a Panther update; see ticket https://github.com/geneontology/go-annotation/issues/2335
- Are there other families that have lost ALL their annotations?
- Action Item: For the PANTHER 14 release, they will produce a table showing the difference in the number of annotations per family between Panther13 and Panther14.
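The per-family diff table from the action item above could be computed roughly as follows. This is a sketch under stated assumptions: the input is taken to be two plain mappings from family ID to annotation count, and the function name `annotation_diff`, the family IDs and the counts are all made up for illustration. A family that is absent from one release is counted as having 0 annotations there.

```python
def annotation_diff(old_counts, new_counts):
    """Build one row per family: (family, old, new, delta, lost_all).

    lost_all is True when a family had annotations in the old release
    and has none (or is missing entirely) in the new one.
    """
    rows = []
    for family in sorted(set(old_counts) | set(new_counts)):
        old = old_counts.get(family, 0)
        new = new_counts.get(family, 0)
        rows.append((family, old, new, new - old, old > 0 and new == 0))
    return rows


# Made-up counts for illustration only.
panther13 = {"FAM_A": 12, "FAM_B": 40, "FAM_C": 3}
panther14 = {"FAM_A": 12, "FAM_B": 0, "FAM_D": 5}

rows = annotation_diff(panther13, panther14)

# Families that lost ALL of their annotations between the two releases.
lost_all = [family for family, _, _, _, lost in rows if lost]
print(lost_all)
```

The `lost_all` flag directly answers the question of whether other families have lost all of their annotations, while the delta column also shows partial losses.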
Updated curation guidelines
http://wiki.geneontology.org/index.php/PAINT_SOP Added the following two points:
- Small duplications
The curator should look out for small duplications. For example, we often see duplications specific to S. pombe in which function has been lost or gained, so the two paralogs do not have the same function. The same applies to C. elegans, insects, and plants.
- Use caution for the following terms
- Specific substrates, for enzymes, transport and transporter activity
The specificity of enzyme substrates and molecules being transported by transporters can evolve rapidly. Especially for large families with many duplications (which is often the case for transporters, such as the ABC transporters), use caution when propagating substrates. Be very conservative when annotating these.
Filtering 'protein binding' and other terms from PAINT
- Since we don't propagate 'protein binding', it should be filtered out from PAINT. The advantage is that families ONLY annotated to protein binding would not show up as curatable.
- There has also been a suggestion to create a flag in the ontology 'not to be used for sequence similarity-type annotations' or something along those lines. The context was viral terms (virus infection is species-specific), but it would be useful to exclude a few other terms. For example:
- response to x
ACTION ITEM: We will start with 'protein binding' - we never want to propagate protein binding. Then we can add to the list.
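The filtering described in this section could look something like the sketch below. It assumes annotations are available as dicts with a `term` key holding a GO ID; the names `DO_NOT_PROPAGATE`, `propagatable` and `is_curatable` are hypothetical and not part of the PAINT code base. GO:0005515 is the GO ID for 'protein binding'. The sketch matches exact term IDs only and does not consider descendant terms.

```python
# Terms that should never be propagated. Starts with 'protein binding'
# (GO:0005515); further terms can be added to the set later.
DO_NOT_PROPAGATE = {"GO:0005515"}


def propagatable(annotations):
    """Drop annotations to terms on the do-not-propagate list."""
    return [a for a in annotations if a["term"] not in DO_NOT_PROPAGATE]


def is_curatable(annotations):
    """A family is curatable only if something remains after filtering."""
    return len(propagatable(annotations)) > 0


# Made-up families for illustration.
only_protein_binding = [{"gene": "GENE_1", "term": "GO:0005515"}]
mixed = [
    {"gene": "GENE_2", "term": "GO:0005515"},
    {"gene": "GENE_2", "term": "GO:0003677"},  # DNA binding
]

print(is_curatable(only_protein_binding))
print(is_curatable(mixed))
```

Keeping the excluded terms in a single set means the list can grow (for example, with viral terms or 'response to x' terms) without changing the filtering logic.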
Tool feature requests
- Links to AmiGO for all genes
The GOC error reports are by species, not by source. Therefore, it is very inconvenient for us to use these to fix errors. http://snapshot.geneontology.org/reports/gorule-report.html
Last GOC release stripped out all GOA annotations
So there are no IBAs for human in AmiGO! See https://github.com/geneontology/go-site/issues/1061