Scripts to ensure PAINT data integrity

From GO Wiki

Purpose

This page lists the checks that must be run on PAINT-generated GAF files to make sure they are up to date and in agreement with GOC practices.

Evidence codes

  1. The IAS evidence code must be replaced by IBA. This should be a one-time operation, because current versions of PAINT no longer assign IAS.
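
A minimal sketch of this replacement in Python, assuming the standard tab-separated GAF layout in which comment lines start with "!" and the evidence code is the seventh column. The function name replace_ias_with_iba is hypothetical and is not part of PAINT.

```python
def replace_ias_with_iba(gaf_lines):
    """Yield GAF lines with the evidence code IAS rewritten to IBA.

    Assumes tab-separated GAF rows whose 7th column (index 6) holds
    the evidence code. Comment lines (starting with "!") and all
    other rows are passed through unchanged.
    """
    for line in gaf_lines:
        if line.startswith("!"):
            yield line
            continue
        ending = "\n" if line.endswith("\n") else ""
        cols = line.rstrip("\n").split("\t")
        if len(cols) > 6 and cols[6] == "IAS":
            cols[6] = "IBA"
            yield "\t".join(cols) + ending
        else:
            yield line
```

The function works on any iterable of lines, so it can be applied to an open file handle and the result written to a new file, leaving the original GAF untouched.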

Obsolete GO terms

  1. A report must be generated of all trees having IBA annotations to an obsolete term. That report must contain (1) the PTHR ID, (2) the ID of the node annotated to the obsolete term, and (3) the GO term string and ID.
  2. A report must be generated of all the IBA annotations for which the 'with' field contains a protein/gene ID associated with an obsolete GO term.
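
The two reports above could be sketched as follows. This is an illustration under stated assumptions, not the actual GOC script: it assumes standard GAF columns (object ID in column 2, GO ID in column 5, evidence code in column 7, 'with' field in column 8), that the caller supplies the PTHR ID for the file being checked, and that obsolete terms and the annotations of the 'with' IDs are already available as in-memory lookups. All function and parameter names are hypothetical.

```python
import re


def _iba_rows(gaf_lines):
    """Yield the column list of every IBA row, skipping '!' comments."""
    for line in gaf_lines:
        if line.startswith("!"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) >= 8 and cols[6] == "IBA":
            yield cols


def obsolete_term_report(gaf_lines, family_id, obsolete_terms):
    """Report 1: IBA annotations made directly to an obsolete term.

    obsolete_terms maps an obsolete GO ID to its term string (assumed
    to be loaded from the ontology by the caller).
    Returns (PTHR ID, node ID, GO term string, GO ID) tuples.
    """
    return [
        (family_id, cols[1], obsolete_terms[cols[4]], cols[4])
        for cols in _iba_rows(gaf_lines)
        if cols[4] in obsolete_terms
    ]


def obsolete_with_report(gaf_lines, annotations_by_id, obsolete_ids):
    """Report 2: IBA annotations whose 'with' field names a protein/gene
    ID that is itself annotated to an obsolete term.

    annotations_by_id maps a protein/gene ID to the set of GO IDs it is
    annotated to (assumed to be built by the caller).
    Returns (node ID, GO ID, with ID) tuples.
    """
    rows = []
    for cols in _iba_rows(gaf_lines):
        # 'with' entries may be separated by "|" or ","
        for with_id in re.split(r"[|,]", cols[7]):
            if with_id and annotations_by_id.get(with_id, set()) & obsolete_ids:
                rows.append((cols[1], cols[4], with_id))
    return rows
```

Keeping the two reports as separate functions mirrors the two list items: the first needs only the ontology, while the second also needs the annotations of the sequences cited as evidence.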

Changes in tree topology

  1. A report must be generated that lists all node IDs for which the tree topology has changed in a way that may affect how IBA annotations were attributed. This includes families that have been made obsolete (e.g., PTHR11295 and PTHR32047 are no longer in PANTHER).
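
The obsolete-family part of this check can be sketched as below. Detecting topology changes within a surviving family requires comparing tree versions and is not covered here. The sketch assumes the caller already has (PTHR ID, node ID) pairs for the annotated nodes and the set of family IDs present in the current PANTHER release; the function name and the node IDs in the usage example are hypothetical.

```python
def obsolete_family_report(annotated_nodes, current_families):
    """Group annotated node IDs by families missing from PANTHER.

    annotated_nodes: iterable of (PTHR ID, node ID) pairs.
    current_families: set of PTHR IDs in the current release.
    Returns a dict mapping each missing PTHR ID to its node IDs.
    """
    report = {}
    for family_id, node_id in annotated_nodes:
        if family_id not in current_families:
            report.setdefault(family_id, []).append(node_id)
    return report


# Usage with made-up node IDs; PTHR00001 stands in for a current family.
nodes = [
    ("PTHR11295", "PTN000000001"),
    ("PTHR00001", "PTN000000002"),
    ("PTHR32047", "PTN000000003"),
]
missing = obsolete_family_report(nodes, {"PTHR00001"})
```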