GAF Taxonomy Reasoning

From GO Wiki
Revision as of 20:29, 6 January 2010 by Cjm (talk | contribs)

Algorithm

Report Format

Each line is in two parts. The part before the :: separator is the error report. The part after is the GAF line, copied verbatim.

The report part before the separator has the form:

 <GO ID> "<Term name>" only_in <TAX ID> "<Taxon name>"

This is not ideal, as the GO ID is redundant with column 5 of the GAF line, which is repeated after the :: separator.

Ideally the report part would show the ID that is directly linked to the taxon, but it doesn't do this yet.

Example line:

 GO:0007595 "lactation" only_in NCBITaxon:40674 "Mammalia" :: Ensembl    ENSGALP00000029396      ENSGALP00000029396              GO:0007595      GO_REF:0000002  IEA     InterPro:IPR003626      P                       protein NCBITaxon:9031 20091214 UniProtKB

This IEA annotation from chicken erroneously associates a protein with "lactation", which is restricted to mammals.
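The report format described above can be split back into its parts with a few lines of code. The following is a minimal Python sketch, not part of go-gaf-inference.pl; the names parse_report_line, REPORT_RE and EXAMPLE are hypothetical. It assumes the GAF part is tab-separated, as in a real GAF file, and that term and taxon names contain no double quotes.

```python
import re

# Pattern for the report part:
#   <GO ID> "<Term name>" only_in <TAX ID> "<Taxon name>"
# Assumption: names contain no embedded double quotes.
REPORT_RE = re.compile(
    r'^\s*(?P<go_id>GO:\d{7}) "(?P<term_name>[^"]*)" only_in '
    r'(?P<taxon_id>NCBITaxon:\d+) "(?P<taxon_name>[^"]*)"\s*$'
)


def parse_report_line(line):
    """Split one report line into its report fields and GAF columns."""
    # Everything before the first "::" is the report, the rest is the GAF line.
    report, sep, gaf = line.partition("::")
    if not sep:
        raise ValueError("missing '::' separator")
    match = REPORT_RE.match(report)
    if match is None:
        raise ValueError("unrecognised report part: %r" % report)
    result = match.groupdict()
    # Drop only the padding after "::" and the line ending, keep empty columns.
    result["gaf_columns"] = gaf.lstrip(" ").rstrip("\r\n").split("\t")
    return result


# The example line from this page, rebuilt with tab-separated GAF columns.
EXAMPLE = 'GO:0007595 "lactation" only_in NCBITaxon:40674 "Mammalia" :: ' + "\t".join([
    "Ensembl", "ENSGALP00000029396", "ENSGALP00000029396", "",
    "GO:0007595", "GO_REF:0000002", "IEA", "InterPro:IPR003626", "P",
    "", "", "protein", "NCBITaxon:9031", "20091214", "UniProtKB",
])

parsed = parse_report_line(EXAMPLE)
# The GO ID in the report part duplicates column 5 of the GAF line.
assert parsed["go_id"] == parsed["gaf_columns"][4]
```

The final assertion illustrates the redundancy noted above: the GO ID in the report part is the same value as column 5 of the copied GAF line.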


Implementation

The script go-gaf-inference.pl is distributed as part of GO Moose.