GAF Taxonomy Reasoning
Algorithm
Report Format
Each line has two parts. The part before the :: separator is the error report. The part after it is the GAF line, copied verbatim.
The report part, before the separator, has the form:
<GO ID> "<Term name>" only_in <TAX ID> "<Taxon name>"
This is not ideal, as the GO ID is redundant with column 5 of the GAF line, which is repeated after the :: separator. Ideally the report part would show the ID of the term that is directly linked to the taxon, but it doesn't do this yet.
Example line:
GO:0007595 "lactation" only_in NCBITaxon:40674 "Mammalia" :: Ensembl ENSGALP00000029396 ENSGALP00000029396 GO:0007595 GO_REF:0000002 IEA InterPro:IPR003626 P protein NCBITaxon:9031 20091214 UniProtKB
This IEA annotation from chicken erroneously associates a protein with "lactation", which is restricted to mammals.
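For readers who want to post-process these reports, the following is a minimal Python sketch of how a report line in the format described above could be split into its fields. It is not part of go-gaf-inference.pl. The names `TaxonViolation` and `parse_report_line` are hypothetical, and the pattern assumes that GO IDs have seven digits, that taxon IDs use the NCBITaxon: prefix, and that the first :: on the line is the separator.

```python
import re
from typing import NamedTuple, Optional

# Assumed shape of the report part (taken from the format line above):
#   <GO ID> "<Term name>" only_in <TAX ID> "<Taxon name>"
REPORT_RE = re.compile(
    r'^(?P<go_id>GO:\d{7})\s+"(?P<term_name>[^"]*)"\s+only_in\s+'
    r'(?P<tax_id>NCBITaxon:\d+)\s+"(?P<taxon_name>[^"]*)"$'
)


class TaxonViolation(NamedTuple):
    """One parsed report line (hypothetical container, for illustration)."""
    go_id: str
    term_name: str
    tax_id: str
    taxon_name: str
    gaf_line: str


def parse_report_line(line: str) -> Optional[TaxonViolation]:
    """Split a report line at the first '::' and parse the report part.

    Returns None if the separator is missing or the report part does not
    match the assumed pattern.
    """
    report, sep, gaf = line.partition("::")
    if not sep:
        return None
    match = REPORT_RE.match(report.strip())
    if match is None:
        return None
    # Strip only spaces and line endings so that tabs inside the
    # GAF line (including empty trailing columns) are preserved.
    return TaxonViolation(gaf_line=gaf.strip(" \r\n"), **match.groupdict())


if __name__ == "__main__":
    example = (
        'GO:0007595 "lactation" only_in NCBITaxon:40674 "Mammalia" :: '
        "Ensembl ENSGALP00000029396 ENSGALP00000029396 GO:0007595 "
        "GO_REF:0000002 IEA InterPro:IPR003626 P protein NCBITaxon:9031 "
        "20091214 UniProtKB"
    )
    violation = parse_report_line(example)
    print(violation.go_id, violation.term_name, violation.tax_id)
```

Returning None for lines that do not match, rather than raising an error, lets a caller skip unrelated output lines while still collecting every well-formed report.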
Implementation
The script go-gaf-inference.pl is distributed as part of GO Moose.