InterPro domains InterPro2GO

Method

InterPro is an integrated resource of protein families, domains and sites that combines signatures from a number of different protein signature databases, including Gene3D, PANTHER, PIRSF, Pfam, PRINTS, PROSITE, ProDom, SMART, SUPERFAMILY and TIGRFAMs.

Signatures describing the same protein family or domain are grouped into unique InterPro entries. This InterPro resource is then applied across the UniProt KnowledgeBase, and all UniProtKB protein sequences that have matches to a particular InterPro entry are cross-referenced.

Where an InterPro entry hits a set of functionally similar proteins, GO terms describing the conserved function or location are associated with the InterPro entry.

The InterPro2GO file is generated manually by the InterPro team at the EBI. To generate this table, curators compare InterPro and protein entries and, for matching entries, they:

  • Look at the statistics on DE lines, keywords and comments
  • Check how conserved the common annotation is
  • Look for the most specific GO term that is still relevant to all proteins in that family

The mapping file is then used to assign annotations to UniProtKB proteins at each UniProtKB-GOA release. GO annotations using this technique receive the evidence code Inferred from Electronic Annotation (IEA).
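
The assignment step described above can be sketched in Python. This is a minimal illustration, not the UniProtKB-GOA pipeline itself: the function name, the input dictionaries and the protein accession PROT_A are hypothetical, and only the evidence code (IEA), the GO reference (GO_REF:0000002) and the IPR001095 to GO:0003989 mapping come from this page.

```python
from typing import Dict, List, Set, Tuple

# Values stated on this page for InterPro2GO annotations.
GO_REF = "GO_REF:0000002"
EVIDENCE_CODE = "IEA"

# (protein accession, GO ID, reference, evidence code, with/source)
Annotation = Tuple[str, str, str, str, str]


def assign_iea_annotations(
    protein_matches: Dict[str, Set[str]],
    interpro2go: Dict[str, Set[str]],
) -> List[Annotation]:
    """Transfer GO terms from InterPro entries to the proteins that match them.

    protein_matches: hypothetical input, protein accession -> matched InterPro IDs.
    interpro2go:     InterPro ID -> GO IDs, as curated in the mapping file.
    """
    annotations: List[Annotation] = []
    for accession in sorted(protein_matches):
        for ipr in sorted(protein_matches[accession]):
            # InterPro entries without a curated mapping yield no annotation.
            for go_id in sorted(interpro2go.get(ipr, ())):
                annotations.append(
                    (accession, go_id, GO_REF, EVIDENCE_CODE, f"InterPro:{ipr}")
                )
    return annotations


if __name__ == "__main__":
    # PROT_A is a made-up accession used only for illustration.
    matches = {"PROT_A": {"IPR001095"}}
    mapping = {"IPR001095": {"GO:0003989"}}
    for row in assign_iea_annotations(matches, mapping):
        print("\t".join(row))
```

Because the transfer is a plain lookup, every annotation it produces can be traced back to a single InterPro entry, which is why the InterPro identifier is carried along with each annotation.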

This method has been evaluated as 91-100% accurate (Camon et al. 2005).

GO Reference

The GO reference for this method is GO_REF:0000002. Abstracts for all GO references can be seen here.

Mapping file

The mapping between InterPro domains and GO terms is available at: http://www.geneontology.org/external2go/interpro2go.
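
A minimal Python sketch for reading the mapping file is shown below. It assumes the usual external2go line layout, "InterPro:IPRnnnnnn entry name > GO:term name ; GO:nnnnnnn", with comment lines starting with "!"; check this against the downloaded file before relying on it. The function name and the entry name in the sample line are illustrative only.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, Set

# Assumed line layout (external2go style):
#   InterPro:IPR001095 <entry name> > GO:<term name> ; GO:0003989
_LINE_RE = re.compile(r"^InterPro:(IPR\d{6})\b.*>\s*GO:.*;\s*(GO:\d{7})\s*$")


def parse_interpro2go(lines: Iterable[str]) -> Dict[str, Set[str]]:
    """Return a dictionary of InterPro ID -> set of GO IDs."""
    mapping: Dict[str, Set[str]] = defaultdict(set)
    for raw in lines:
        line = raw.strip()
        # Skip blank lines and "!" header/comment lines.
        if not line or line.startswith("!"):
            continue
        match = _LINE_RE.match(line)
        if match:  # lines that do not fit the assumed layout are ignored
            mapping[match.group(1)].add(match.group(2))
    return dict(mapping)


if __name__ == "__main__":
    sample = [
        "!illustrative header line",
        "InterPro:IPR001095 Example entry name > "
        "GO:acetyl-CoA carboxylase activity ; GO:0003989",
    ]
    print(parse_interpro2go(sample))
```

To read a local copy of the file, pass an open file handle, for example parse_interpro2go(open("interpro2go")), where the local filename is an assumption.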

Example

The InterPro entry IPR001095 describes the alpha subunit of the acetyl coenzyme A carboxylase complex. Proteins with this domain have been shown to have acetyl-CoA carboxylase activity, so the domain has been mapped to the GO term ‘acetyl-CoA carboxylase activity’ (GO:0003989). Any protein that contains this domain is automatically assigned this GO term.

Annotations

The annotations created by InterPro2GO mapping are displayed in the UniProtKB-GOA gene association files (Fig. 1). The InterPro domain identifier that is the source of the annotation is given in column 8 ('With'), and column 6 (DB:Reference) gives the GO reference for this method, GO_REF:0000002.

InterPro2GO annotations can also be viewed in QuickGO.

Figure 1. Representation of an InterPro2GO annotation in the gene association file.
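
The column layout described above can be used to pick InterPro2GO annotations out of a gene association file. The Python sketch below assumes a tab-delimited file with "!" comment lines, the GO identifier in column 5, and multiple values within a column separated by "|". The function name and the sample row (accession PROT_A, symbol geneA) are hypothetical.

```python
from typing import Iterable, Iterator, List, Tuple

GO_REF = "GO_REF:0000002"  # GO reference for the InterPro2GO method


def interpro2go_annotations(
    gaf_lines: Iterable[str],
) -> Iterator[Tuple[str, str, List[str]]]:
    """Yield (object ID, GO ID, InterPro sources) for InterPro2GO rows."""
    for raw in gaf_lines:
        if raw.startswith("!"):  # header/comment line
            continue
        cols = raw.rstrip("\n").split("\t")
        if len(cols) < 8:  # too short to hold the 'With' column
            continue
        # Column 6 (index 5) is DB:Reference; column 8 (index 7) is 'With'.
        if GO_REF not in cols[5].split("|"):
            continue
        sources = [w for w in cols[7].split("|") if w.startswith("InterPro:")]
        yield cols[1], cols[4], sources


if __name__ == "__main__":
    # Made-up row for illustration only.
    row = "\t".join(
        ["UniProtKB", "PROT_A", "geneA", "", "GO:0003989",
         "GO_REF:0000002", "IEA", "InterPro:IPR001095", "F"]
    )
    for annotation in interpro2go_annotations([row]):
        print(annotation)
```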

References

Camon et al. (2005) An evaluation of GO annotation retrieval for BioCreAtIvE and GOA. BMC Bioinformatics 6 Suppl. 1:S17.