Evidence Codes that can include 'with' field information
Annotations that use certain evidence codes can contain information in the 'with' field (column 8); these codes are listed in the table below.
Where an evidence code is not listed, it is assumed that no value should be included in the 'with' field.
|Evidence Code||Mandatory or Optional?||Expected values|
|IPI||Mandatory||Protein, Gene, RNA, Chemical identifiers|
|IGI||Optional||Protein or Gene identifiers|
|IMP||Optional||Alleles, Morpholinos. In some cases two alleles are present that are not alleles of the same gene: one is a conditional allele of the gene in question, and it requires a second allele (usually a Cre allele) to knock out that gene in the tissue/cells of interest. In this case it would be incorrect to put the two alleles on separate annotation lines (MGI).|
|ISS||Mandatory||Protein, Gene identifier or Panther family, InterPro or Pfam identifier, CBS:SignalP, CBS:TargetP, CBS:TMHMM, MetaCyc identifiers, Enzyme Commission Numbers, KEGG or KEGG_PATHWAY identifiers|
|ISO||Mandatory||Protein or Gene identifier|
|ISA||Mandatory?||Protein or Gene identifier|
|IEA||Mandatory?||Protein identifier, External term identifier used in an external2go mapping, InterPro identifier|
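As an illustration only, the table above could be turned into a simple presence check along the following lines. This is a minimal sketch, not an official GO QC rule: the names `WITH_RULES` and `check_with_field` are hypothetical, rows marked "Mandatory?" are treated as uncertain and not enforced, and the type of identifier in the 'with' field is not checked. Note that other parts of this page mention 'with' values for codes that are absent from the table (for example IC), so a real check would need a fuller mapping.

```python
# Requirement levels transcribed from the table above. Rows marked
# "Mandatory?" are recorded as "uncertain" and are not enforced.
WITH_RULES = {
    "IPI": "mandatory",
    "IGI": "optional",
    "IMP": "optional",
    "ISS": "mandatory",
    "ISO": "mandatory",
    "ISA": "uncertain",
    "IEA": "uncertain",
}


def check_with_field(evidence_code: str, with_value: str) -> list[str]:
    """Return a list of problems found; an empty list means no problem."""
    rule = WITH_RULES.get(evidence_code)
    has_value = bool(with_value.strip())
    if rule is None and has_value:
        # Unlisted codes are assumed to take no 'with' value.
        return [f"{evidence_code}: no 'with' value expected"]
    if rule == "mandatory" and not has_value:
        return [f"{evidence_code}: 'with' value is mandatory"]
    return []
```

For example, `check_with_field("IPI", "")` reports a missing mandatory value, while `check_with_field("IGI", "")` returns an empty list because the value is optional for IGI.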
Understanding Multiple values in 'with' fields
Interpretation of the pipe ('|') separating values:
- Pipes should only be used with the inferred-from-interaction evidence codes, i.e. IGI and IPI.
- With either IPI or IGI, piped 'with' entries indicate that the evidence is actually a 3-way (or 4-way, or n-ary ...) interaction, not just multiple pairwise interactions.
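The two rules above can be sketched in code as follows. This is a hedged illustration under stated assumptions: `PIPE_ALLOWED`, `parse_with`, and `interaction_members` are hypothetical names, and the identifiers used in the usage note are placeholders rather than real database entries.

```python
# Evidence codes for which pipes are allowed under the rule above.
PIPE_ALLOWED = {"IGI", "IPI"}


def parse_with(evidence_code: str, with_value: str) -> list[str]:
    """Split a 'with' value on pipes, rejecting pipes for other codes."""
    ids = [v.strip() for v in with_value.split("|") if v.strip()]
    if len(ids) > 1 and evidence_code not in PIPE_ALLOWED:
        raise ValueError(f"pipes are not expected with {evidence_code}")
    return ids


def interaction_members(db_object_id: str, evidence_code: str,
                        with_value: str) -> tuple[str, ...]:
    """Return all participants of the single n-ary interaction.

    The annotated object plus every piped 'with' entry form one
    interaction; they are not split into pairwise interactions.
    """
    return (db_object_id, *parse_with(evidence_code, with_value))
```

With placeholder identifiers, `interaction_members("DB:A", "IPI", "DB:B|DB:C")` yields a single three-member interaction, whereas a piped value with ISS raises `ValueError`.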
However, a wider set of evidence codes currently uses pipes in the 'with' field, and groups differ in how they interpret them:
|MOD||Evidence codes using pipe-separated values||Intended meaning||Alternative delimiters present?|
|FlyBase||IC, IPI, IGI||Pipe-separated values indicate a multi-way interaction and so all values displayed in one annotation line should always be grouped together||-|
|ZFIN||IC, IPI, ISS, IGI||Pipe-separated values indicate a multi-way interaction and so all values displayed in one annotation line should always be grouped together. In addition, for IGI evidence in binary interactions, the 'with' field contains the ZDB-GENE ID of the interacting partner and the ZDB-MRPHLNO ID of the morpholino used to target the gene being annotated.||-|
|TAIR||IPI, IGI||Pipe-separated values indicate a multi-way interaction and so all values displayed in one annotation line should always be grouped together||-|
|WormBase||IPI, ISS, IMP, IGI||Pipe-separated values indicate a multi-way interaction and so all values displayed in one annotation line should always be grouped together. For IMP annotations, RNAi and Phenotype identifiers are piped together for additional information.||-|
|MGI||IC, IPI, ISO, IMP, IGI||Only binary interactions are displayed for binding, so multiple piped values indicate separate binary interactions between the gene product identified in column 2 and each of the IDs listed in the 'with' field. It would therefore be correct to reinterpret the file so that each value in the 'with' field is displayed on a separate annotation line.||Commas are currently used in some IMP and IGI annotations, but these will shortly be removed.|
|SGD||IC, ISA, IGI||SGD does not currently specify whether multiple ids in the 'with' indicate one-to-many or multiple one-to-one interactions; currently they could mean either.||-|
|dictyBase||IC, IPI, ISS, IGI||Pipe-separated values indicate multiple independent interactions between the gene product identified in column 2 and each of the IDs listed in the 'with' field. It would therefore be correct to reinterpret the file so that each value in the 'with' field is displayed on a separate annotation line.||-|
|UniProtKB-GOA||none||Currently only one identifier is included in the with field for each annotation (although we are intending to allow multiple values in the second half of 2011). Only binary binding interactions are displayed.||-|
|RGD||IC, IPI, IGI||IPI - piped IDs should be interpreted as displaying multiple separate binary interactions||-|
|PomBase||IC, IGI||IC: pipe-separated GO IDs are used when both GO IDs are required to make the inference. IGI: pipe-separated values indicate a multi-way interaction and so all values displayed in one annotation line should always be grouped together. IPI: only binary binding interactions are captured with GO; pipe-separated values are only used with the term "protein binding, bridging". ISS: only characterized orthologs or protein family identifiers are used; no pipe-separated values.|
|CGD||IPI, ISS, IGI||?||-|
|EcoliWiki||IGI, IPI||represented as multiple binary interactions||-|
|EcoCyc||IGI, IPI||no distinction is made||-|
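To make the practical consequence of the table concrete, the sketch below expands one annotation into the lines a consumer might work with, depending on the source group. It is an assumption-laden illustration, not a documented tool: the function and set names are hypothetical, the source labels are the names used in the table (the values actually found in an annotation file may differ), and only the groups whose interpretation is stated unambiguously are covered. Sources with no stated interpretation (for example SGD, CGD, EcoCyc) raise an error rather than being guessed.

```python
# Groups whose piped values form one multi-way interaction (keep together).
GROUPED = {"FlyBase", "ZFIN", "TAIR", "WormBase"}
# Groups whose piped values are separate binary interactions (may unwrap).
SEPARATE = {"MGI", "dictyBase", "EcoliWiki"}
# Interpretations that the table states for one evidence code only.
GROUPED_BY_CODE = {("PomBase", "IGI")}
SEPARATE_BY_CODE = {("RGD", "IPI")}


def expand_annotation(source, db_object_id, evidence_code, with_value):
    """Return (object, evidence code, 'with') tuples for one annotation."""
    ids = [v for v in with_value.split("|") if v]
    if len(ids) <= 1:
        # Nothing to interpret when there is at most one value.
        return [(db_object_id, evidence_code, with_value)]
    key = (source, evidence_code)
    if source in SEPARATE or key in SEPARATE_BY_CODE:
        # One annotation line per 'with' value.
        return [(db_object_id, evidence_code, i) for i in ids]
    if source in GROUPED or key in GROUPED_BY_CODE:
        # Keep all values together on a single line.
        return [(db_object_id, evidence_code, with_value)]
    raise ValueError(f"pipe interpretation not specified for {source}")
```

With placeholder identifiers, an MGI IPI annotation with `"DB:B|DB:C"` becomes two lines, while the same value from FlyBase stays on one line.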
Out-of-date documentation on this - should be deleted?
N.B. UniProtKB-GOA is updating their display of 'with' field data originating from external MODs.
In future, external annotations that have used the 'with' field will only be included if:
- the evidence code and with field combination is legal (e.g. no with fields for IDA-evidenced annotations, only GO identifiers in IC-evidenced annotations)
- the 'with' field value matches a RegExp for the gene/protein/chemical/GO identifier, outlined here
- a single value is included in the 'with' field, or it is known that it is appropriate to 'unwrap' piped values into separate annotation lines, e.g. for MGI or dictyBase (this behaviour will change in summer 2010 for IC, IGI and IMP codes when UniProtKB-GOA has updated its database schema to accept multiple values in the 'with' field). For IPI annotations, GOA intends to display only binary interactions.
QC checks that check the contents of 'with' fields:
GO_AR:0000003 Annotations to 'protein binding ; GO:0005515' should be made with IPI, and the interactor should be in the 'with' field
GO_AR:0000004 Reciprocal annotations for protein binding should be made
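The two checks above might be approximated as in the sketch below. This is not the official implementation of either rule. It assumes annotations are given as `(object, GO ID, evidence code, 'with')` tuples in which the object and the 'with' entries use the same identifier form, and it treats each piped 'with' entry as a separate partner, which is a simplification. The function names are hypothetical.

```python
PROTEIN_BINDING = "GO:0005515"


def violates_ar3(go_id, evidence_code, with_value):
    """True if a protein binding annotation lacks IPI or an interactor."""
    if go_id != PROTEIN_BINDING:
        return False
    return evidence_code != "IPI" or not with_value.strip()


def missing_reciprocals(annotations):
    """Return (object, partner) pairs whose reverse annotation exists
    but which are themselves missing from the annotation set."""
    pairs = {
        (subject, partner)
        for subject, go_id, evidence_code, with_value in annotations
        if go_id == PROTEIN_BINDING and evidence_code == "IPI"
        for partner in with_value.split("|")
        if partner
    }
    return sorted((b, a) for a, b in pairs if (b, a) not in pairs)
```

For instance, if `DB:A` is annotated to protein binding with `DB:C` but `DB:C` has no matching annotation with `DB:A`, `missing_reciprocals` returns `[("DB:C", "DB:A")]`.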