With field: Difference between revisions

Revision as of 11:34, 28 March 2011

Annotations that use certain evidence codes can contain information in the 'with' field (column 8):

Where an evidence code is not listed, it is assumed that no value should be included in the 'with' field.

{| class="wikitable"
! Evidence Code !! Mandatory or Optional? !! Expected values
|-
| IC || Mandatory || GO identifier
|-
| IPI || Mandatory || Protein, Gene, RNA, Chemical identifier
|-
| IGI || Optional || -
|-
| IMP || Optional || Alleles, Morpholinos. In some cases, two alleles are present, but not to the same gene; one is an allele of the gene in question, but is a conditional allele that requires a second allele (usually a Cre allele) in order to get the knock-out of that gene in the tissue/cells of interest. So in this case, it would be incorrect to have these on two separate annotation lines (MGI).
|-
| IGC || Optional || -
|-
| RCA || ??? ||
|-
| ISS || Mandatory || Protein, Gene identifier or Panther family, InterPro or Pfam identifier, CBS:SignalP, CBS:TargetP, CBS:TMHMM, MetaCyc identifiers, Enzyme Commission Numbers, KEGG or KEGG_PATHWAY identifiers
|-
| ISO || Mandatory || Protein or Gene identifier
|-
| ISM || Optional || -
|-
| ISA || Mandatory? || Protein or Gene identifier
|-
| IEA || Mandatory? || Protein identifier, External term identifier used in an external2go mapping, InterPro identifier
|}
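
The table above can be read as a lookup from evidence code to a requirement on column 8. A minimal sketch in Python, assuming a hypothetical `WITH_RULES` mapping and `check_with` helper (neither is part of any GO tool). Rows the table leaves open (RCA "???", ISA and IEA "Mandatory?") are marked as unknown rather than guessed, and unlisted codes are treated as taking no 'with' value, as stated above:

```python
# Requirement on the 'with' field (GAF column 8) per evidence code,
# transcribed from the table above. "unknown" marks rows the table
# leaves open (RCA "???", ISA and IEA "Mandatory?").
WITH_RULES = {
    "IC": "mandatory",
    "IPI": "mandatory",
    "IGI": "optional",
    "IMP": "optional",
    "IGC": "optional",
    "RCA": "unknown",
    "ISS": "mandatory",
    "ISO": "mandatory",
    "ISM": "optional",
    "ISA": "unknown",
    "IEA": "unknown",
}


def check_with(evidence_code, with_value):
    """Return None if the 'with' value is acceptable, else a message."""
    has_value = bool(with_value.strip())
    # Codes not listed in the table are assumed to take no 'with' value.
    rule = WITH_RULES.get(evidence_code, "forbidden")
    if rule == "mandatory" and not has_value:
        return f"{evidence_code}: 'with' field is mandatory but empty"
    if rule == "forbidden" and has_value:
        return f"{evidence_code}: 'with' field should be empty"
    return None
```

For example, `check_with("IC", "")` returns a message, while `check_with("IGI", "")` returns `None` because the value is optional for IGI.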

Multiple values in the 'with' field

Interpretation of the pipe ('|') separating values:

From Midori:

- Pipes should only be used with the inferred-from-interaction evidence codes, i.e. IGI and IPI.

- With either IPI or IGI, piped 'with' entries indicate that the evidence is actually a 3-way (or 4-way, or n-ary ...) interaction, not just multiple pairwise interactions.

However, a wider set of evidence codes currently uses pipes in the 'with' field, and groups differ in how they interpret them:

{| class="wikitable"
! MOD !! Evidence codes using pipe-separated values !! Intended meaning !! Alternative delimiters present?
|-
| FlyBase || IC, IPI, IGI || Pipe-separated values indicate a multi-way interaction and so all values displayed in one annotation line should always be grouped together. || -
|-
| ZFIN || IC, IPI, ISS, IGI || Pipe-separated values indicate a multi-way interaction and so all values displayed in one annotation line should always be grouped together. In addition, for IGI evidence in binary interactions, the field holds the ZDB-GENE ID of the interacting partner and the ZDB-MRPHLNO ID of the morpholino used to target the gene being annotated. || -
|-
| TAIR || IPI || Pipe-separated values indicate a multi-way interaction and so all values displayed in one annotation line should always be grouped together. || -
|-
| WormBase || IPI, ISS, IMP, IGI || Pipe-separated values indicate a multi-way interaction and so all values displayed in one annotation line should always be grouped together. For IMP annotations, RNAi and Phenotype identifiers are piped together for additional information. || -
|-
| MGI || IC, IPI, ISO, IMP, IGI || Only binary interactions are displayed for binding, so multiple piped values indicate separate binary interactions between the gene product identified in column 2 and each of the IDs listed in the 'with' field. It would therefore be correct to reinterpret the file so that each value in the 'with' field is displayed on a separate annotation line. || Commas are currently used in some IMP and IGI annotations, but these will shortly be removed.
|-
| SGD || IC, ISA, IGI || ? || -
|-
| dictyBase || IC, IPI, ISS, IGI || Pipe-separated values indicate multiple independent interactions between the gene product identified in column 2 and each of the IDs listed in the 'with' field. It would therefore be correct to reinterpret the file so that each value in the 'with' field is displayed on a separate annotation line. || -
|-
| UniProtKB-GOA || none || Currently only one identifier is included in the 'with' field for each annotation (although we intend to allow multiple values in the second half of 2011). Only binary binding interactions are displayed. || -
|-
| RGD || IC, IPI, IGI || ? || -
|-
| PomBase || IC, IPI, ISS, IGI || ? || -
|-
| CGD || IPI, ISS, IGI || ? || -
|-
| TIGR || ISS, IGI || ? || -
|-
| EcoliWiki || IGI, IPI || ? || -
|-
| EcoCyc || IGI, IPI || ? || -
|-
| MTB || IPI || ? || -
|}
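
The two readings in the table lead to different data for the same 'with' string. A minimal sketch of both, assuming hypothetical helper names (`split_with`, `as_multiway`, `as_binary`) and placeholder identifiers. The grouped reading corresponds to what FlyBase, ZFIN, TAIR and WormBase describe, and the per-partner reading to what MGI and dictyBase describe:

```python
def split_with(with_value):
    """Split a 'with' field on pipes, dropping empty entries."""
    return [v.strip() for v in with_value.split("|") if v.strip()]


def as_multiway(gene_product, with_value):
    """Grouped reading: one interaction involving all listed partners."""
    return [(gene_product, tuple(split_with(with_value)))]


def as_binary(gene_product, with_value):
    """Independent reading: one binary interaction per listed partner."""
    return [(gene_product, (partner,)) for partner in split_with(with_value)]


# Placeholder identifiers, not real database entries.
grouped = as_multiway("DB:X", "DB:A|DB:B")
separate = as_binary("DB:X", "DB:A|DB:B")
```

Here `grouped` holds a single entry with both partners, while `separate` holds two entries with one partner each, which is why a consumer of the file needs to know which convention the contributing group follows.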

[http://geneontology.org/GO.evidence.shtml#withUsage Out-of-date documentation on this] - should be deleted?

QC checks on the contents of the 'with' field:

Implemented:

GO_AR:0000003 Annotations to 'protein binding ; GO:0005515' should be made with IPI, and the interactor should be in the 'with' field

GO_AR:0000004 Reciprocal annotations for protein binding should be made
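
As an illustration of what a reciprocity check such as GO_AR:0000004 involves, here is a minimal sketch. It is not the implemented rule: the function name `missing_reciprocals` and the tuple layout are hypothetical, subject identifiers are assumed to be already in the same "DB:ID" form as the 'with' values, and each piped partner is assumed to need its own reciprocal annotation:

```python
PROTEIN_BINDING = "GO:0005515"


def missing_reciprocals(annotations):
    """Return (subject, partner) pairs for annotations that are missing.

    `annotations` is an iterable of (subject_id, go_id, evidence, with_value)
    tuples. For every IPI protein binding annotation of A with B, an
    annotation of B with A is expected.
    """
    seen = set()
    for subject, go_id, evidence, with_value in annotations:
        if go_id == PROTEIN_BINDING and evidence == "IPI":
            for partner in with_value.split("|"):
                if partner.strip():
                    seen.add((subject, partner.strip()))
    # A pair (a, b) needs (b, a); report the ones that are absent.
    return sorted((b, a) for a, b in seen if (b, a) not in seen)
```

A self-interaction (A with A) is its own reciprocal and is never reported.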

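Proposed:

Hard QC: [http://wiki.geneontology.org/index.php/Annotation_Quality_Control_Checks#All_IC_annotations_should_include_a_GO_id_in_column_8_.28with.29 All IC annotations should include a GO id in column 8 (with)]

Hard QC: [http://wiki.geneontology.org/index.php/Annotation_Quality_Control_Checks#All_IPI_annotations_should_include_a_nucleotide.2Fprotein.2Fchemical_identifier_in_column_8_.28with.29 All IPI annotations should include a nucleotide/protein/chemical identifier in column 8 (with)]

Hard QC: [http://wiki.geneontology.org/index.php/Annotation_Quality_Control_Checks#All_IDA_annotations_should_not_include_an_identifier_in_column_8_.28with.29 All IDA annotations should not include an identifier in column 8 (with)]

Hard QC: [http://wiki.geneontology.org/index.php/Annotation_Quality_Control_Checks#All_identifiers_in_the_GAFs_must_use_the_correct_DB_abbreviation All identifiers in the GAFs must use the correct DB abbreviation]

Soft QC: [http://wiki.geneontology.org/index.php/Annotation_Quality_Control_Checks#All_gene.2Fprotein.2Fchemical_identifiers_used_in_GO_annotations_should_conform_to_RegExps_supplied_in_the_GO.xref.abbs_file All gene/protein/chemical identifiers used in GO annotations should conform to RegExps supplied in the GO.xref.abbs file]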
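
The proposed checks could be combined into a single pass over each annotation. The following is a sketch only: `qc_with` and `ID_PATTERNS` are hypothetical names, and the two patterns are an illustrative subset written for this example, not copied from the GO.xref.abbs file, which would be the real source of abbreviations and regular expressions:

```python
import re

# Illustrative subset only; the real abbreviations and patterns
# would be loaded from the GO.xref.abbs file.
ID_PATTERNS = {
    "GO": re.compile(r"\d{7}"),
    "InterPro": re.compile(r"IPR\d{6}"),
}


def qc_with(evidence_code, with_value):
    """Return a list of (severity, message) problems for one annotation."""
    problems = []
    ids = [v.strip() for v in with_value.split("|") if v.strip()]
    if evidence_code == "IC" and not any(i.startswith("GO:") for i in ids):
        problems.append(("hard", "IC annotation lacks a GO id in column 8"))
    if evidence_code == "IPI" and not ids:
        problems.append(("hard", "IPI annotation lacks an identifier in column 8"))
    if evidence_code == "IDA" and ids:
        problems.append(("hard", "IDA annotation has an identifier in column 8"))
    for identifier in ids:
        db, _, local_id = identifier.partition(":")
        if db not in ID_PATTERNS:
            # Hard check: the DB abbreviation must be a known one.
            problems.append(("hard", f"unknown DB abbreviation in {identifier!r}"))
        elif not ID_PATTERNS[db].fullmatch(local_id):
            # Soft check: the local id should match the pattern for its DB.
            problems.append(("soft", f"{identifier!r} does not match the {db} pattern"))
    return problems
```

Returning a severity with each message keeps the hard/soft distinction from the list above, so a caller can reject on hard failures and only report soft ones.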