New documentation for IKR

From GO Wiki

New Documentation for IKR: Inferred from Key Residues

Current IKR documentation is located here

  • The existing documentation is quite concise/limited, and currently the evidence code is only applied by the PAINT curators. Could we improve the documentation, and perhaps create a GO_REF to allow more use of this specific evidence?

IKR: Inferred from Key Residues

Note: This evidence code was previously referred to as IMR (inferred from Missing Residues).

A type of evidence derived from sequence analysis, characterized by the lack of key sequence residues. All annotations that apply this evidence code should use the 'NOT' qualifier.

This evidence code is used to annotate a gene product when, although homologous to a particular protein family, it has lost essential catalytic residues and is very unlikely to have enzymatic activity, and therefore is unlikely to carry out the associated function or participate in the expected associated process.

This annotation statement can be supported by a published literature reference (e.g. a PubMed identifier) that describes the sequence analysis, or by a GO Reference that describes the process a curator followed to establish that the key residues are absent.

Where an IKR annotation statement is made using a GO Reference, the 'with/from' column of the annotation format must contain an identifier that shows the user which residues are lacking (e.g. an alignment, domain or annotation rule identifier). In contrast, no value is required in the 'with/from' field where the annotation is supported by a published literature reference.
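The rules above (every IKR annotation carries the 'NOT' qualifier, and a 'with/from' value is mandatory only when the reference is a GO_REF) can be sketched as a small check. This is an illustrative sketch only: the `Annotation` record and the `check_ikr` function are hypothetical names and a simplified field layout, not part of any GO annotation tool or file format specification.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Annotation:
    """Hypothetical, simplified annotation record (not the full GAF/GPAD layout)."""
    gene_product: str
    qualifier: str      # e.g. "NOT"; several qualifiers assumed to be '|'-separated
    go_id: str
    evidence: str
    reference: str      # e.g. "PMID:12568721" or "GO_REF:0000XXX"
    with_from: str = ""


def check_ikr(annotation: Annotation) -> List[str]:
    """Return the IKR rule violations found in one annotation (empty list if none)."""
    problems: List[str] = []
    if annotation.evidence != "IKR":
        return problems  # the rules below only concern IKR annotations

    # Rule 1: all IKR annotations should use the 'NOT' qualifier.
    if "NOT" not in annotation.qualifier.split("|"):
        problems.append("IKR annotations must use the 'NOT' qualifier")

    # Rule 2: a GO_REF-supported IKR annotation needs a 'with/from' identifier;
    # a literature-supported one does not.
    if annotation.reference.startswith("GO_REF:") and not annotation.with_from.strip():
        problems.append("IKR with a GO_REF requires a 'with/from' identifier")

    return problems


# Usage with the two recommended annotations from this page.
go_ref_based = Annotation("UniProtKB:P06866", "NOT", "GO:0004252", "IKR",
                          "GO_REF:0000XXX", "InterPro:IPR000126")
paper_based = Annotation("FB:FBgn0033192", "NOT", "GO:0004252", "IKR",
                         "PMID:12568721")
print(check_ikr(go_ref_based))
print(check_ikr(paper_based))
```

Both recommended annotations pass this check; dropping the 'NOT' qualifier or the 'with/from' value from the GO_REF-based one would each produce one reported problem.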

Examples where the IKR evidence code should be used:

  • Rat HPT (P06866) is homologous to serine proteases and contains a match to the peptidase S1 domain. However, further sequence analysis by a curator, looking at the "Peptidase S1B, active site", established that it has lost all essential catalytic residues, making it unable to carry out serine protease activity.

Recommended Annotation:

UniProtKB:P06866 NOT GO:0004252 serine-type endopeptidase activity IKR GO_REF:0000XXX InterPro:IPR000126 UniProtKB

  • Ross,J., Jiang,H., Kanost,M.R. and Wang,Y. (2003) Serine proteases and their homologs in the Drosophila melanogaster genome: an initial analysis of sequence conservation and phylogenetic relationships. Gene 304:117-31. PMID:12568721 (http://www.ncbi.nlm.nih.gov/pubmed?term=12568721).

The authors predict the serine protease activity of proteins from the D. melanogaster S1 serine protease gene family by checking retrieved sequences for the conserved His, Asp and Ser residues of the catalytic triad. If all three residues were present in the conserved TAAHC, DIAL, and GDSGGP motifs, the sequence was considered to have serine protease activity. Any sequence lacking one of the key residues was identified as a serine protease homolog, lacking proteolytic activity.

Recommended Annotation:

FB:FBgn0033192 NOT GO:0004252 serine-type endopeptidase activity IKR PMID:12568721 FlyBase
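The motif-based criterion described for the Drosophila example can be illustrated with a deliberately naive sketch. It assumes exact substring matches of the three motifs named in the text, whereas a real analysis would work from an alignment and tolerate variation in the flanking residues. The function names and the toy sequences are invented for illustration and are not real protein sequences.

```python
# Catalytic triad residue -> conserved motif that carries it, as named in the text.
TRIAD_MOTIFS = {"His": "TAAHC", "Asp": "DIAL", "Ser": "GDSGGP"}


def missing_triad_motifs(sequence: str) -> list:
    """Return the triad residues whose motif is absent from the sequence.

    Simplifying assumption: a motif counts as present only if it occurs
    as an exact substring of the sequence.
    """
    seq = sequence.upper()
    return [residue for residue, motif in TRIAD_MOTIFS.items() if motif not in seq]


def is_ikr_candidate(sequence: str) -> bool:
    """True if at least one triad motif is missing, i.e. the sequence would be
    flagged for expert review as a possible inactive homolog."""
    return bool(missing_triad_motifs(sequence))


# Toy sequences (made up): the second has the Ser of GDSGGP replaced by Gly.
intact = "MKTAAHCLLDIALVVGDSGGPLK"
ser_lost = "MKTAAHCLLDIALVVGDGGGPLK"

print(missing_triad_motifs(intact))    # []
print(missing_triad_motifs(ser_lost))  # ['Ser']
```

A positive result here is only a prompt for curator review, not a conclusion: it says that a motif was not found by exact matching, not that the protein is inactive.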

Examples where the IKR evidence code should not be used:

  • When experimental evidence from a publication is available to support a NOT-qualified annotation. In such instances, the curator should make the IDA, IMP or EXP NOT-qualified annotation based on the experimental evidence.

CAUTION: Where curators make judgements about function using the IKR evidence code, they should be able to draw on some level of expertise regarding the protein family, as there will always be exceptions to the rule. For instance, Q9H4A3 (WNK1_HUMAN) is a good example where nature has confounded prediction: Cys-250 is present instead of the conserved Lys that is expected to be an active site residue. However, Lys-233 appears to fulfill the required catalytic function.

New GO_REF draft to be used with IKR-evidenced annotations

GO_REF:0000XXX

This describes a method for supplying a NOT-qualified, IKR-evidenced GO annotation to a gene product when general sequence homology considerations would suggest a function, a location or a role in a biological process, but a curator has determined that the absence of key sequence residues, known to be required for the expected activity or location, indicates that the gene product is unlikely to carry out the implied activity, take part in the implied process or be found in the implied cellular component.

This reference should only be used when an IKR-evidenced annotation is made based on curator judgement from manually reviewing the sequence of the gene product and where no publication can be found to support the curator's conclusion. It is preferable to cite a peer-reviewed publication (for example, by its PubMed identifier) for IKR-evidenced annotations whenever possible.

Curators will have carefully reviewed the sequence of the annotated protein, and established that the key residues known to be required for an expected activity or location are not present.

Inclusion of an identifier in the 'with/from' field that highlights to the user the lacking residues (e.g. an alignment, domain or rule identifier) is absolutely required when making an IKR annotation with this GO_REF.

For more information on the IKR evidence code, please see documentation here