New documentation for IKR: Difference between revisions


Revision as of 07:18, 12 April 2012

New Documentation for IKR: Inferred from Key Residues

Current IKR documentation is located here

  • The existing documentation is quite concise/limited, and currently the evidence code is only applied by the PAINT curators. Could we improve the documentation, and perhaps create a GO_REF to allow more use of this specific evidence?

IKR: Inferred from Key Residues

Note: This evidence code was previously referred to as IMR (inferred from Missing Residues).

A type of evidence derived from sequence analysis, characterized by the lack of key sequence residues. All annotations that apply this evidence code should use the 'NOT' qualifier.

This evidence code is used to annotate a gene product when, although homologous to a particular protein family, it has lost essential residues and is very unlikely to be able to carry out an associated function, participate in the expected associated process, or be found in a certain location.

This annotation statement can be supported by a published literature reference (e.g. a PubMed identifier) that has described the sequence analysis efforts, or by a GO Reference that describes the process a curator undertook to become sufficiently convinced of the sequence mutation.

Where an IKR annotation statement is made using a GO Reference, inclusion of an identifier in the 'with/from' column of the annotation format that can indicate to the user the lacking residues (e.g. an alignment, domain or annotation rule identifier) is absolutely required. In contrast, no value is required in the 'with/from' field where the annotation is supported by a published literature reference.
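The two rules above (the mandatory 'NOT' qualifier, and the 'with/from' requirement for GO Reference-supported annotations) could be checked mechanically. The following is a minimal sketch, not an official GO quality-control check: the `Annotation` class, its field names and the `check_ikr` function are hypothetical names introduced here for illustration, and the pipe-separated qualifier format is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Annotation:
    # Hypothetical record; field names are illustrative, not a GO file format.
    gene_product: str
    qualifier: str      # assumed to be pipe-separated, e.g. "NOT"
    go_id: str
    evidence: str
    reference: str      # e.g. "PMID:12568721" or "GO_REF:0000033"
    with_from: str = ""


def check_ikr(a: Annotation) -> List[str]:
    """Return a list of rule violations for an IKR annotation (empty if none)."""
    problems: List[str] = []
    if a.evidence != "IKR":
        return problems  # the rules below only concern IKR
    if "NOT" not in a.qualifier.split("|"):
        problems.append("IKR annotations must use the NOT qualifier")
    # with/from is required only when the supporting reference is a GO_REF
    if a.reference.startswith("GO_REF:") and not a.with_from.strip():
        problems.append("with/from is required when the reference is a GO_REF")
    return problems
```

With this sketch, an IKR annotation citing a GO_REF with an empty 'with/from' value is reported as a problem, while the same annotation citing a PubMed identifier passes.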

Examples where the IKR evidence code should be used:

  • 1. Curator-Determined IKR Annotation Example

Rat HPT (P06866) is homologous to serine proteases and contains a match to the peptidase S1 domain. However, further sequence analysis by a curator, looking at the Peptidase S1B active site, established that it has lost all essential catalytic residues, making it unable to carry out serine protease activity.

Annotation:

UniProtKB:P06866 NOT GO:0004252 serine-type endopeptidase activity IKR GO_REF:0000XXX InterPro:IPR000126 UniProtKB

  • 2. Curator-Determined IKR Annotation Example, Using PAINT

Curators determined that Drosophila neuroligin protein does not have carboxylesterase activity, based on phylogeny-based evidence. The Panther identifier in the 'with/from' field links out to an evidence record citing annotation data from orthologous gene products, supporting the annotation statement.

Annotation:

Q9NGK5 neuroligin NOT GO:0004091 carboxylesterase activity IKR GO_REF:0000033 PANTHER:PTHR11559_AN146 RefGenome

  • 3. Paper-Curated IKR Annotation Example

Ross,J., Jiang,H., Kanost,M.R. and Wang,Y. (2003) Serine proteases and their homologs in the Drosophila melanogaster genome: an initial analysis of sequence conservation and phylogenetic relationships. Gene 304:117-31. PMID:12568721.

The authors describe the determination of serine protease activity of proteins from the D. melanogaster S1 serine protease gene family, by determining the presence of the conserved His, Asp and Ser catalytic triad residues in retrieved sequences. If all three residues were present in the conserved TAAHC, DIAL, and GDSGGP motifs, the sequence was considered to have serine protease activity. Any sequence lacking one of the key residues was identified as a serine protease homolog, lacking proteolytic activity.
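The motif-based decision described above can be sketched in a few lines. This is a deliberately simplified illustration, not the authors' method: it uses exact substring matching, whereas a real analysis would rely on an alignment (family members can vary at non-catalytic positions of a motif and would be wrongly flagged here). The function names and the toy sequences in the usage note are made up for the example.

```python
# Conserved motifs from the text, keyed by the catalytic residue each contains.
CATALYTIC_MOTIFS = {"His": "TAAHC", "Asp": "DIAL", "Ser": "GDSGGP"}


def missing_triad_residues(sequence):
    """Return the catalytic residues whose conserved motif is absent.

    Assumption: exact, case-insensitive substring matching stands in for
    the alignment-based comparison a curator would actually perform.
    """
    seq = sequence.upper()
    return [res for res, motif in CATALYTIC_MOTIFS.items() if motif not in seq]


def is_candidate_for_not_ikr(sequence):
    """True if at least one motif is missing, i.e. a serine protease homolog
    that a curator might review for a NOT-qualified IKR annotation."""
    return len(missing_triad_residues(sequence)) > 0
```

For a toy sequence containing all three motifs, `missing_triad_residues` returns an empty list. If the Ser-containing motif is altered to `GDGGGP`, it reports `Ser` as missing and the sequence becomes a candidate for curator review.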

Annotation:

FB:FBgn0033192 NOT GO:0004252 serine-type endopeptidase activity IKR PMID:12568721 FlyBase


Examples where the IKR evidence code should not be used:

  • If there is experimental evidence available from a publication to support a NOT-qualified annotation. In such instances, the curator should make the IDA, IMP or EXP NOT-qualified annotation based on the experimental evidence. If a paper supplies data showing that the active site is missing and additionally reports an experimental assay showing lack of activity, it would be correct to create two annotation statements from this paper: both NOT IKR and NOT IDA.
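The choice of evidence codes described in this bullet can be summarised as a small decision function. This is only a sketch of the rule as stated here; the function name and parameters are hypothetical, and the default of IDA for the assay is an assumption (IMP or EXP may be the appropriate code, depending on the experiment).

```python
def not_annotation_evidence_codes(key_residues_shown_missing,
                                  assay_shows_no_activity,
                                  assay_code="IDA"):
    """Return the evidence codes for NOT-qualified annotations from one paper.

    key_residues_shown_missing: the paper's sequence analysis shows that
        key residues (e.g. the active site) are absent.
    assay_shows_no_activity: the paper reports an experimental assay
        demonstrating lack of activity.
    assay_code: experimental code to use (assumed IDA; could be IMP or EXP).
    """
    codes = []
    if key_residues_shown_missing:
        codes.append("IKR")
    if assay_shows_no_activity:
        codes.append(assay_code)
    return codes
```

A paper with only an experimental assay yields just the experimental code, so IKR is not used in its place. A paper with both kinds of data yields two annotation statements, one per code.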

CAUTION: Where curators make judgements of function using the IKR evidence code, they should be able to draw on some level of expertise regarding the protein family, as there will always be exceptions to the rule. For instance, Q9H4A3 (WNK1_HUMAN) is a good example where nature has confounded prediction: Cys-250 is present instead of the conserved Lys which is expected to be an active site residue. However, Lys-233 appears to fulfill the required catalytic function.

New GO_REF draft to be used with IKR-evidenced annotations

GO_REF:0000XXX

This describes a method for supplying a NOT-qualified, IKR-evidenced GO annotation to a gene product when general sequence homology considerations would suggest a function, a location, or a role in a biological process, but a curator has determined that key sequence residues known to be required for the expected activity or location are absent, indicating that the gene product is unlikely to be able to carry out the implied activity, be involved in the implied process, or be found in the implied cellular component.

This reference should only be used when an IKR-evidenced annotation is made based on curator judgement from manually reviewing the sequence of the gene product and where no publication can be found to support the curator's conclusion. It is preferable to cite a peer-reviewed publication (such as a PubMed identifier) for IKR-evidenced annotations whenever possible.

Curators will have carefully reviewed the sequence of the annotated protein, and established that the key residues known to be required for an expected activity or location are not present.

Inclusion of an identifier in the 'with/from' field that highlights the lacking residues to the user (e.g. an alignment, domain or rule identifier) is absolutely required when annotating to IKR with this GO_REF.

For more information on the IKR evidence code, please see documentation here