GO Annotation Basics

From GO Wiki

GO annotation guidelines

1. Generalities

1.1 Semantics of a GO annotation

The association of a GO class with a gene product is a statement about that gene product. What it states depends on the aspect of the ontology:

  • molecular function: molecular activities of gene products
  • cellular component: where gene products are active
  • biological process: pathways and larger processes made up of the activities of multiple gene products.

In other words, annotations represent the normal, in vivo biological role of gene products.
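
As an illustration of these semantics, the sketch below models an annotation as a small record whose meaning depends on its aspect. It is not a GO data format: the class names, the field names, and the gene product "geneX" are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from enum import Enum


class Aspect(Enum):
    """The three GO aspects and what an annotation to each one asserts."""
    MOLECULAR_FUNCTION = "molecular activity of the gene product"
    CELLULAR_COMPONENT = "location where the gene product is active"
    BIOLOGICAL_PROCESS = "larger process the gene product's activity contributes to"


@dataclass(frozen=True)
class Annotation:
    """Hypothetical minimal record linking a gene product to a GO class."""
    gene_product: str  # illustrative identifier, not a real database entry
    go_id: str         # GO class identifier
    go_label: str      # human-readable name of the GO class
    aspect: Aspect

    def statement(self) -> str:
        """Spell out the assertion that this annotation makes."""
        return f"{self.gene_product}: {self.aspect.value} = {self.go_label} ({self.go_id})"


example = Annotation("geneX", "GO:0003677", "DNA binding", Aspect.MOLECULAR_FUNCTION)
print(example.statement())
```

Reading an annotation back as a sentence in this way can help a curator check that it describes a normal, in vivo role rather than an experimental readout.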

1.2 Tips to produce high quality annotations

1.2.1 Get the wider perspective

  • Favor a gene-by-gene or pathway-by-pathway approach for curation rather than paper-by-paper
  • Read recent publications
  • Look at existing annotations for the same protein and for the term you have chosen to annotate, to ensure consistency. This may trigger the need to revise other annotations
  • Remove incorrect annotations based on invalidated hypotheses
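
The consistency check in the third point can be sketched as two lookups: one by gene product and one by GO term. This is a minimal sketch that assumes the existing annotations are already available as simple tuples. The gene names and reference labels are made-up placeholders, and `index_annotations` is a hypothetical helper rather than part of any GO tool.

```python
from collections import defaultdict

# Hypothetical existing annotations: (gene_product, go_id, reference).
# The reference labels are placeholders, not real publications.
annotations = [
    ("geneX", "GO:0003677", "ref-1"),
    ("geneX", "GO:0005634", "ref-2"),
    ("geneY", "GO:0003677", "ref-3"),
]


def index_annotations(records):
    """Group annotations by gene product and by GO term for side-by-side review."""
    by_gene = defaultdict(list)
    by_term = defaultdict(list)
    for gene, go_id, ref in records:
        by_gene[gene].append((go_id, ref))
        by_term[go_id].append((gene, ref))
    return by_gene, by_term


by_gene, by_term = index_annotations(annotations)

# Before annotating geneX, list what is already asserted about it ...
print(by_gene["geneX"])
# ... and see which other gene products carry the term being considered.
print(by_term["GO:0003677"])
```

Reviewing both views before adding a new annotation makes it easier to spot existing annotations that the new evidence contradicts or supersedes.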

1.2.2 Focus on the research hypothesis

Use prior knowledge to understand the hypothesis being tested and its relation to the experimental observation.

1.2.3 Capture the conclusion, not the assay

1.2.4 Inferring normal functions based on phenotypes

Phenotypes can help in understanding the function of proteins, and they also provide insights into the mechanisms leading to disease. The scope of the GO, though, is to capture the normal function of proteins, so special care must be taken to understand what a phenotypic observation reveals about the normal function of a protein.

Beware of indirect effects of mutations. - Housekeeping genes, such as those encoding RNA polymerase, affect essentially all cellular processes (cell proliferation, development, etc.) but do not *mediate* these processes.

Phenotypes not supported by a molecular role for the protein. - Knockouts and knockdowns may result in pleiotropic effects on cell biology, development, etc. Without understanding the molecular mechanism, be careful not to make annotations to Biological Process terms that are more specific than the experiment supports.

1.3 GO aims to represent current knowledge

The GO is not an archive of all findings published on a protein. Do not hesitate to remove older annotations that are inconsistent with the current state of knowledge of a protein's role.