Tips to Produce High Quality Annotations

From GO Wiki

Get the Wider Perspective

  • Favor a gene-by-gene or pathway-by-pathway approach for curation rather than paper-by-paper
  • Read recent publications
  • Look at existing annotations for the same protein and for the term you have chosen to annotate, to ensure consistency. This may trigger the need to revise other annotations
  • Remove incorrect annotations that were based on hypotheses since invalidated
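
The consistency check in the list above can be sketched in code. This is a minimal Python sketch, assuming the existing annotations are available as tab-separated GAF (Gene Association File) rows, which the text above does not specify. The function name `index_annotations` and every identifier in the sample rows are placeholders invented for illustration.

```python
from collections import defaultdict

# Zero-based positions of the GAF columns used here:
# column 3 = object symbol, column 5 = GO ID, column 6 = reference, column 7 = evidence code.
COL_SYMBOL, COL_GO_ID, COL_REFERENCE, COL_EVIDENCE = 2, 4, 5, 6


def index_annotations(gaf_lines):
    """Group annotations by gene symbol, and gene symbols by GO ID."""
    by_gene = defaultdict(list)
    by_term = defaultdict(set)
    for line in gaf_lines:
        if not line.strip() or line.startswith("!"):
            continue  # skip blank lines and header comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 7:
            continue  # not a complete annotation row
        symbol, go_id = cols[COL_SYMBOL], cols[COL_GO_ID]
        by_gene[symbol].append((go_id, cols[COL_EVIDENCE], cols[COL_REFERENCE]))
        by_term[go_id].add(symbol)
    return by_gene, by_term


# Placeholder rows (hypothetical database, genes, GO IDs and PMIDs).
sample = [
    "!gaf-version: 2.2",
    "EXDB\tEX:1\tgeneA\tinvolved_in\tGO:0000001\tPMID:1\tIMP\t\tP",
    "EXDB\tEX:1\tgeneA\tinvolved_in\tGO:0000002\tPMID:2\tIDA\t\tP",
    "EXDB\tEX:2\tgeneB\tinvolved_in\tGO:0000001\tPMID:3\tIMP\t\tP",
]

by_gene, by_term = index_annotations(sample)
print(by_gene["geneA"])             # everything already annotated to the same gene
print(sorted(by_term["GO:0000001"]))  # every gene already annotated to the same term
```

Looking at both groupings before adding an annotation makes it easier to spot an older annotation that the new one contradicts or supersedes.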

Focus on the Research Hypothesis

Use prior knowledge to understand the hypothesis being tested and its relation to the experimental observation.

Capture the Conclusion, not the Assay

Use Caution when Inferring Normal Functions Based on Phenotypes

Phenotypes can help us understand the function of proteins, but they also provide insights into mechanisms leading to disease. The scope of the GO, though, is to capture the normal function of proteins, so special care must be taken to understand how a phenotypic observation informs the normal function of a protein.

Beware of indirect effects of mutations: housekeeping genes, such as those encoding RNA polymerase, affect essentially all cellular processes (cell proliferation, development, etc.) but do not *mediate* these processes.

Phenotypes not supported by a molecular role for the protein: knockouts and knockdowns may result in pleiotropic effects on cell biology, development, etc. Without understanding the molecular mechanism, be careful not to make annotations to Biological Process terms that are more specific than the experiment supports.

Check Term Placement in the Ontology

Represent Current Knowledge

The GO is not an archive of all findings published on a protein. Do not hesitate to remove older annotations that are inconsistent with the current state of knowledge of a protein's role.

Avoiding predatory journals

Make sure you select high-quality papers. Some journals have been labeled 'predatory journals' for their dubious publication practices. Lists of these journals are available; if you are in doubt, consulting one may help you decide whether or not to annotate the paper.


Annotations from large-scale datasets

If you work on a previously unannotated organism, or your research group has a specific research expertise that could be used to produce GO annotations:

  • Contact the GOC to discuss the best approach for your annotations and to ensure you are the only group working on your organism. If you would be interested in taking ownership for an organism with outdated annotations, we can help you find the right people to contact as well.
  • Training of new curators will be arranged, if needed, with an existing GOC mentor.
  • A representative of your group will need to [join GitHub](/docs/how-to-submit-requests/) in order to maintain your group's annotations. Once a representative is designated, the GOC will also generate internal files needed to submit your annotations to GO.

Not enough annotations to justify joining GO?

  • Submit one or just a few manual annotations by adding a new issue on the GOC GitHub Annotation Tracker. Each of your annotations should include at least one key literature reference (PMID) in support of your assertions. Please state whether or not regular updates will be submitted about this annotation.

Automated Annotations

If your group is interested in generating a large number of automated/electronic annotations, please be aware that InterPro2GO is the only source of IEAs (Inferred from Electronic Annotation) recognized by the GOC. Submit your transcripts or other data to UniProt, and they will automatically generate IEAs from your data. Once your organism is in UniProt, contact the GOC and we will gladly assist in curator training so your group can add manual annotations as well.

Reviewing GO annotations associated with a scientific article

Literature annotation involves capturing published information about the exact function of a gene product as GO annotations. This curation process is time-consuming but produces very high quality, species-specific annotation; the accuracy and uniform format of annotations allow the information to be used in high-throughput experiments. GO curation may be best carried out by people who know the function of the gene product and the associated biology in great detail, for example experimental scientists who are familiar with the published literature. If you are an expert in a gene product or a particular field, then you may like to [suggest modifications to the ontology structure](/docs/contributing-to-go-terms/) as well.

Below is a schematic diagram giving an introduction to the steps involved in literature-based GO annotation.

To begin, check if there are existing annotations to the paper: open a Gene Ontology browser (e.g. AmiGO, QuickGO) and enter a PubMed identifier (PMID) for the paper of interest in the 'Search' field.
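
The same lookup can also be run against a local annotation file. The following is a minimal Python sketch, assuming the annotations have been downloaded as tab-separated GAF rows (an assumption, since the text above only describes the browser search). `annotations_citing` and `_row` are hypothetical helper names, and all values in the sample rows are placeholders.

```python
def annotations_citing(gaf_lines, pmid):
    """Return annotation rows whose reference column (GAF column 6) cites the PMID."""
    wanted = f"PMID:{pmid}"
    hits = []
    for line in gaf_lines:
        if not line.strip() or line.startswith("!"):
            continue  # skip blank lines and header comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 7:
            continue  # not a complete annotation row
        # Column 6 may hold several references separated by "|".
        if wanted in cols[5].split("|"):
            hits.append({
                "symbol": cols[2],
                "go_id": cols[4],
                "evidence": cols[6],
                # Column 15 ("assigned by") names the group that made the annotation.
                "assigned_by": cols[14] if len(cols) > 14 else "",
            })
    return hits


def _row(symbol, go_id, refs, evidence, assigned_by):
    """Build a 15-column placeholder GAF row (hypothetical values throughout)."""
    cols = ["EXDB", "EX:1", symbol, "", go_id, refs, evidence, "", "P",
            "", "", "protein", "taxon:0", "20000101", assigned_by]
    return "\t".join(cols)


sample = [
    _row("geneA", "GO:0000001", "PMID:111", "IMP", "ExampleGroup"),
    _row("geneB", "GO:0000002", "EXDB:ref9|PMID:111", "IDA", "ExampleGroup"),
    _row("geneC", "GO:0000003", "PMID:1110", "IDA", "OtherGroup"),
]

for hit in annotations_citing(sample, "111"):
    print(hit["symbol"], hit["go_id"], hit["evidence"], hit["assigned_by"])
```

Matching on the whole reference token rather than on a substring keeps a search for one PMID from also returning a longer PMID that merely starts with the same digits.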

If GO annotations are listed in the results:

  1. Check whether the paper has been annotated by GO curators.
  2. Click on the PMID and browse annotations associated with the paper.
    • If you agree that the annotations accurately represent the data, you are done!
    • If you think the annotations could be improved: Write a new issue on the 'GOC GitHub Annotation Tracker', indicating that these annotations should be reviewed. Include:
      • a PMID
      • the name of the species investigated in the experiment that led to this publication
      • a statement of whether or not regular updates will be submitted about this annotation

If no results are listed using this PMID:

This means the paper has not been annotated by GO curators.

  • Write a new issue on the 'GOC GitHub Annotation Tracker', indicating that this is a new annotation. Include:
    • a PMID
    • the name of the species investigated in the experiment that led to this publication
    • a statement of whether or not regular updates will be submitted about this annotation
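
The items requested for a tracker issue, in both the review case and the new-annotation case, can be assembled as below. This is a minimal Python sketch; the function name `draft_tracker_issue`, the title wording and the field layout are assumptions made for illustration, not a format prescribed by the GOC. The PMID and species in the usage lines are placeholders.

```python
def draft_tracker_issue(pmid, species, regular_updates, has_existing_annotations):
    """Return (title, body) text for an annotation tracker issue."""
    if has_existing_annotations:
        kind = "Review of existing annotations"  # the PMID search returned results
    else:
        kind = "New annotation"                  # the PMID search returned nothing
    updates = "will" if regular_updates else "will not"
    title = f"{kind}: PMID:{pmid}"
    body = "\n".join([
        f"PMID: {pmid}",
        f"Species: {species}",
        f"Regular updates {updates} be submitted about this annotation.",
    ])
    return title, body


# Placeholder values, not a real paper or organism.
title, body = draft_tracker_issue(
    pmid="0000000",
    species="Example species",
    regular_updates=False,
    has_existing_annotations=True,
)
print(title)
print(body)
```

Filling in all three fields up front saves curators from having to ask for the missing ones before they can act on the issue.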

Reviewing GO annotations for a gene or protein

To start, check if there are existing annotations to the gene or protein of interest: open a Gene Ontology browser (e.g. AmiGO, QuickGO) and search for the gene or protein record of interest by entering it in the 'Search' field, then browse the associated annotations and follow the links to see the full list of annotations.
