Tips to Produce High Quality Annotations: Difference between revisions

From GO Wiki


Revision as of 11:47, 5 March 2019

Tips to Produce High Quality Annotations

 See also http://wiki.geneontology.org/index.php/Annotation_conventions

Key considerations

Get the Wider Perspective

  • Favor a pathway-by-pathway or gene product-by-gene product approach for curation rather than paper-by-paper
    • GO annotations should be made considering the overall context in which a gene product functions, which may not be apparent from just looking at individual papers
  • Read recent publications
    • Recent reviews can also be helpful for understanding the overall pathway or process you are annotating
  • Look at existing annotations for the gene product and for the term you have chosen to annotate, to ensure consistency.
    • This may trigger the need to revise other annotations.
  • Remove incorrect annotations based on invalidated hypotheses.
    • If potentially incorrect annotations have been made by another group, create an issue in the go-annotation tracker.

Focus on the Research Hypothesis

  • Use existing knowledge to understand the hypothesis being tested and its relation to the experimental observation.

Capture the Conclusion, not the Assay

  • Some assays can appear to correspond directly to GO terms.
  • Be careful when interpreting the results of these assays, as they may not reflect the actual role of the gene product in a process.
    • Examples are:
      • 'apoptotic DNA fragmentation' (GO:0006309)
      • caspase activation, corresponding to the GO term 'GO:0006919 activation of cysteine-type endopeptidase activity involved in apoptotic process'.
    • These assays are often used to measure whether apoptosis occurred, but NOT to measure a specific, direct role of a gene product in that process.
    • If the molecular function of the gene product is not directly involved in these processes, consider annotating to 'GO:0006915 apoptotic process' or to 'GO:0042981 regulation of apoptotic process', or using the ['acts upstream of or within'](http://wiki.geneontology.org/index.php/Acts_upstream_of_or_within) gene product-to-GO term relation (or its positive or negative effect child relations) to associate the gene product with the GO process term.
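
The caution above could be turned into a simple review aid. The following is a minimal sketch, not a GOC tool: it flags annotations to the two assay-like terms named in the text so a curator can check whether the gene product plays a direct role. The input shape (pairs of gene product and GO ID) and the function name `flag_assay_like` are assumptions made for illustration.

```python
# Terms named in the guidance as assay readouts rather than conclusions.
ASSAY_LIKE_TERMS = {
    "GO:0006309": "apoptotic DNA fragmentation",
    "GO:0006919": "activation of cysteine-type endopeptidase activity "
                  "involved in apoptotic process",
}

# Broader alternatives suggested in the guidance.
SUGGESTED_ALTERNATIVES = [
    "GO:0006915 apoptotic process",
    "GO:0042981 regulation of apoptotic process",
]


def flag_assay_like(annotations):
    """Return (gene_product, go_id, note) for annotations needing a second look.

    `annotations` is an iterable of (gene_product, go_id) pairs; this input
    shape is an assumption made for the sketch.
    """
    flagged = []
    for gene_product, go_id in annotations:
        if go_id in ASSAY_LIKE_TERMS:
            note = ("check for a direct role; otherwise consider "
                    + " or ".join(SUGGESTED_ALTERNATIVES))
            flagged.append((gene_product, go_id, note))
    return flagged
```

A flag is only a prompt for manual review; whether the annotation stands still depends on what the paper shows about the gene product's molecular role.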

Use Caution when Inferring Normal Functions Based on Phenotypes

Phenotypes can help us understand the function of proteins, and they also provide insights into mechanisms leading to disease. The scope of the GO, though, is to capture the normal function of proteins, so special care must be taken to determine what a phenotypic observation reveals about the normal function of a protein.

Beware of indirect effects of mutations. - Housekeeping gene products, such as RNA polymerase, affect essentially all cellular processes (cell proliferation, development, etc.) but do not *mediate* these processes.

Phenotypes not supported by a molecular role for the protein. - Knockouts and knockdowns may result in pleiotropic effects on cell biology, development, etc. Without understanding the molecular mechanism, be careful not to make annotations to Biological Process terms that are more specific than the experiment allows one to conclude.

Check Term Placement in the Ontology

Make sure that the parents of the term are also consistent with the annotation. For example, 'GO:0015616 DNA translocase activity' is a 'GO:0008094 DNA-dependent ATPase activity', although one cannot tell from the term name alone that a DNA translocase is a type of ATPase.
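
One way to make this check concrete is to list every ancestor of the chosen term and read through them before annotating. The sketch below walks a toy is_a graph; only the single edge shown comes from the example above, and the dictionary layout and the function name `ancestors` are assumptions for illustration. A real check would load the full ontology instead.

```python
# Toy is_a graph: child term -> list of direct parents.
# Only this one edge comes from the text; everything else would have to be
# loaded from the ontology itself.
IS_A = {
    "GO:0015616": ["GO:0008094"],  # DNA translocase activity is_a DNA-dependent ATPase activity
}


def ancestors(term, is_a=IS_A):
    """Return every term reachable from `term` by following is_a edges."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in is_a.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


if __name__ == "__main__":
    # Each ancestor listed here should also be true of the gene product.
    for parent in sorted(ancestors("GO:0015616")):
        print(parent)
```

If any ancestor does not hold for the gene product, the chosen term is probably too specific or misplaced for that annotation.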

Represent Current Knowledge

The GO is not an archive of all findings published on a protein. Do not hesitate to remove older annotations that are inconsistent with the current state of knowledge of a protein's role.

Other important points to consider

Avoiding predatory journals

Make sure you select high-quality papers. Some journals have been labeled 'predatory journals' for their dubious practices with respect to the publication process. There is a list of these journals here: https://predatoryjournals.com/journals/. If you are in doubt, this list may help you decide whether or not to annotate the paper.

More information about predatory journals may be found in these articles:


  From http://geneontology.org/page/annotations-from-article
  http://geneontology.org/page/annotations-gene-prot
  http://geneontology.org/contribute-large-dataset

Annotations from large-scale datasets

If you work on a previously unannotated organism, or your research group has a specific research expertise that could be used to produce GO annotations:

  • [Contact the GOC](http://help.geneontology.org/) to discuss the best approach for your annotations and to ensure you are the only group working on your organism. If you would be interested in taking ownership for an organism with outdated annotations, we can help you find the right people to contact as well.
  • Training of new curators will be arranged, if needed, with an existing GOC mentor.
  • A representative of your group will need to [join GitHub](/docs/how-to-submit-requests/) in order to maintain your group's annotations. Once a representative is designated, the GOC will also generate internal files needed to submit your annotations to GO.

Not enough annotations to justify joining GO?

  • Submit one or just a few manual annotations by adding a new issue on the [GOC GitHub Annotation Tracker](https://github.com/geneontology/go-annotation/issues). Each of your annotations should include at least one key literature reference (PMID) in support of your assertions. Please state whether or not regular updates will be submitted about this annotation.

Automated Annotations

If your group is interested in generating a large number of automated/electronic annotations, please be aware that InterPro2GO is the only source of [IEAs, Inferred from Electronic Annotation](http://wiki.geneontology.org/index.php/Inferred_from_Electronic_Annotation_(IEA)) recognized by the GOC. Submit your transcripts or other data to UniProt, and they will automatically generate IEAs from your data. Once your organism is in UniProt, [contact the GOC](http://help.geneontology.org/) and we will gladly assist in curator training so your group can add manual annotations as well.

Reviewing GO annotations associated with a scientific article

Literature annotation involves capturing published information about the exact function of a gene product as GO annotations. This curation process is time-consuming but produces very high-quality, species-specific annotation; the accuracy and uniform format of annotations allow the information to be used in high-throughput experiments. GO curation may be best carried out by people who know the function of the gene product and the associated biology in great detail, for example experimental scientists who are familiar with the published literature. If you are an expert in a gene product or a particular field, then you may like to [suggest modifications to the ontology structure](/docs/contributing-to-go-terms/) as well.

Below is a schematic diagram giving an introduction to the steps involved in literature-based GO annotation. http://geneontology.org/sites/default/files/public/diag-literature-annot.png

To begin, check whether there are existing annotations to the paper: open a Gene Ontology browser (e.g. [AmiGO](http://amigo.geneontology.org/amigo), [QuickGO](https://www.ebi.ac.uk/QuickGO/)) and enter a PubMed identifier (PMID) for the paper of interest in the 'Search' field.
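
The same lookup can be approximated offline against a downloaded annotation file. The sketch below assumes the tab-separated GAF layout, in which column 5 holds the GO ID, column 6 the reference (DB:Reference, pipe-separated when there are several) and column 7 the evidence code. The function name `annotations_for_pmid` and the sample rows are hypothetical and were made up for illustration.

```python
def annotations_for_pmid(gaf_lines, pmid):
    """Return (gene_symbol, go_id, evidence_code) for GAF rows citing `pmid`."""
    wanted = "PMID:" + str(pmid)
    hits = []
    for line in gaf_lines:
        # Skip header/comment lines (they start with '!') and blank lines.
        if line.startswith("!") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 7:
            continue  # malformed row; ignored in this sketch
        references = cols[5].split("|")  # a row may cite several references
        if wanted in references:
            hits.append((cols[2], cols[4], cols[6]))
    return hits


# Hypothetical rows, invented for illustration only.
SAMPLE = [
    "!gaf-version: 2.2",
    "ExDB\tX0001\tgeneA\tinvolved_in\tGO:0006915\tPMID:0000001\tIMP\t\tP",
    "ExDB\tX0002\tgeneB\tinvolved_in\tGO:0042981\tPMID:0000002|ExDB_REF:1\tIDA\t\tP",
]

if __name__ == "__main__":
    for row in annotations_for_pmid(SAMPLE, "0000002"):
        print(row)
```

An empty result corresponds to the "no results" case described below; a non-empty one gives the list of annotations to review.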

If GO annotations are listed in the results:

  1. Check whether the paper has been annotated by GO curators.
  2. Click on the PMID and browse annotations associated with the paper.
    • If you agree that the annotations accurately represent the data, you are done!
    • If you think the annotations could be improved: Write a new issue on the 'GOC GitHub Annotation Tracker', indicating that these annotations should be reviewed. Include:
      • a PMID
      • the name of the species investigated in the experiment that led to this publication
      • Please state whether or not regular updates will be submitted about this annotation.

If no results are listed using this PMID:

This means the paper has not been annotated by GO curators.

  • Write a new issue on the 'GOC GitHub Annotation Tracker', indicating that this is a new annotation. Include:
    • a PMID
    • the name of the species investigated in the experiment that led to this publication
    • Please state whether or not regular updates will be submitted about this annotation.
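
Both cases above ask for the same three pieces of information in the tracker issue. As a minimal sketch, the helper below assembles them into plain text that can be pasted into a new issue. The layout and the function name `issue_body` are hypothetical; the text above does not prescribe a template, only the content.

```python
def issue_body(pmid, species, regular_updates, is_new):
    """Build plain text for an annotation tracker issue.

    The wording and layout are assumptions for this sketch; only the three
    requested items (PMID, species, update statement) come from the guidance.
    """
    kind = ("New annotation request" if is_new
            else "Request to review existing annotations")
    updates = "will" if regular_updates else "will not"
    return "\n".join([
        kind,
        f"PMID: {pmid}",
        f"Species: {species}",
        f"Regular updates {updates} be submitted about this annotation.",
    ])


if __name__ == "__main__":
    # Placeholder values, not a real paper.
    print(issue_body("0000001", "Homo sapiens", regular_updates=False, is_new=True))
```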

Reviewing GO annotations for a gene or protein:

To start, check whether there are existing annotations to the gene or protein of interest: open a Gene Ontology browser (e.g. AmiGO, QuickGO) and search for the gene or protein record of interest by entering it in the 'Search' field, then browse the associated annotations and follow the links to see the full list of annotations.

