Tips to Produce High Quality Annotations

From GO Wiki

Get the wider perspective

  • Favor a pathway-by-pathway or gene product-by-gene product approach for curation, rather than paper-by-paper.
    • GO annotations should be made considering the overall context in which a gene product functions, which may not be apparent from individual papers alone.
  • Read recent publications, including recent reviews if available.
    • Recent reviews can also be helpful for understanding the overall pathway or process you are annotating.
  • Look at existing annotations for the gene product and for the term you have chosen to annotate, to ensure consistency.
    • This may trigger the need to revise other annotations.
  • Remove incorrect annotations based on invalidated hypotheses.
    • If potentially incorrect annotations have been made by another group, create an issue in the go-annotation tracker.
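
The consistency check described above (looking at existing annotations both for the gene product and for the chosen term) can be sketched as two simple lookups. This is only an illustration: the record layout, the function name index_annotations, and the identifiers "GP:A", "GP:B" and "REF:n" are hypothetical placeholders, not real annotation data or an actual GO tool.

```python
from collections import defaultdict

# Hypothetical annotation records: (gene product ID, GO term ID, reference).
# "GP:A", "GP:B" and "REF:n" are placeholders, not real data.
RECORDS = [
    ("GP:A", "GO:0048870", "REF:1"),
    ("GP:A", "GO:0042060", "REF:2"),
    ("GP:B", "GO:0048870", "REF:3"),
]


def index_annotations(records):
    """Build two indexes: terms per gene product, and gene products per term."""
    by_product = defaultdict(set)
    by_term = defaultdict(set)
    for product, term, _reference in records:
        by_product[product].add(term)
        by_term[term].add(product)
    return by_product, by_term


by_product, by_term = index_annotations(RECORDS)

# Everything already annotated to the gene product under review.
print(sorted(by_product["GP:A"]))
# Every gene product already annotated to the term you intend to use.
print(sorted(by_term["GO:0048870"]))
```

Reviewing both listings side by side before adding a new annotation makes it easier to spot annotations that may need revising.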

Focus on the research hypothesis

  • Use existing knowledge to understand the hypothesis being tested and its relation to the experimental observation.

Capture the conclusion, not the assay

  • Some assays appear to correspond directly to GO terms.
  • Be careful when interpreting the results of these assays, as they may not reflect the actual role of the gene product in a process.
    • Examples are:
      • DNA fragmentation, corresponding to the GO term 'GO:0006309 apoptotic DNA fragmentation'.
      • caspase activation, corresponding to the GO term 'GO:0006919 activation of cysteine-type endopeptidase activity involved in apoptotic process'.
        • These assays are often used to measure whether apoptosis occurred, but NOT to measure a specific, direct role of a gene product in that process.
      • wound healing assay, which is an assay for 'GO:0048870 cell motility', and NOT 'GO:0042060 wound healing'.

Annotate gene products to their normal function

  • The scope of the GO is to capture the normal function of gene products, so special care must be taken to understand how experimental observations inform this.
  • Mutant phenotypes are an important source of GO annotations, but curators should consider carefully how to use that information for annotation.
  • Specific guidelines for annotating mutant phenotypes are available here.

Check term placement in the ontology

Make sure that the parents of the term are also consistent with the annotation. For example, 'GO:0015616 DNA translocase activity' is a 'GO:0008094 DNA-dependent ATPase activity', although one cannot tell from the term name alone that DNA translocase is a type of ATPase.
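
As a rough illustration of this check, the sketch below walks is_a links upward from a term and lists every ancestor that an annotation to that term implies. The graph is a hand-written toy containing only the single relationship mentioned above; the names IS_A, LABELS, ancestors and describe_parents are illustrative assumptions, and a real check would use the full ontology rather than this dictionary.

```python
from collections import deque

# Toy is_a graph: child term -> list of direct parents.
# Only the one edge mentioned in the text is included (an assumption made
# for illustration); the real ontology contains many more terms and edges.
IS_A = {
    "GO:0015616": ["GO:0008094"],
}

LABELS = {
    "GO:0015616": "DNA translocase activity",
    "GO:0008094": "DNA-dependent ATPase activity",
}


def ancestors(term, is_a):
    """Return every term reachable from `term` by following is_a edges."""
    seen = set()
    queue = deque(is_a.get(term, []))
    while queue:
        parent = queue.popleft()
        if parent in seen:
            continue
        seen.add(parent)
        queue.extend(is_a.get(parent, []))
    return seen


def describe_parents(term, is_a=IS_A, labels=LABELS):
    """List the ancestors implicitly asserted when annotating to `term`."""
    return sorted(
        f"{t} {labels.get(t, '(label unknown)')}" for t in ancestors(term, is_a)
    )


for line in describe_parents("GO:0015616"):
    print(line)
```

Reading the printed ancestors before committing an annotation makes implied statements, such as ATPase activity in this example, visible even when the term name does not suggest them.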

Represent current knowledge

The GO is not an archive of all findings published on a protein. Do not hesitate to remove older annotations that are inconsistent with the current state of knowledge about a protein's role.

Annotations from large-scale datasets

The term high-throughput data is often used to describe data that have been generated by automatic or semi-automatic methods without validation of the results for individual gene products. Such experiments can be viewed as screens: they are performed in parallel without explicit target selection and are generally not hypothesis-driven. More details on how to annotate high-throughput data can be found on the [Inferred from High_Throughput Experiment (HTP)] page.

Avoiding predatory journals

Make sure you select high-quality papers. Some journals have been labeled 'predatory journals' because of their dubious publication practices. A list of these journals is available here: https://predatoryjournals.com/journals/. If you are in doubt, this list may help you decide whether or not to annotate a paper.

More information about predatory journals may be found in these articles:

Review Status

Last reviewed: April 11, 2019

