File Description: go-annotation-changes

Usage

This file is used for doing QC on annotations for GO releases.

Input data

Calculated from the go-stats.json files from the current and previous releases.
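As a minimal sketch of this input step, the two stats files could be loaded as shown below. The helper names load_stats and load_release_pair and any file paths passed to them are hypothetical. The internal layout of go-stats.json is not described on this page, so the code treats each file as an opaque JSON object.

```python
import json
from pathlib import Path


def load_stats(path):
    """Load one go-stats.json file and return its parsed content."""
    with Path(path).open(encoding="utf-8") as handle:
        return json.load(handle)


def load_release_pair(current_path, previous_path):
    """Return (current, previous) stats as parsed JSON objects."""
    return load_stats(current_path), load_stats(previous_path)
```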

Format(s)

  • json
  • tsv

File description

The go-annotation-changes file contains the following information:

summary

current

  • release_date: Obtained from release/metadata/release-date.json or snapshot/metadata/release-date.json.
  • annotations
    • total: Total number of annotations.
    • by_aspect: P, F, C.
    • by_evidence_cluster: PHYLO, IEA, OTHER, EXP, ND, HTP.
    • by_qualifier: contributes_to, colocalizes_with, NOT
  • bioentities: Number of bioentities annotated in the GO database.
  • taxa: Number of species with annotations.
  • taxa_filtered: Number of species with at least 1,000 annotations.
  • references: Number of distinct annotated references, including PMIDs, GO_REFs, DOIs, and internal IDs for Model Organism Databases and Reactome. Note that a paper with both a PMID and an internal reference ID is counted twice.
  • pmids: Number of annotated PMIDs.
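
The counting rules for taxa, taxa_filtered, references and pmids can be illustrated with a short sketch. This is not the pipeline's actual code: the function names, the input shapes (a taxon-to-count mapping and a list of reference IDs) and the "PMID:" prefix test are assumptions made for illustration.

```python
FILTER_THRESHOLD = 1000  # "at least 1,000 annotations", as stated in the list above


def count_taxa(annotations_per_taxon):
    """Return (taxa, taxa_filtered) from a {taxon_id: annotation_count} mapping."""
    counts = annotations_per_taxon.values()
    taxa = sum(1 for n in counts if n > 0)
    taxa_filtered = sum(1 for n in counts if n >= FILTER_THRESHOLD)
    return taxa, taxa_filtered


def count_references(reference_ids):
    """Return (references, pmids) from an iterable of reference ID strings.

    Each distinct ID counts once, so a paper cited under both a PMID and an
    internal ID contributes two references but only one PMID.
    """
    distinct = set(reference_ids)
    pmids = {ref for ref in distinct if ref.startswith("PMID:")}  # assumed prefix
    return len(distinct), len(pmids)
```

For example, a paper annotated under both a PMID and a Model Organism Database ID adds two entries to the set of distinct references, which is why references can exceed the number of distinct papers.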

previous

  • Same information as for the current release.
  • Note that the previous release is the most recent release listed in release.geneontology.org/; its date is read from release.geneontology.org/YYYY-MM-DD/metadata/release-date.json.
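
A sketch of how the previous release could be picked from a list of directory names in YYYY-MM-DD form. The function name is hypothetical, and the rule "strictly earlier than the current release date" is an assumption; the page only states that the most recent release is used.

```python
from datetime import date


def latest_previous_release(release_dirs, current_date):
    """Return the most recent release date strictly before current_date.

    release_dirs: iterable of directory names; names that are not valid
    ISO dates are ignored. Returns None if no earlier release exists.
    """
    candidates = []
    for name in release_dirs:
        try:
            release_date = date.fromisoformat(name)
        except ValueError:
            continue  # skip anything that is not a dated release directory
        if release_date < current_date:
            candidates.append(release_date)
    return max(candidates, default=None)
```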

changes

Differences between the current and the previous release for all the fields above. In addition, for taxa, references and pmids, the numbers of added and removed items are counted.
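
The total, added and removed counts can be sketched with set operations. The function name and the input shape (collections of identifiers for each release) are assumptions, not the pipeline's actual implementation.

```python
def summarize_change(current_items, previous_items):
    """Compare two collections of identifiers (e.g. taxa or PMIDs).

    total   : change in the number of distinct items
    added   : items present now but absent from the previous release
    removed : items present previously but absent now
    """
    current_set = set(current_items)
    previous_set = set(previous_items)
    return {
        "total": len(current_set) - len(previous_set),
        "added": len(current_set - previous_set),
        "removed": len(previous_set - current_set),
    }
```

Note that total always equals added minus removed, which gives a quick consistency check when reviewing a release.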

detailed changes

annotations

  • total: Change in the total number of annotations.
  • by_aspect: Changes in the total number of annotations for each aspect: P, F, C.
  • by_evidence_cluster: Changes in the total number of annotations for each evidence cluster (PHYLO, IEA, OTHER, EXP, ND, HTP).

bioentities

Change in the number of annotated bioentities.

taxa

  • total: Changes in the total number of annotated species.
  • filtered: Change in the number of filtered species (those with at least 1,000 annotations).
  • added: Number of new species annotated.
  • removed: Number of removed species having lost all annotations.

references

  • total: Change in the number of annotated references.
  • added: Number of newly annotated references (data not yet available).
  • removed: Number of references having lost all annotations (data not yet available).

pmids

  • total: Change in the number of annotated PMIDs.
  • added: Number of newly annotated PMIDs (data not yet available).
  • removed: Number of PMIDs having lost all annotations (data not yet available).

detailed_changes

All data in this section is shown as x/y, where x is the difference in the total number of annotations, y is the total number of annotations in the current release, and the % change is shown in parentheses.
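
A sketch of how one such value could be built from two totals. Several details are assumptions: that the percentage is taken relative to the previous total, the explicit sign, the two decimal places, and the "n/a" placeholder when the previous total is zero. The function name is hypothetical.

```python
def format_change(current_total, previous_total):
    """Format a change as 'x/y (p%)'.

    x is current_total - previous_total, y is current_total, and p is the
    change relative to previous_total (an assumed convention).
    """
    diff = current_total - previous_total
    if previous_total:
        percent = f"{100 * diff / previous_total:+.2f}%"
    else:
        percent = "n/a"  # percentage undefined when there was nothing before
    return f"{diff:+d}/{current_total} ({percent})"
```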

annotations

taxa

  • added: List of added species with current number of annotations.
  • removed: List of removed species with previous number of annotations.

bioentities

  • total: The difference in the total number of annotated bioentities.
  • by_type:
  • by_filtered_taxon: Difference in the number of annotated bioentities for each filtered species, detailed by aspect (A, P, F, C).
    • all: Difference in the number of annotated bioentities for each species.
    • cluster: Difference in the number of annotated bioentities grouped by bioentity clusters, for each species.

references

  • all
    • total: Change in annotated references.
    • by_filtered_taxon: Change in annotated references by species.
    • by_group: Change in annotated references for each contributing group, obtained using the 'assigned_by' field.
  • pmids
    • total: Change in annotated PMIDs.
    • by_filtered_taxon: Change in annotated PMIDs by species.
    • by_group: Change in annotated PMIDs for each contributing group, obtained using the 'assigned_by' field.

Direct access to files

snapshot

current

Review Status

Last reviewed: October 24, 2019