File Description: go-annotation-changes
Usage
This file is used for QC of annotations in GO releases.
Input data
Calculated from the go-stats.json files from the current and previous releases.
Format(s)
- json
- tsv. Note that the tsv is intended for QC checks and differs slightly from the json (it usually has fewer tables).
File description
The go-annotation-changes file contains the following information:
summary
current
- release_date: Obtained from release/metadata/release-date.json or snapshot/metadata/release-date.json.
- annotations
  - total: Total number of annotations.
  - by_aspect: P, F, C.
  - by_evidence_cluster: PHYLO, IEA, OTHER, EXP, ND, HTP.
  - by_qualifier: contributes_to, colocalizes_with, NOT
- bioentities: Number of bioentities annotated in the GO database.
- taxa: Number of species with annotations.
- taxa_filtered: Number of species with at least 1,000 annotations.
- references: Number of distinct annotated references. This includes PMIDs, GO_REFs, DOIs, and internal IDs for Model Organism Databases and Reactome. Note that a paper with both a PMID and an internal reference ID is counted twice.
- pmids: Number of annotated PMIDs.
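The difference between taxa and taxa_filtered can be sketched as follows. This is only an illustration: the per-taxon mapping and the function name are hypothetical, and the actual layout of go-stats.json is not described on this page.

```python
# Hypothetical input: annotation counts keyed by taxon ID.
# This is NOT the real go-stats.json layout, only an illustration.
FILTER_THRESHOLD = 1000  # "at least 1,000 annotations", per the taxa_filtered definition


def count_taxa(annotations_by_taxon):
    """Return (taxa, taxa_filtered) for a mapping of taxon ID -> annotation count."""
    counts = annotations_by_taxon.values()
    # taxa: species with at least one annotation
    taxa = sum(1 for n in counts if n > 0)
    # taxa_filtered: species that reach the 1,000-annotation threshold
    taxa_filtered = sum(1 for n in counts if n >= FILTER_THRESHOLD)
    return taxa, taxa_filtered
```

A species with exactly 1,000 annotations is counted as filtered here, following the "at least 1,000" wording.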
previous
- Same information as for the current release.
- Note that the date of the previous release is that of the most recent release in release.geneontology.org/, taken from the file release.geneontology.org/YYYY-MM-DD/metadata/release-date.json.
changes
Differences between the current and the previous release for all the fields above. In addition, for taxa, references and pmids, the added and removed items are counted. The 'changes' section contains the following information:
annotations
- total: Change in the total number of annotations.
- by_aspect: Changes in the total number of annotations for each aspect: P, F, C.
- by_evidence_cluster: Changes in the total number of annotations for each evidence cluster (PHYLO, IEA, OTHER, EXP, ND, HTP).
- by_qualifier: Changes in the total number of annotations using qualifiers: contributes_to, colocalizes_with, NOT.
bioentities
Change in the number of annotated bioentities.
taxa
- total: Changes in the total number of annotated species.
- filtered: Change in the number of filtered taxa (species with at least 1,000 annotations).
- added: Number of new species annotated.
- removed: Number of removed species having lost all annotations.
references
- total: Change in the number of annotated references.
- added: Number of newly annotated references (data not yet available).
- removed: Number of references having lost all annotations (data not yet available).
pmids
- total: Change in the number of annotated PMIDs.
- added: Number of newly annotated PMIDs (data not yet available).
- removed: Number of PMIDs having lost all annotations (data not yet available).
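The total, added and removed counts described above amount to a comparison of two sets of identifiers. A minimal sketch, assuming the identifiers of each release are available as plain collections (the function name and inputs are hypothetical, not part of the GO pipeline):

```python
def diff_items(previous, current):
    """Compare two collections of IDs (e.g. taxa or PMIDs) between releases."""
    previous, current = set(previous), set(current)
    return {
        # net change in the number of items
        "total": len(current) - len(previous),
        # items present now but absent from the previous release
        "added": len(current - previous),
        # items that have lost all annotations since the previous release
        "removed": len(previous - current),
    }
```

Note that total is the net change, so it equals added minus removed.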
detailed_changes
All data in this section are shown as x/y, where x is the difference in the total number of annotations and y is the total number of annotations in the current release; the % change is shown in parentheses.
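As an illustration of the x/y notation, the sketch below builds such a string from two totals. It rests on two assumptions that this page does not confirm: the percentage is computed relative to the previous release, and the exact layout of the string (sign, decimals, spacing) is as shown. The function name is hypothetical.

```python
def format_change(previous_total, current_total):
    """Build an 'x/y (p%)' string: x = difference, y = current total."""
    diff = current_total - previous_total
    if previous_total == 0:
        # No baseline to compare against, so no percentage can be given.
        pct = "n/a"
    else:
        # Assumption: % change is relative to the previous release.
        pct = f"{100.0 * diff / previous_total:+.2f}%"
    return f"{diff}/{current_total} ({pct})"
```

For example, going from 1000 to 1050 annotations gives a difference of 50 against a current total of 1050.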
annotations
- total: Change in the total number of annotations.
- by_aspect: Change in the total number of annotations for each aspect: P, F, C.
- by_bioentity_type
  - all: Change in the number of annotations for each bioentity type.
  - cluster: Change in the number of annotations by bioentity type cluster. Not shown in tsv version.
- by_qualifier: Changes in the total number of annotations using qualifiers: contributes_to, colocalizes_with, NOT.
- by_taxon: Change in the number of annotations by species.
- by_evidence
  - all
  - by_evidence_cluster. Not shown in tsv version.
- by_model_organism
  - by_evidence: For each species, the number of annotations is shown for each individual evidence code, detailed by aspect.
  - by_evidence_cluster: For each species, the number of annotations is shown for each evidence cluster, detailed by aspect. Not shown in tsv version.
  - by_qualifier: contributes_to, colocalizes_with, NOT
- by_group: Changes in the total number of annotations for each contributing group, obtained using the 'assigned_by' field.
taxa
- added: List of added species with current number of annotations.
- removed: List of removed species with previous number of annotations.
bioentities
- total: The difference in the total number of annotated bioentities.
- by_type:
  - all: Difference in the number of annotated bioentities.
  - cluster: Difference in the number of annotated bioentities, grouped by bioentity type clusters.
- by_filtered_taxon: Difference in the number of annotated bioentities for each filtered species, detailed by aspect (A, P, F, C).
  - all: Difference in the number of annotated bioentities for each species.
  - cluster: Difference in the number of annotated bioentities grouped by bioentity clusters, for each species.
references
- all
  - total: Change in annotated references.
  - by_filtered_taxon: Change in annotated references by species.
  - by_group: Change in annotated references for each contributing group, obtained using the 'assigned_by' field.
- pmids
  - total: Change in annotated PMIDs.
  - by_filtered_taxon: Change in annotated PMIDs by species.
  - by_group: Change in annotated PMIDs for each contributing group, obtained using the 'assigned_by' field.
Direct access to files
snapshot
- http://snapshot.geneontology.org/release_stats/go-annotation-changes.json
- http://snapshot.geneontology.org/release_stats/go-annotation-changes.tsv
current
- http://current.geneontology.org/release_stats/go-annotation-changes.json
- http://current.geneontology.org/release_stats/go-annotation-changes.tsv
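The files listed above can also be retrieved programmatically. The following is a minimal sketch using only the Python standard library; the helper names (changes_url, load_changes) are hypothetical, and the URLs are those listed on this page.

```python
import json
import urllib.request

# Base locations taken from the URLs listed on this page.
BASE_URLS = {
    "snapshot": "http://snapshot.geneontology.org/release_stats",
    "current": "http://current.geneontology.org/release_stats",
}


def changes_url(channel, fmt="json"):
    """Build the URL of go-annotation-changes for a channel and format."""
    if channel not in BASE_URLS:
        raise ValueError(f"unknown channel: {channel!r}")
    if fmt not in ("json", "tsv"):
        raise ValueError(f"unknown format: {fmt!r}")
    return f"{BASE_URLS[channel]}/go-annotation-changes.{fmt}"


def load_changes(channel="current"):
    """Download and parse the json version (requires network access)."""
    with urllib.request.urlopen(changes_url(channel, "json")) as response:
        return json.load(response)
```

Calling load_changes("snapshot") would return the parsed json as a dictionary; the keys available in it are those described in the sections above.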
Review Status
Last reviewed: March 5, 2020