File Description: go-annotation-changes


Latest revision as of 21:25, 5 March 2020

Usage

This file is used for doing QC on annotations for GO releases.

Input data

Calculated from the go-stats.json files from the current and previous releases.
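
The calculation described above can be sketched roughly as follows: the counts from the previous release are subtracted from those of the current release, key by key. This is only an illustration. The function name `diff_counts`, the treatment of missing keys as 0, and the sample numbers are assumptions, and the real go-stats.json layout may differ.

```python
def diff_counts(current, previous):
    """Recursively compute current - previous for nested dicts of numbers.

    Keys missing on one side are treated as 0 (an assumption of this sketch).
    """
    result = {}
    for key in sorted(set(current) | set(previous)):
        cur = current.get(key, 0)
        prev = previous.get(key, 0)
        if isinstance(cur, dict) or isinstance(prev, dict):
            # Descend into nested tables such as by_aspect.
            result[key] = diff_counts(
                cur if isinstance(cur, dict) else {},
                prev if isinstance(prev, dict) else {},
            )
        else:
            result[key] = cur - prev
    return result


# Made-up numbers, for illustration only.
previous = {"annotations": {"total": 100, "by_aspect": {"P": 50, "F": 30, "C": 20}}}
current = {"annotations": {"total": 110, "by_aspect": {"P": 55, "F": 30, "C": 25}}}
changes = diff_counts(current, previous)
```

A recursive walk is used here so that the same function covers both top-level totals and nested breakdowns.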

Format(s)

  • json
  • tsv. Note that the tsv is for QC checks and is slightly different (usually, fewer tables) compared to json.

File description

The go-annotation-changes file contains the following information:

summary

current

  • release_date: Obtained from release/metadata/release-date.json or snapshot/metadata/release-date.json.
  • annotations
    • total: Total number of annotations.
    • by_aspect: P, F, C.
    • by_evidence_cluster: PHYLO, IEA, OTHER, EXP, ND, HTP.
    • by_qualifier: contributes_to, colocalizes_with, NOT
  • bioentities: Number of bioentities annotated in the GO database.
  • taxa: Number of species with annotations.
  • taxa_filtered: Number of species with at least 1,000 annotations.
  • references: Number of distinct annotated references (includes PMIDs, GO_REFs, DOIs, and internal IDs for Model Organism Databases and Reactome). Note that a paper with both a PMID and an internal reference ID is counted twice.
  • pmids: Number of annotated PMIDs.
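
As a minimal sketch of reading the fields listed above, the snippet below pulls the headline totals out of a parsed JSON document. The key path `summary` > `current` follows the headings on this page, but the exact nesting, the value types, and the helper name `current_totals` are assumptions; the sample values are placeholders, not real release figures.

```python
import json


def current_totals(report):
    """Collect the headline numbers of the current release from a parsed report."""
    cur = report["summary"]["current"]
    return {
        "release_date": cur["release_date"],
        "annotations": cur["annotations"]["total"],
        "bioentities": cur["bioentities"],
        "taxa": cur["taxa"],
        "taxa_filtered": cur["taxa_filtered"],
        "references": cur["references"],
        "pmids": cur["pmids"],
    }


# Placeholder document with made-up values (assumed layout).
sample = json.loads("""
{
  "summary": {
    "current": {
      "release_date": "2020-01-01",
      "annotations": {"total": 110},
      "bioentities": 40,
      "taxa": 12,
      "taxa_filtered": 3,
      "references": 25,
      "pmids": 20
    }
  }
}
""")
totals = current_totals(sample)
```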


previous

  • Same information as for the current release.
  • Note that the previous release is the most recent release in release.geneontology.org/; its date is read from the file release.geneontology.org/YYYY-MM-DD/metadata/release-date.json.
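
Selecting the most recent release from a set of YYYY-MM-DD directory names could look like the sketch below. It assumes the directory names are already available as a list of strings; how the listing is obtained is not covered here. The function name `latest_release` and the sample names are hypothetical.

```python
from datetime import date


def latest_release(names, before=None):
    """Return the most recent YYYY-MM-DD entry in names, or None if there is none.

    Entries that are not valid ISO dates are ignored. If `before` is given,
    only releases strictly earlier than that date are considered.
    """
    dates = []
    for name in names:
        try:
            dates.append(date.fromisoformat(name.strip("/")))
        except ValueError:
            continue  # not a dated release directory
    if before is not None:
        dates = [d for d in dates if d < before]
    return max(dates) if dates else None


# Made-up directory names, for illustration only.
listing = ["2019-12-09/", "2020-01-01/", "README"]
previous_release = latest_release(listing)
```

Parsing the names into `date` objects, rather than comparing raw strings, makes invalid entries easy to skip.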


changes

Differences between the current and the previous release for all the fields above. In addition, for taxa, references and pmids, the numbers of added and removed items are counted. The 'changes' section contains the following information:

annotations

  • total: Change in the total number of annotations.
  • by_aspect: Changes in the total number of annotations for each aspect: P, F, C.
  • by_evidence_cluster: Changes in the total number of annotations for each evidence cluster (PHYLO, IEA, OTHER, EXP, ND, HTP).
  • by_qualifier: Changes in the total number of annotations using qualifiers: contributes_to, colocalizes_with, NOT.

bioentities

Change in the number of annotated bioentities.

taxa

  • total: Changes in the total number of annotated species.
  • filtered: Change in the number of filtered taxa (species with at least 1,000 annotations).
  • added: Number of new species annotated.
  • removed: Number of species removed after losing all annotations.
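
The added and removed counts for taxa amount to two set differences between the species annotated in each release. The sketch below illustrates this with hypothetical taxon identifiers; the function name `taxa_changes` and the input format (plain iterables of IDs) are assumptions.

```python
def taxa_changes(current_taxa, previous_taxa):
    """Compare the sets of annotated species in two releases."""
    current_set = set(current_taxa)
    previous_set = set(previous_taxa)
    added = sorted(current_set - previous_set)      # annotated now, not before
    removed = sorted(previous_set - current_set)    # lost all annotations
    return {
        "total": len(current_set) - len(previous_set),
        "added": len(added),
        "removed": len(removed),
        "added_ids": added,
        "removed_ids": removed,
    }


# Placeholder identifiers, not real taxa.
result = taxa_changes(
    current_taxa=["taxon:A", "taxon:B", "taxon:D"],
    previous_taxa=["taxon:A", "taxon:B", "taxon:C"],
)
```

Note that `total` can be 0 even when species were both added and removed, which is why the two counts are reported separately.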

references

  • total: Change in the number of annotated references.
  • added: Number of newly annotated references (data not yet available).
  • removed: Number of references having lost all annotations (data not yet available).

pmids

  • total: Change in the number of annotated PMIDs.
  • added: Number of newly annotated PMIDs (data not yet available).
  • removed: Number of PMIDs having lost all annotations (data not yet available).


detailed_changes

All data in this section is shown as x/y, where x is the difference in the total number of annotations between the current and previous releases, y is the total number of annotations in the current release, and the % change is shown in parentheses.
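
The x/y notation can be illustrated with a small formatting helper. This is a sketch under assumptions: the page does not say whether the percentage is relative to the previous or the current release, so the previous release is assumed here, and the sign, the two-decimal rounding, and the name `format_change` are likewise illustrative choices.

```python
def format_change(current, previous):
    """Format a change as 'x/y (p%)': difference, current total, percent change.

    The percentage is computed relative to the previous release (an assumption).
    It is omitted when the previous count is 0, since it would be undefined.
    """
    diff = current - previous
    if previous == 0:
        return f"{diff}/{current}"
    pct = 100.0 * diff / previous
    return f"{diff}/{current} ({pct:+.2f}%)"


# Made-up counts, for illustration only.
grew = format_change(110, 100)    # "10/110 (+10.00%)"
shrank = format_change(90, 100)   # "-10/90 (-10.00%)"
```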

annotations

  • total: Change in the total number of annotations.
  • by_aspect: Change in the total number of annotations for each aspect: P, F, C.
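  • by_bioentity_type
    • all: Change in the number of annotations for each bioentity type.
    • cluster: Change in the number of annotations by bioentity type cluster. Not shown in tsv version.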
  • by_qualifier: Changes in the total number of annotations using qualifiers: contributes_to, colocalizes_with, NOT.
  • by_taxon: Change in the number of annotations by species.
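  • by_evidence
    • all (see http://geneontology.org/docs/guide-go-evidence-codes)
    • by_evidence_cluster: Not shown in tsv version.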
  • by_model_organism
    • by_evidence: For each species, the number of annotations is shown for each individual evidence code, detailed by aspect.
    • by_evidence cluster: For each species, the number of annotations is shown for each evidence cluster, detailed by aspect. Not shown in tsv version.
    • by_qualifier: contributes_to, colocalizes_with, NOT
  • by_group: Changes in the total number of annotations for each contributing group, obtained using the 'assigned_by' field.

taxa

  • added: List of added species with current number of annotations.
  • removed: List of removed species with previous number of annotations.

bioentities

  • total: The difference in the total number of annotated bioentities.
  • by_type:
  • by_filtered_taxon: Difference in the number of annotated bioentities for each filtered species, detailed by aspect (A, P, F, C).
    • all: Difference in the number of annotated bioentities for each species.
    • cluster: Difference in the number of annotated bioentities grouped by bioentity clusters, for each species.

references

  • all
    • total: Change in annotated references.
    • by_filtered_taxon: Change in annotated references by species.
    • by_group: Change in annotated references for each contributing group, obtained using the 'assigned_by' field.
  • pmids
    • total: Change in annotated PMIDs.
    • by_filtered_taxon: Change in annotated PMIDs by species.
    • by_group: Change in annotated PMIDs for each contributing group, obtained using the 'assigned_by' field.

Direct access to files

snapshot

current

Review Status

Last reviewed: March 5, 2020