Difference between revisions of "File Description: go-stats"

From GO Wiki
m (taxa)
m (bioentities)
Line 50:
  ==bioentities==
- * total: total number of annotated bioentities
+ *'''total: ''' Total number of annotated bioentities.
  * by_type
- ** all (see list of bioentity types)
+ ** *'''[[GO_stats-glossary#bioentity_type |by_type]]:''' Number of annotated bioentities by [[GO_stats-glossary#bioentity_type_cluster |bioentity type]].
- ** cluster
+ ** *'''[[GO_stats-glossary#bioentity_type_cluster |by_type_cluster]]:''' Number of annotated bioentities grouped by [[GO_stats-glossary#bioentity_type_cluster |clusters]].
- * by_filtered_taxon
+ * **'''[[GO_stats-glossary#filtered_taxa |by_filtered_taxon]]:
- ** all: number of annotations for each species, by bioentity_type and by aspect.
+ ** [[GO_stats-glossary#bioentity_type |all]]:: number of annotations for each species, by bioentity_type and by [[GO_stats-glossary#aspect |aspect]].
- ** cluster: number of annotations for each species, by bioentity_type_cluster and by aspect.
+ ** [[GO_stats-glossary#bioentity_type_cluster |by_type_cluster]]: number of annotations for each species, by [[GO_stats-glossary#bioentity_type_cluster |bioentity_type_cluster]] and by [[GO_stats-glossary#aspect |aspect]].
+
  
  ==references==
  * all

Revision as of 06:55, 24 October 2019

  IN PROGRESS 

Usage

Primary stat file computed.

Input data

Annotation stats are obtained by querying GOlr[1]. ***IS THIS THE RIGHT LINK???***

Format(s)

JSON

File description

The go-stats file contains the following information:

release_date

  • release_date: Obtained from release/metadata/release-date.json (or snapshot/metadata/release-date.json).

ontology

  • valid_terms: Total number of valid terms (non-obsolete) in the ontology.
  • obsolete_terms: Total number of terms with obsolete status, i.e., term_ids for which the is_obsolete field is true in the go.obo file. This count excludes merged terms.
  • merged_terms: Total number of merged terms, calculated by counting the term_ids for which the is_obsolete field is true in the go.obo file and that also appear as alt_ids of a valid term.
  • biological_process_terms: Total number of valid terms for the biological_process aspect.
  • molecular_function_terms: Total number of valid terms for the molecular_function aspect.
  • cellular_component_terms: Total number of valid terms for the cellular_component aspect.
  • meta_statements: Total number of identifiers, alternative identifiers, namespaces, term labels, comments, synonyms, definitions and subsets, counted for each valid term.
  • cross_references: Total number of cross-references, counted from the xref field of the go.obo file.
  • terms_relations: Total number of relations, counted from the is_a, intersection_of and relationship fields of the go.obo file.
  • changes_created_terms: Number of created terms since the previous release.
  • changes_obsolete_terms: Number of terms obsoleted since the previous release.
  • changes_merged_terms: Number of terms merged since the previous release.
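
The term counts described above can be sketched in Python as follows. This is a minimal illustration, not the pipeline's actual code: the helper names `parse_terms` and `ontology_counts` are hypothetical, the parser handles only simple `tag: value` lines in [Term] stanzas, and counting relations and cross-references over valid terms only is an assumption that this page does not confirm.

```python
from collections import Counter


def parse_terms(obo_text):
    """Return one dict (tag -> list of values) per [Term] stanza."""
    terms = []
    current = None
    for raw in obo_text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # A new stanza starts; only [Term] stanzas are kept.
            current = {} if line == "[Term]" else None
            if current is not None:
                terms.append(current)
        elif current is not None and ": " in line:
            tag, value = line.split(": ", 1)
            # Drop trailing OBO comments such as "! term name".
            value = value.split(" ! ")[0].strip()
            current.setdefault(tag, []).append(value)
    return terms


def ontology_counts(obo_text):
    """Compute a few of the ontology counts listed on this page."""
    terms = parse_terms(obo_text)
    flagged = [t for t in terms if t.get("is_obsolete") == ["true"]]
    valid = [t for t in terms if t.get("is_obsolete") != ["true"]]

    # Merged terms: flagged obsolete AND listed as alt_id of a valid term.
    alt_ids = {alt for t in valid for alt in t.get("alt_id", [])}
    merged = [t for t in flagged if t["id"][0] in alt_ids]

    by_aspect = Counter(t["namespace"][0] for t in valid)
    relation_tags = ("is_a", "intersection_of", "relationship")

    return {
        "valid_terms": len(valid),
        # Obsolete count excludes merges, as described above.
        "obsolete_terms": len(flagged) - len(merged),
        "merged_terms": len(merged),
        "biological_process_terms": by_aspect["biological_process"],
        "molecular_function_terms": by_aspect["molecular_function"],
        "cellular_component_terms": by_aspect["cellular_component"],
        # Assumption: relations and xrefs are counted over valid terms only.
        "cross_references": sum(len(t.get("xref", [])) for t in valid),
        "terms_relations": sum(
            len(t.get(tag, [])) for t in valid for tag in relation_tags
        ),
    }
```

Calling `ontology_counts(open("go.obo").read())` on a local copy of the ontology file would return a dictionary keyed by the field names used on this page.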

annotations

taxa

  • taxa: Number of species with annotations.
  • taxa_filtered: Number of species with at least 1,000 annotations.
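
As a small illustration of the two taxa counts, the sketch below assumes a mapping from taxon identifier to annotation count is already available. The function name `taxa_counts` and the input shape are hypothetical; only the 1,000-annotation threshold comes from this page.

```python
def taxa_counts(annotations_per_taxon, threshold=1000):
    """Count annotated species, and species at or above the threshold."""
    counts = annotations_per_taxon.values()
    return {
        # Any species with at least one annotation.
        "taxa": sum(1 for n in counts if n > 0),
        # "At least 1,000" means the threshold itself is included.
        "taxa_filtered": sum(1 for n in counts if n >= threshold),
    }
```

Note that a species with exactly 1,000 annotations is included in taxa_filtered, while one with 999 is not.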

bioentities

references

  • all
    • total: Total number of distinct annotated references (includes PMIDs, GO_REFs, DOIs, and internal IDs for Model Organism Databases and Reactome). Note that a paper with both a PMID and an internal reference ID is counted twice.
    • by_filtered_taxon
    • by_group
  • pmid: Same as above, filtered for PMIDs.
    • total: Total number of distinct PMIDs.
    • by_filtered_taxon
    • by_group
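
The distinct-reference counts, including the double counting noted above, can be illustrated with a short sketch. The function name `reference_counts` and the assumption that PMIDs are recognizable by a `PMID:` prefix are illustrative choices, not taken from the pipeline.

```python
def reference_counts(annotation_refs):
    """Count distinct reference IDs overall and distinct PMIDs."""
    distinct = set(annotation_refs)
    # Assumption: PubMed references carry the "PMID:" prefix.
    pmids = {ref for ref in distinct if ref.startswith("PMID:")}
    return {
        "all": {"total": len(distinct)},
        "pmid": {"total": len(pmids)},
    }
```

Because the count is made over identifiers rather than papers, a paper cited once by its PMID and once by a database-internal ID contributes two entries to the "all" total but only one to the "pmid" total.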

Direct access to files

snapshot

http://snapshot.geneontology.org/release_stats/go-stats.json

current

http://current.geneontology.org/release_stats/go-stats.json
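
A minimal sketch for retrieving the file and checking which sections it contains is shown below. The helper names are hypothetical, and the section names are taken from the headings on this page; whether they all appear as top-level JSON keys (rather than nested, for example under annotations) is an assumption to verify against the actual file.

```python
import json
from urllib.request import urlopen

GO_STATS_URL = "http://current.geneontology.org/release_stats/go-stats.json"

# Section names as listed on this page; their nesting is not confirmed here.
SECTIONS = (
    "release_date",
    "ontology",
    "annotations",
    "taxa",
    "bioentities",
    "references",
)


def load_go_stats(url=GO_STATS_URL, timeout=30):
    """Download and parse the go-stats JSON file."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def top_level_summary(stats):
    """Report which of the documented sections are top-level keys."""
    return {name: name in stats for name in SECTIONS}
```

For example, `top_level_summary(load_go_stats())` returns a dictionary of booleans, which is a quick way to see how the sections are laid out before reading individual values.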

Review Status

Last reviewed: October 17, 2019