File Description: go-stats

Revision as of 09:12, 24 October 2019

  IN PROGRESS 

Usage

Primary stat file computed.

Input data

Annotation stats are obtained by querying GOlr (http://golr-aux.geneontology.io/). ***IS THIS THE RIGHT LINK???***

Format(s)

json

File description

The go-stats file contains the following information:

release_date

  • release_date: Obtained from release/metadata/release-date.json (or snapshot/metadata/release-date.json).

ontology

  • valid_terms: Total number of valid terms (non-obsolete) in the ontology.
  • obsolete_terms: Total number of terms with obsolete status, i.e., term_ids for which the is_obsolete field is true in the go.obo file. This excludes merges.
  • merged_terms: Total number of merged terms, calculated by counting the term_ids for which the is_obsolete field is true in the go.obo file and that are also listed as alt_ids of a valid term.
  • biological_process_terms: Total number of valid terms for the biological_process aspect.
  • molecular_function_terms: Total number of valid terms for the molecular_function aspect.
  • cellular_component_terms: Total number of valid terms for the cellular_component aspect.
  • meta_statements: Total number of metadata statements across all valid terms: identifiers, alternative identifiers, namespaces, term labels, comments, synonyms, definitions, and subsets.
  • cross_references: Total number of cross_references, from the xref field of the go.obo file.
  • terms_relations: Total number of relations, counted from the is_a, intersection_of and relationship fields of the go.obo file.
  • changes_created_terms: Number of created terms since the previous release.
  • changes_obsolete_terms: Number of terms obsoleted since the previous release.
  • changes_merged_terms: Number of terms merged since the previous release.
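
The counting rules above can be illustrated with a minimal sketch. This is not the pipeline's actual code: the function names `parse_obo_terms` and `ontology_counts` are hypothetical, the parser handles only simple `tag: value` lines inside `[Term]` stanzas, and counting cross-references and relations over valid terms only is an assumption, since the description does not say which terms are included.

```python
def parse_obo_terms(obo_text):
    """Return one dict per [Term] stanza, mapping each tag to a list of values."""
    terms = []
    current = None
    for raw in obo_text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # A new stanza starts; only [Term] stanzas are collected.
            current = {} if line == "[Term]" else None
            if current is not None:
                terms.append(current)
        elif current is not None and ": " in line:
            tag, value = line.split(": ", 1)
            # Drop trailing OBO comments such as "GO:0000002 ! label".
            current.setdefault(tag, []).append(value.split(" ! ")[0].strip())
    return terms


def ontology_counts(terms):
    """Apply the counting rules described above to parsed [Term] stanzas."""
    valid = [t for t in terms if t.get("is_obsolete") != ["true"]]
    flagged = {t["id"][0] for t in terms if t.get("is_obsolete") == ["true"]}
    alt_ids = {a for t in valid for a in t.get("alt_id", [])}
    relation_tags = ("is_a", "intersection_of", "relationship")
    return {
        "valid_terms": len(valid),
        # Obsolete terms exclude merges.
        "obsolete_terms": len(flagged - alt_ids),
        # Merged terms: flagged obsolete and also an alt_id of a valid term.
        "merged_terms": len(flagged & alt_ids),
        # Assumption: xrefs and relations are counted over valid terms only.
        "cross_references": sum(len(t.get("xref", [])) for t in valid),
        "terms_relations": sum(
            len(t.get(tag, [])) for t in valid for tag in relation_tags
        ),
    }
```

Calling `ontology_counts(parse_obo_terms(text))` on the contents of a go.obo file would return a dictionary keyed by the field names listed above.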

annotations

  • total: The total number of annotations.
  • by_aspect
    • P: all annotations in the database for biological_process.
    • F: all annotations in the database for molecular_function.
    • C: all annotations in the database for cellular_component.
  • by_bioentity_type
    • all (same as bioentities > total > by_type > all)
    • cluster (same as bioentities > total > by_type > by_bioentity_type_cluster)
    • by_taxon: Number of annotations for each of the annotated species in the database.
  • by_evidence
    • all
    • cluster
  • by_model_organism: For each species, the number of annotations is shown:
    • by_evidence: number of annotations for each individual evidence code, detailed by aspect.
    • by_evidence_cluster: number of annotations for each evidence cluster, detailed by aspect.
  • by_group: Number of annotations by each group, obtained using the 'assigned_by' field.
  • taxa
    • total: number of species with annotations.
    • filtered: number of species with more than 1,000 annotations.
  • bioentities
    • total: total number of annotated bioentities
    • by_type
      • all (see list of bioentity types)
      • cluster
    • by_filtered_taxon
      • all: number of annotations for each species, by bioentity_type and by aspect.
      • cluster: number of annotations for each species, by bioentity_type_cluster and by aspect.
  • references
    • all
      • total: total number of distinct annotated references (includes PMIDs, GO_REFs, DOIs, and internal IDs for Model Organism Databases and Reactome). Note that a paper with both a PMID and an internal reference ID is counted twice.
      • by_filtered_taxon
      • by_group
    • pmid: same as above, filtered for PMIDs.
      • total: total number of distinct pmids.
      • by_filtered_taxon
      • by_group
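
As an illustration of the taxa counts described above, the following sketch derives `total` and `filtered` from per-species annotation counts. The function name `taxa_summary` and the input shape (a flat mapping of taxon identifier to annotation count) are assumptions made for the example; the actual layout of `by_taxon` in the file may differ.

```python
def taxa_summary(annotations_by_taxon, threshold=1000):
    """Summarise a hypothetical {taxon_id: annotation_count} mapping."""
    counts = annotations_by_taxon.values()
    return {
        # Species with at least one annotation.
        "total": sum(1 for n in counts if n > 0),
        # Species with strictly more than `threshold` annotations.
        "filtered": sum(1 for n in counts if n > threshold),
    }
```

The comparison is strict, matching "more than 1,000 annotations": a species with exactly 1,000 annotations counts toward `total` but not toward `filtered`.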

Direct access to files

snapshot

http://snapshot.geneontology.org/release_stats/go-stats.json

current

http://current.geneontology.org/release_stats/go-stats.json
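
A minimal sketch of fetching either file with the Python standard library is shown here. The helper names `load_go_stats` and `lookup` are hypothetical, and the key paths passed to `lookup` are assumed to follow the file description above; they have not been checked against a live file.

```python
import json
from urllib.request import urlopen

# The two URLs listed above.
BASE_URLS = {
    "snapshot": "http://snapshot.geneontology.org/release_stats/go-stats.json",
    "current": "http://current.geneontology.org/release_stats/go-stats.json",
}


def load_go_stats(channel="current", timeout=30):
    """Download and parse go-stats.json for the given channel."""
    with urlopen(BASE_URLS[channel], timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def lookup(stats, *path, default=None):
    """Walk nested dictionaries, returning `default` if any key is missing."""
    node = stats
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node
```

For example, `lookup(load_go_stats("snapshot"), "annotations", "total")` would return the total annotation count if the file follows the layout described above, and `None` otherwise.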

Review Status

Last reviewed: October 17, 2019