File Description: go-stats
Latest revision as of 19:56, 5 March 2020
Usage
Primary stats file computed.
Input data
Annotation stats are obtained by querying the GOlr (GO Solr instance).
Format(s)
JSON
File description
The go-stats file contains the following information:
release_date
- release_date: Obtained from release/metadata/release-date.json or snapshot/metadata/release-date.json.
ontology
- valid_terms: Total number of valid terms (non-obsolete) in the ontology.
- obsolete_terms: Total number of terms with obsolete status (i.e., term_ids for which the is_obsolete field is true in the go.obo file). This count excludes merged terms.
- merged_terms: Total number of merged terms, calculated by counting the term_ids for which the is_obsolete field is true in the go.obo file and that are also listed as alt_ids of a valid term.
- biological_process_terms: Total number of valid terms for the biological_process aspect.
- molecular_function_terms: Total number of valid terms for the molecular_function aspect.
- cellular_component_terms: Total number of valid terms for the cellular_component aspect.
- meta_statements: Total number of identifiers, alternative identifiers, namespace, term label, comments, synonyms, definitions, subsets, for each valid term.
- cross_references: Total number of cross-references, taken from the xref field of the go.obo file.
- terms_relations: Total number of relations, counted from the is_a, intersection_of and relationship fields of the go.obo file.
- changes_created_terms: Number of terms created since the previous release.
- changes_valid_terms: Change in the number of valid terms since the previous release.
- changes_obsolete_terms: Number of terms obsoleted since the previous release.
- changes_merged_terms: Number of terms merged since the previous release.
- changes_biological_process_terms: Change in the number of biological_process (BP) terms since the previous release.
- changes_molecular_function_terms: Change in the number of molecular_function (MF) terms since the previous release.
- changes_cellular_component_terms: Change in the number of cellular_component (CC) terms since the previous release.
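The valid, obsolete and merged counts described above can be sketched as follows. This is a minimal illustration that follows the definitions as worded in this section, not the release pipeline's actual code. The function names, the tiny inline OBO sample and its term IDs are all hypothetical.

```python
# Hypothetical sample in OBO flat-file style; the IDs and names are made up.
SAMPLE_OBO = """\
format-version: 1.2

[Term]
id: GO:0000001
name: example process
namespace: biological_process
alt_id: GO:0000003
is_a: GO:0000002 ! example parent

[Term]
id: GO:0000002
name: example parent
namespace: biological_process

[Term]
id: GO:0000003
name: merged example
is_obsolete: true

[Term]
id: GO:0000004
name: obsolete example
is_obsolete: true

[Typedef]
id: part_of
"""


def parse_obo_terms(text):
    """Return one dict per [Term] stanza, mapping each tag to a list of values."""
    terms, current = [], None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # A new stanza starts; keep only [Term] stanzas.
            current = {} if line == "[Term]" else None
            if current is not None:
                terms.append(current)
        elif current is not None and ": " in line:
            tag, value = line.split(": ", 1)
            current.setdefault(tag, []).append(value)
    return terms


def is_obsolete(term):
    return term.get("is_obsolete", ["false"])[0] == "true"


def ontology_counts(terms):
    """Count valid, obsolete and merged terms as defined in the text above."""
    valid = [t for t in terms if not is_obsolete(t)]
    alt_ids = {alt for t in valid for alt in t.get("alt_id", [])}
    obsolete_ids = {t["id"][0] for t in terms if is_obsolete(t)}
    merged_ids = obsolete_ids & alt_ids  # obsolete AND an alt_id of a valid term
    return {
        "valid_terms": len(valid),
        "obsolete_terms": len(obsolete_ids - merged_ids),  # merges excluded
        "merged_terms": len(merged_ids),
    }


counts = ontology_counts(parse_obo_terms(SAMPLE_OBO))
```

In the sample, GO:0000003 is both obsolete and an alt_id of a valid term, so it is counted as merged rather than obsolete.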
annotations
- total: The total number of annotations.
- by_aspect: Number of annotations for each aspect: P (biological process), F (molecular function), C (cellular component).
- by_bioentity_type:
  - all: Number of annotations for each bioentity type.
  - cluster: Number of annotations for each bioentity type cluster.
- by_qualifier: Number of annotations for each qualifier (contributes_to, colocalizes_with, NOT).
- by_taxon: Number of annotations for each of the annotated species in the database.
- by_evidence:
  - all: Number of annotations for each individual evidence code.
  - by_evidence_cluster: Number of annotations for each evidence cluster.
- by_model_organism: For each model organism, the number of annotations is shown:
  - by_evidence: Number of annotations for each individual evidence code, detailed by aspect.
  - by_evidence_cluster: Number of annotations for each evidence cluster (PHYLO, IEA, OTHER, EXP, ND, HTP), detailed by aspect.
  - by_qualifier: Number of annotations for each qualifier (contributes_to, colocalizes_with, NOT).
- by_group: Number of annotations for each contributing group, obtained using the assigned_by field of each input file.
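As a rough illustration of how the total, by_aspect, by_taxon and by_group counts relate to individual annotations, the sketch below tallies a few made-up records. The record layout, the group names and the function name are assumptions for this example. The real statistics are obtained by querying GOlr, not by code like this.

```python
from collections import Counter


def annotation_stats(annotations):
    """Tally annotation records (dicts with aspect, taxon, assigned_by keys)."""
    return {
        "total": len(annotations),
        "by_aspect": dict(Counter(a["aspect"] for a in annotations)),
        "by_taxon": dict(Counter(a["taxon"] for a in annotations)),
        # by_group is keyed on the assigned_by field, as described above.
        "by_group": dict(Counter(a["assigned_by"] for a in annotations)),
    }


# Hypothetical records; "GroupA" and "GroupB" are placeholder group names.
sample_annotations = [
    {"aspect": "P", "taxon": "NCBITaxon:9606", "assigned_by": "GroupA"},
    {"aspect": "F", "taxon": "NCBITaxon:9606", "assigned_by": "GroupB"},
    {"aspect": "P", "taxon": "NCBITaxon:10090", "assigned_by": "GroupA"},
]

annotation_summary = annotation_stats(sample_annotations)
```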
taxa
- total: Number of species with annotations.
- filtered: Number of species with at least 1,000 annotations.
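The two taxa counts can be sketched from a per-species annotation tally. The function name, the parameter name and the sample counts are hypothetical. Only the threshold of 1,000 annotations comes from the description above.

```python
def taxa_stats(annotations_by_taxon, min_annotations=1000):
    """Count all annotated species and those at or above the threshold."""
    return {
        "total": len(annotations_by_taxon),
        "filtered": sum(
            1 for count in annotations_by_taxon.values() if count >= min_annotations
        ),
    }


# Made-up counts: one species far above, one exactly at, one just below 1,000.
sample_counts = {
    "NCBITaxon:9606": 250000,
    "NCBITaxon:10090": 1000,
    "NCBITaxon:7227": 999,
}

taxa_summary = taxa_stats(sample_counts)
```

Because the threshold is "at least 1,000", a species with exactly 1,000 annotations is included in the filtered count.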
bioentities
- total: Total number of annotated bioentities.
- by_type:
  - all: Number of annotated bioentities by bioentity type.
  - by_type_cluster: Number of annotated bioentities grouped by clusters.
- by_filtered_taxon:
  - all: Number of annotations for each species, by bioentity type, detailed by aspect.
  - by_type_cluster: Number of annotations for each species, by bioentity_type_cluster, detailed by aspect.
references
- all:
  - total: Total number of distinct annotated references (includes PMIDs, GO_REFs, DOIs, and internal IDs for Model Organism Databases and Reactome). Note that a paper with both a PMID and an internal reference ID is counted twice.
  - by_filtered_taxon: Total number of annotated references by species.
  - by_group: Total number of annotated references for each contributing group, obtained using the assigned_by field.
- pmids:
  - total: Total number of annotated PMIDs.
  - by_filtered_taxon: Total number of annotated PMIDs by species.
  - by_group: Total number of annotated PMIDs for each contributing group, obtained using the assigned_by field.
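The double counting mentioned for the all total can be seen in a small sketch. It assumes each annotation carries a list of reference IDs and that PMIDs are recognizable by a "PMID:" prefix. The function name and the sample IDs are hypothetical.

```python
def reference_stats(references_per_annotation):
    """Count distinct reference IDs overall and distinct PMIDs."""
    all_refs = {ref for refs in references_per_annotation for ref in refs}
    pmids = {ref for ref in all_refs if ref.startswith("PMID:")}
    return {
        "all": {"total": len(all_refs)},
        "pmids": {"total": len(pmids)},
    }


# Made-up IDs: the first annotation cites one paper under two identifiers.
sample_references = [
    ["PMID:0000001", "MGI:MGI:0000001"],
    ["PMID:0000001"],
    ["GO_REF:0000000"],
]

reference_summary = reference_stats(sample_references)
```

The paper cited as both PMID:0000001 and MGI:MGI:0000001 contributes two entries to the all total but only one to the pmids total.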
Direct access to files
snapshot
http://snapshot.geneontology.org/release_stats/go-stats.json
current
http://current.geneontology.org/release_stats/go-stats.json
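A minimal sketch of fetching the file and reading a few headline numbers, using only the Python standard library. It assumes the top-level JSON keys match the section names in the file description above (release_date, ontology, annotations, taxa, bioentities). The function names are hypothetical, and missing keys yield None rather than an error.

```python
import json
from urllib.request import urlopen

CURRENT_URL = "http://current.geneontology.org/release_stats/go-stats.json"


def load_stats(url=CURRENT_URL, timeout=30):
    """Download and parse the go-stats JSON file."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def headline_numbers(stats):
    """Pick a few totals from a parsed go-stats dict (assumed key layout)."""
    return {
        "release_date": stats.get("release_date"),
        "valid_terms": stats.get("ontology", {}).get("valid_terms"),
        "annotations": stats.get("annotations", {}).get("total"),
        "taxa": stats.get("taxa", {}).get("total"),
        "bioentities": stats.get("bioentities", {}).get("total"),
    }


# Offline example with made-up values; call load_stats() to use the real file.
example_stats = {
    "release_date": "2020-01-01",
    "ontology": {"valid_terms": 10},
    "annotations": {"total": 25},
}

headlines = headline_numbers(example_stats)
```

To read the live file instead, pass the result of load_stats() to headline_numbers(); the snapshot URL can be supplied as the url argument.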
Review Status
Last reviewed: March 5, 2020