File Description: go-stats-summary
Latest revision as of 17:44, 6 March 2020
File usage
This file provides a summary of the statistics for both the ontology and the annotations. The data presented on the GO website is obtained from this file.
Input data
Ontology stats are obtained from the current go.obo file; data is extracted from the go-ontology-changes.json file.
Annotation stats are extracted from the go-stats.json file.
File format(s)
json
File description
release date
Obtained from release/metadata/release-date.json (or snapshot/metadata/release-date.json).
ontology
- valid_terms: Total number of valid terms (non-obsolete) in the ontology.
- obsolete_terms: Total number of terms with obsolete status (i.e., term_ids for which the is_obsolete field is true in the go.obo file); this excludes merges.
- merged_terms: Total number of merged terms (calculated by counting the term_ids for which the is_obsolete field is true in the go.obo file and that are also listed as alt_ids of a valid term).
- biological_process_terms: Total number of valid terms for the biological_process aspect.
- molecular_function_terms: Total number of valid terms for the molecular_function aspect.
- cellular_component_terms: Total number of valid terms for the cellular_component aspect.
- meta_statements: Total number of identifiers, alternative identifiers, namespace, term label, comments, synonyms, definitions, subsets, for each valid term.
- cross_references: Total number of cross-references, from the xref field of the go.obo file.
- terms_relations: Total number of relations, counted from the is_a, intersection_of and relationship fields of the go.obo file.
- changes_created_terms: Number of terms created since the previous release.
- changes_valid_terms: Change in the number of valid terms compared to the previous release.
- changes_obsolete_terms: Number of terms obsoleted since the previous release.
- changes_merged_terms: Number of terms merged since the previous release.
- changes_biological_process_terms: Changes in the number of valid terms for the biological_process aspect.
- changes_molecular_function_terms: Changes in the number of valid terms for the molecular_function aspect.
- changes_cellular_component_terms: Changes in the number of valid terms for the cellular_component aspect.
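The ontology counts described above can be illustrated with a minimal Python sketch. It is not the release pipeline's actual code: the tiny OBO sample, the function names `parse_terms` and `ontology_stats`, and the choice to count cross-references and relations over valid terms only are all assumptions made for illustration. The merged/obsolete split follows the wording of this page (a term flagged `is_obsolete` that is also an `alt_id` of a valid term counts as merged).

```python
from collections import Counter

# Hypothetical four-term OBO snippet, made up for this example.
SAMPLE_OBO = """\
[Term]
id: GO:0000001
name: valid term one
namespace: biological_process
alt_id: GO:0000009
is_a: GO:0000002 ! parent

[Term]
id: GO:0000002
name: valid term two
namespace: molecular_function
xref: EC:1.1.1.1
relationship: part_of GO:0000001

[Term]
id: GO:0000003
name: obsolete term
namespace: cellular_component
is_obsolete: true

[Term]
id: GO:0000009
name: merged term
namespace: biological_process
is_obsolete: true
"""


def parse_terms(obo_text):
    """Return one {tag: [values]} dict per [Term] stanza."""
    terms = []
    current = None
    for raw in obo_text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # A new stanza starts; only [Term] stanzas are collected.
            current = {} if line == "[Term]" else None
            if current is not None:
                terms.append(current)
        elif current is not None and ": " in line:
            tag, value = line.split(": ", 1)
            value = value.split(" ! ")[0].strip()  # drop trailing OBO comment
            current.setdefault(tag, []).append(value)
    return terms


def ontology_stats(obo_text):
    """Compute a few of the ontology fields as described on this page."""
    terms = parse_terms(obo_text)
    flagged = [t for t in terms if t.get("is_obsolete") == ["true"]]
    valid = [t for t in terms if t.get("is_obsolete") != ["true"]]
    alt_ids = {alt for t in valid for alt in t.get("alt_id", [])}
    merged = [t for t in flagged if t["id"][0] in alt_ids]
    by_aspect = Counter(t["namespace"][0] for t in valid)
    relation_tags = ("is_a", "intersection_of", "relationship")
    return {
        "valid_terms": len(valid),
        "obsolete_terms": len(flagged) - len(merged),  # excludes merges
        "merged_terms": len(merged),
        "biological_process_terms": by_aspect["biological_process"],
        "molecular_function_terms": by_aspect["molecular_function"],
        "cellular_component_terms": by_aspect["cellular_component"],
        # Assumption: counted over valid terms only.
        "cross_references": sum(len(t.get("xref", [])) for t in valid),
        "terms_relations": sum(
            len(t.get(tag, [])) for t in valid for tag in relation_tags
        ),
    }


stats = ontology_stats(SAMPLE_OBO)
```

On the sample, two terms are valid, one is obsolete and one is merged, which shows why `obsolete_terms` and `merged_terms` never overlap.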
annotations
- total: The total number of annotations.
- total_no_pb: Total number of annotations, excluding direct annotations to GO:0005515 protein binding.
- total_pb: Total number of direct annotations to GO:0005515 protein binding.
- by_aspect: Number of annotations for each aspect (P, F, C, B).
- by_bioentity_type_cluster: Number of annotations for each bioentity type cluster.
- by_bioentity_type_cluster_no_pb: Number of annotations for each bioentity type cluster, excluding direct annotations to GO:0005515 protein binding.
- by_evidence_cluster: Number of annotations for each evidence cluster (PHYLO, IEA, OTHER, EXP, ND, HTP).
- by_evidence_cluster_no_pb: Number of annotations for each evidence cluster, excluding direct annotations to GO:0005515 protein binding.
- by_model_organism: Human and 10 model organisms. For each species, the number of annotations is shown:
  - by evidence cluster, and detailed by aspect (P, F, C, B).
- by_qualifier: Changes in the total number of annotations using qualifiers: contributes_to, colocalizes_with, NOT.
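As a rough illustration of how the `_pb` and `_no_pb` fields relate to the totals, the sketch below counts a handful of made-up annotations. The tuple layout `(bioentity, GO term, evidence cluster)`, the sample identifiers and the function name `annotation_totals` are assumptions for this example only; they are not taken from the go-stats pipeline.

```python
from collections import Counter

PROTEIN_BINDING = "GO:0005515"

# Hypothetical direct annotations: (bioentity, GO term, evidence cluster).
SAMPLE_ANNOTATIONS = [
    ("UniProtKB:P1", "GO:0005515", "EXP"),
    ("UniProtKB:P1", "GO:0006915", "EXP"),
    ("UniProtKB:P2", "GO:0005515", "IEA"),
    ("UniProtKB:P3", "GO:0003677", "PHYLO"),
]


def annotation_totals(annotations):
    """Count annotations with and without direct protein binding annotations."""
    rows = list(annotations)
    no_pb = [row for row in rows if row[1] != PROTEIN_BINDING]
    return {
        "total": len(rows),
        "total_pb": len(rows) - len(no_pb),
        "total_no_pb": len(no_pb),
        "by_evidence_cluster": dict(Counter(row[2] for row in rows)),
        "by_evidence_cluster_no_pb": dict(Counter(row[2] for row in no_pb)),
    }


totals = annotation_totals(SAMPLE_ANNOTATIONS)
```

By construction, `total` equals `total_pb` plus `total_no_pb`, and an evidence cluster disappears from the `_no_pb` breakdown when all of its annotations are to protein binding.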
taxa
- total: Number of annotated species.
- filtered: Number of species with > 1,000 annotations.
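The two taxa fields can be sketched as a count over per-species annotation totals. This is an illustrative assumption about how such a count could be made, not the pipeline's code; the function name `taxa_stats` and the sample taxon list are hypothetical. The strict `>` comparison follows the "> 1,000 annotations" wording above.

```python
from collections import Counter


def taxa_stats(annotation_taxa, threshold=1000):
    """annotation_taxa: one taxon identifier per annotation."""
    per_taxon = Counter(annotation_taxa)
    return {
        "total": len(per_taxon),
        # Strictly more than `threshold` annotations.
        "filtered": sum(1 for count in per_taxon.values() if count > threshold),
    }


# Hypothetical data: three species with 1500, 1000 and 3 annotations.
sample_taxa = (
    ["NCBITaxon:9606"] * 1500
    + ["NCBITaxon:10090"] * 1000
    + ["NCBITaxon:7227"] * 3
)
taxa = taxa_stats(sample_taxa)
```

In this sample a species with exactly 1,000 annotations is not counted in `filtered`.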
bioentities
- total: Total number of annotated bioentities.
- total_no_pb: Total number of annotated bioentities, excluding direct annotations to GO:0005515 protein binding.
- by_type_cluster: Number of annotated bioentities grouped by clusters.
- by_type_cluster_no_pb: Number of annotated bioentities grouped by clusters, excluding direct annotations to GO:0005515 protein binding.
- by_model_organism: Number of annotated bioentities for human and each of the 10 model organisms. For each species, the number of annotated bioentities is grouped by bioentity clusters, and detailed by aspect (A, P, F, C).
references
all:
- total: Total number of distinct annotated references (includes PMIDs, GO_REFs, DOIs, internal IDs for Model Organism Databases and Reactome). Note that for papers with both a PMID and an internal reference ID, the paper is counted twice.
- total_no_pb: Total number of annotated references (includes PMIDs, GO_REFs, DOIs, internal IDs for Model Organism Databases and Reactome), excluding papers only directly annotated to "GO:0005515".
- by_model_organism: Total number of annotated references (includes PMIDs, GO_REFs, DOIs, internal IDs for Model Organism Databases and Reactome), for human and each of the 10 model organisms.
pmid:
- total: Total number of annotated PMIDs.
- total_no_pb: Total number of annotated PMIDs, excluding papers only directly annotated to "GO:0005515".
- by_model_organism: Total number of annotated PMIDs, for human and each of the 10 model organisms.
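The "excluding papers only directly annotated to GO:0005515" rule is easy to misread, so here is a small sketch of one way to apply it: a reference is dropped from the `_no_pb` count only if every annotation citing it is to protein binding. The `(reference, GO term)` pair layout, the sample identifiers and the function name `reference_stats` are assumptions for illustration.

```python
from collections import defaultdict

PROTEIN_BINDING = "GO:0005515"

# Hypothetical (reference, GO term) pairs, one per direct annotation.
SAMPLE_REFERENCES = [
    ("PMID:1", "GO:0005515"),
    ("PMID:1", "GO:0006915"),
    ("PMID:2", "GO:0005515"),
    ("GO_REF:0000000", "GO:0003677"),
    ("MGI:MGI:123", "GO:0005515"),
]


def reference_stats(annotations):
    """Count distinct references, overall and for PMIDs only."""
    terms_by_ref = defaultdict(set)
    for ref, term in annotations:
        terms_by_ref[ref].add(term)
    all_refs = set(terms_by_ref)
    # Keep a reference unless protein binding is its only annotated term.
    kept = {ref for ref, terms in terms_by_ref.items() if terms != {PROTEIN_BINDING}}
    pmids = {ref for ref in all_refs if ref.startswith("PMID:")}
    return {
        "all": {"total": len(all_refs), "total_no_pb": len(kept)},
        "pmid": {"total": len(pmids), "total_no_pb": len(pmids & kept)},
    }


refs = reference_stats(SAMPLE_REFERENCES)
```

Here `PMID:1` survives the filter because it also supports a non-protein-binding annotation, while `PMID:2` and the internal ID do not.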
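Direct access to files
snapshot
- http://snapshot.geneontology.org/release_stats/go-stats-summary.json
current
- http://current.geneontology.org/release_stats/go-stats-summary.json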
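A minimal sketch of reading the file from Python follows. The URL is the one published for the current release; the top-level key names (`ontology`, `annotations`, `taxa`) are an assumption that mirrors the section names on this page and should be checked against the actual file, and the numbers in the sample are invented.

```python
import json
from urllib.request import urlopen

SUMMARY_URL = "http://current.geneontology.org/release_stats/go-stats-summary.json"


def fetch_summary(url=SUMMARY_URL):
    """Download and parse the summary file (requires network access)."""
    with urlopen(url) as response:
        return json.load(response)


def headline_numbers(summary):
    """Pick a few headline values; key names are assumed, not confirmed."""
    return {
        "valid_terms": summary["ontology"]["valid_terms"],
        "annotations": summary["annotations"]["total"],
        "annotated_species": summary["taxa"]["total"],
    }


# Offline example with invented numbers, shaped like the assumed layout.
sample_summary = json.loads(
    '{"ontology": {"valid_terms": 10},'
    ' "annotations": {"total": 25},'
    ' "taxa": {"total": 3}}'
)
headline = headline_numbers(sample_summary)
```

With network access, `headline_numbers(fetch_summary())` would apply the same lookup to the live file, provided the assumed keys match.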
Review Status
Last reviewed: March 5, 2020