GO slim overhaul (completed 2009)
REDONE 2023
See https://github.com/geneontology/go-ontology/issues/22294
We plan to completely revise the generic GO slim, and in the process come up with some guidelines for developing slims. This project began in Sept 2009.
Personnel
Jane, Val
Notes
- Ideally want three "generic" slims:
- truly generic slim for all species (probably very high level indeed)
- slim for multi-cellular organisms
- slim for single-celled organisms
- could also do eukaryotic vs. prokaryotic
- Other considerations
- Email with specific queries about terms included in and omitted from the current generic slim (as of June 2009 page creation; actual last update much earlier)
- Mailing list thread with Val's criteria for term selection. Amelia's comment: Most of this might be automatable; the hardest part of automation will be identifying "biologically relevant terms". It may be necessary to look at "external sources of terms".
- Can we avoid "other X" terms? They're confusing and hard to handle.
- Try to make slims is_a complete
Tangentially related: Does anyone know of a web-based slimming tool which shows the number of gene products which are annotated, but not to any term in your slim, and the number which are not annotated to any GO term (i.e. root node annotations)? (question from Val)
Linda suggested:
- slim for metagenomics
- Collecting purpose made slims
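The two counts in Val's question above (gene products annotated but not to any slim term, and gene products with only root annotations) can be sketched in a few lines. This is only an illustration and not an existing tool: the gene product names, the `slim_coverage` function and the small is_a graph are hypothetical, and the edges are a hand-picked toy subset rather than the real ontology.

```python
# Toy is_a graph: child -> set of parents. The IDs appear in the term list
# on this page, but the edges are simplified assumptions for illustration.
PARENTS = {
    "GO:0008150": set(),            # biological_process (root)
    "GO:0006810": {"GO:0008150"},   # transport
    "GO:0055085": {"GO:0006810"},   # transmembrane transport
    "GO:0040007": {"GO:0008150"},   # growth
    "GO:0043473": {"GO:0008150"},   # pigmentation
}
ROOT = "GO:0008150"


def ancestors(term, parents):
    """Return the term itself plus every term reachable through its parents."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def slim_coverage(annotations, slim, parents, root=ROOT):
    """Sort gene products into covered / not_in_slim / root_only buckets."""
    buckets = {"covered": set(), "not_in_slim": set(), "root_only": set()}
    for gene_product, terms in annotations.items():
        # No annotation at all, or only a root annotation, counts as root_only.
        if set(terms) <= {root}:
            buckets["root_only"].add(gene_product)
            continue
        reachable = set()
        for term in terms:
            reachable |= ancestors(term, parents)
        key = "covered" if reachable & slim else "not_in_slim"
        buckets[key].add(gene_product)
    return buckets


# Hypothetical gene products and annotations.
annotations = {
    "geneA": {"GO:0055085"},   # maps up to transport, which is in the slim
    "geneB": {"GO:0043473"},   # annotated, but no slim term covers it
    "geneC": {"GO:0008150"},   # root annotation only
}
slim = {"GO:0006810", "GO:0040007"}
report = slim_coverage(annotations, slim, PARENTS)
for bucket, members in report.items():
    print(bucket, len(members))
```

The size of the "not_in_slim" bucket is the number to watch when choosing terms: it shrinks as terms are added to the slim.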
Meetings
Terms:
- cellular component assembly ; GO:0022607
- anatomical structure formation involved in morphogenesis ; GO:0048646
- cell adhesion ; GO:0007155
- look at multicellular organism adhesion later
- look at "biological regulation" later
- look at cell killing later (may be picked up if we include multicellular organismal process although this is a bit broad)
- protein complex assembly ; GO:0006461
- ribonucleoprotein complex assembly ; GO:0022618
- cell wall organization or biogenesis ; GO:0071554
- extracellular matrix organization ; GO:0030198
- membrane organization ; GO:0061024
- chromosome organization ; GO:0051276
- cytoskeleton organization ; GO:0007010
- cell division ; GO:0051301
- growth ; GO:0040007
- cell proliferation ; GO:0008283
- cell wall organization or biogenesis ; GO:0071554
- later - cell junction org, cell projection org
- cell differentiation ; GO:0030154
- cell morphogenesis ; GO:0000902
- cell motility ; GO:0048870
- homeostatic process ; GO:0042592
- vesicle-mediated transport ; GO:0016192
- nucleocytoplasmic transport ; GO:0006913
- transport ; GO:0006810
- transmembrane transport ; GO:0055085
- macromolecular complex assembly ; GO:0065003
- plasma membrane organization ; GO:0007009 ?
- chromosome organization ; GO:0051276
- cytoskeleton organization ; GO:0007010
- mitochondrion organization ; GO:0007005
- extracellular matrix organization ; GO:0030198
- cell junction organization ; GO:0034330
- pigmentation ; GO:0043473
- reproduction ; GO:0000003
- transposition ; GO:0032196
- immune system process ; GO:0002376
- locomotion ; GO:0040011
- biosynthetic process ; GO:0009058
- catabolic process ; GO:0009056
- DNA metabolic process ; GO:0006259
- transcription ; GO:0006350
- generation of precursor metabolites and energy ; GO:0006091
- cellular amino acid and derivative metabolic process ; GO:0006519
- cellular nucleobase, nucleoside and nucleotide metabolic process ; GO:0034655
- cofactor metabolic process ; GO:0051186
- photosynthesis ; GO:0015979
- small molecule metabolic process ; GO:0044281
- sulfur metabolic process ; GO:0006790
- secondary metabolic process ; GO:0019748
- vitamin metabolic process ; GO:0006766 * (need this?)
- carbohydrate metabolic process ; GO:0005975
- lipid metabolic process ; GO:0006629
- translation ; GO:0006412
- protein folding ; GO:0006457
- protein modification process ; GO:0006464
- protein maturation ; GO:0051604
- symbiosis, encompassing mutualism through parasitism ; GO:0044403
- developmental maturation ; GO:0021700 (might be able to remove this one)
- anatomical structure development ; GO:0048856
- embryonic development ; GO:0009790
- circulatory system process ; GO:0003013
- neurological system process ; GO:0050877
- renal system process ; GO:0003014
- muscle system process ; GO:0003012
- (note: we'll probably need some other system processes here)
- response to stress ; GO:0006950
- signaling ; GO:0023052
- nitrogen cycle metabolic process ; GO:0071941
I added the following terms to get complete coverage for pombe
- GO:0006605 protein targeting
- GO:0006399 tRNA metabolic process
- GO:0006397 mRNA processing
- GO:0034641 cellular nitrogen compound metabolic process
- GO:0042254 ribosome biogenesis
- GO:0007059 chromosome segregation
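Several IDs in the list above appear more than once, and the two blocks put the ID in different positions. A minimal sketch for pulling the IDs out of pasted lines and flagging repeats might look like this; the helper names `extract_go_ids` and `duplicated` are hypothetical, and the sample lines are copied from this page.

```python
import re
from collections import Counter

# GO identifiers are "GO:" followed by seven digits.
GO_ID = re.compile(r"GO:\d{7}")


def extract_go_ids(lines):
    """Collect every GO ID found in the lines, in order, keeping repeats."""
    ids = []
    for line in lines:
        ids.extend(GO_ID.findall(line))
    return ids


def duplicated(ids):
    """Return the IDs that occur more than once, sorted."""
    counts = Counter(ids)
    return sorted(go_id for go_id, n in counts.items() if n > 1)


sample = [
    "- cell wall organization or biogenesis ; GO:0071554",
    "- extracellular matrix organization ; GO:0030198",
    "- cell wall organization or biogenesis ; GO:0071554",
    "- GO:0006605 protein targeting",
    '- look at "biological regulation" later',
]
ids = extract_go_ids(sample)
print(ids)
print(duplicated(ids))
```

Lines without an ID, such as the "look at ... later" notes, are simply skipped.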
Meeting Dates:
- 21 August
- 8 Sept
- 7 Oct
- 25 Nov
- 21 Jan
- 2 Feb
- 11 May
TODO:
Val to translate list of slim terms to ids to check bucket terms
(When we choose terms we should also consider IEA annotations, as this will give a better idea of how many gene products are likely to be annotated to a term when "annotation complete".)
- Need to re-check how annotations are allocated to this slim when regulates is made non-transitive in map2slim (new in GO moose).
QuickGO query:
!evidence=IEA & !evidence=ND & ancestor=GO:0008150 & !ancestor=GO:0022607, GO:0048646, GO:0007155, GO:0006461, GO:0022618, GO:0071554, GO:0030198, GO:0061024, GO:0051276, GO:0007010, GO:0007067, GO:0007568, GO:0007165, GO:0007267, GO:0007049, GO:0008219, GO:0051301, GO:0040007, GO:0008283, GO:0071554, GO:0030154, GO:0000902, GO:0048870, GO:0042592, GO:0016192, GO:0006913, GO:0055085, GO:0065003, GO:0007009, GO:0051276, GO:0007010, GO:0007005, GO:0030198, GO:0034330, GO:0043473, GO:0000003, GO:0032196, GO:0002376, GO:0040011, GO:0009058, GO:0009056, GO:0006259, GO:0061018, GO:0006091, GO:0006519, GO:0034655, GO:0051186, GO:0015979, GO:0044281, GO:0006790, GO:0019748, GO:0006766, GO:0005975, GO:0006629, GO:0006412, GO:0006457, GO:0006464, GO:0051604, GO:0044403, GO:0021700, GO:0048856, GO:0009790, GO:0003013, GO:0050877, GO:0006950, GO:0006605, GO:0006399, GO:0006397, GO:0034641, GO:0042254, GO:0007059, GO:0065007, GO:0023052, GO:0050896, GO:0006810 (put individual system process in)
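A query string like the one above is easy to get wrong by hand (it repeats several IDs). As a rough sketch, it could be assembled from the slim ID list as below. The function name `build_uncovered_query` is hypothetical, and the syntax simply mirrors the query recorded on this page; it is an assumption that any given QuickGO version accepts it.

```python
def build_uncovered_query(slim_ids, root="GO:0008150", excluded_evidence=("IEA", "ND")):
    """Build a filter for annotations under root that fall under no slim term."""
    # dict.fromkeys removes repeated IDs while keeping first-seen order.
    unique_ids = list(dict.fromkeys(slim_ids))
    parts = [f"!evidence={code}" for code in excluded_evidence]
    parts.append(f"ancestor={root}")
    parts.append("!ancestor=" + ", ".join(unique_ids))
    return " & ".join(parts)


query = build_uncovered_query(["GO:0022607", "GO:0048646", "GO:0022607"])
print(query)
```

Passing an empty tuple for `excluded_evidence` keeps IEA annotations in, which matches the TODO note about considering IEA when choosing terms.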
Issues to consider
- How will this slim be maintained to ensure it keeps in line with ontology rearrangements? Yearly revision? More frequent?
- What other slims should we maintain in the GO file: cellular vs. multicellular vs. multi-organism? Euk vs. prok?
- Mapping over different relations for map2slim - regulates?
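To make the regulates question concrete, here is a small sketch of how the set of relations followed during mapping changes the result. It is not map2slim itself: the `map_to_slim` function is hypothetical, the edges are a simplified toy graph, and it returns every slim ancestor rather than only the most specific ones.

```python
# Toy graph: child -> list of (relation, parent). Edges are simplified
# assumptions for illustration, not the real ontology structure.
EDGES = {
    "GO:0051726": [("is_a", "GO:0065007"), ("regulates", "GO:0007049")],  # regulation of cell cycle
    "GO:0007049": [("is_a", "GO:0008150")],   # cell cycle
    "GO:0065007": [("is_a", "GO:0008150")],   # biological regulation
    "GO:0008150": [],                         # biological_process
}


def map_to_slim(term, slim, edges, relations):
    """Return the slim terms reachable from term over the allowed relations."""
    seen, hits = set(), set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        if current in slim:
            hits.add(current)
        for relation, parent in edges.get(current, ()):
            if relation in relations:
                stack.append(parent)
    return hits


slim = {"GO:0007049", "GO:0065007"}
with_regulates = map_to_slim("GO:0051726", slim, EDGES, {"is_a", "part_of", "regulates"})
without_regulates = map_to_slim("GO:0051726", slim, EDGES, {"is_a", "part_of"})
print(sorted(with_regulates))
print(sorted(without_regulates))
```

In this toy case, dropping regulates means an annotation to "regulation of cell cycle" lands only in "biological regulation" and no longer in "cell cycle", which is the kind of shift the TODO about re-checking allocations refers to.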