GO slim overhaul (completed 2009)

From GO Wiki

We plan to completely revise the generic GO slim, and in the process come up with some guidelines for developing slims. This project began in Sept 2009.

Personnel

Jane, Val

Notes

  • Ideally want three "generic" slims:
    • truly generic slim for all species (probably very high level indeed)
    • slim for multi-cellular organisms
    • slim for single-celled organisms
  • could also do eukaryotic vs. prokaryotic
  • Other considerations
    • Email with specific queries about terms included in and omitted from the current generic slim (current as of page creation in June 2009; the actual last update was much earlier)
    • Mailing list thread with Val's criteria for term selection. Amelia's comment: most of this might be automatable; the hardest part of automation will be identifying "biologically relevant terms". It may be necessary to look at "external sources of terms".
    • Can we avoid "other X" terms? They're confusing and hard to handle.
    • Try to make slims is_a complete

Tangentially related: Does anyone know of a web-based slimming tool that shows the number of gene products which are annotated, but not to any term in your slim, and the number which are not annotated to any GO term (i.e. root node annotations only)? (question from Val)
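
As a rough illustration of the counts asked about above, the sketch below sorts each gene product into one of three groups: covered by the slim, annotated but not covered by the slim, or annotated only to a root term. It is not an existing tool. The function names, the `parents` map and the toy data are assumptions made for illustration; a real version would read the ontology and an annotation file instead.

```python
from collections import defaultdict

# Root terms of the three ontologies (BP, MF, CC).
ROOTS = {"GO:0008150", "GO:0003674", "GO:0005575"}


def ancestors(term, parents):
    """Return the term itself plus every ancestor reachable via `parents`."""
    seen, stack = set(), [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def slim_coverage(annotations, parents, slim):
    """Count gene products per category.

    annotations: iterable of (gene_product, go_id) pairs
    parents:     dict mapping a term to its direct parent terms
    slim:        set of GO ids making up the slim
    """
    terms_by_gp = defaultdict(set)
    for gene_product, go_id in annotations:
        terms_by_gp[gene_product].add(go_id)

    counts = {"in_slim": 0, "annotated_not_in_slim": 0, "root_only": 0}
    for terms in terms_by_gp.values():
        if terms <= ROOTS:
            counts["root_only"] += 1
        elif any(ancestors(t, parents) & slim for t in terms):
            counts["in_slim"] += 1
        else:
            counts["annotated_not_in_slim"] += 1
    return counts


# Toy, simplified graph and annotations (hypothetical gene products gp1..gp3).
toy_parents = {
    "GO:0006412": ["GO:0009058"],  # translation -> biosynthetic process
    "GO:0009058": ["GO:0008150"],
    "GO:0007610": ["GO:0008150"],  # a term with no slim ancestor
}
toy_annotations = [
    ("gp1", "GO:0006412"),
    ("gp2", "GO:0007610"),
    ("gp3", "GO:0008150"),
]
print(slim_coverage(toy_annotations, toy_parents, {"GO:0009058"}))
```

The "annotated_not_in_slim" group is the one that would show which terms the slim is still missing.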

Linda suggested:

  1. slim for metagenomics
  2. Collecting purpose-made slims

Meetings

Terms:

  1. cellular component assembly ; GO:0022607
  2. anatomical structure formation involved in morphogenesis ; GO:0048646
  3. cell adhesion ; GO:0007155
    • look at multicellular organism adhesion later
    • look at "biological regulation" later
    • look at cell killing later (may be picked up if we include multicellular organismal process although this is a bit broad)
  4. protein complex assembly ; GO:0006461
  5. ribonucleoprotein complex assembly ; GO:0022618
  6. cell wall organization or biogenesis ; GO:0071554
  7. extracellular matrix organization ; GO:0030198
  8. membrane organization ; GO:0061024
  9. chromosome organization ; GO:0051276
  10. cytoskeleton organization ; GO:0007010
  11. cell division ; GO:0051301
  12. growth ; GO:0040007
  13. cell proliferation ; GO:0008283
  14. cell wall organization or biogenesis ; GO:0071554
    • later - cell junction org, cell projection org
  15. cell differentiation ; GO:0030154
  16. cell morphogenesis ; GO:0000902
  17. cell motility ; GO:0048870
  18. homeostatic process ; GO:0042592
  19. vesicle-mediated transport ; GO:0016192
  20. nucleocytoplasmic transport ; GO:0006913
  21. transport ; GO:0006810
  22. transmembrane transport ; GO:0055085
  23. macromolecular complex assembly ; GO:0065003
  24. plasma membrane organization ; GO:0007009 ?
  25. chromosome organization ; GO:0051276
  26. cytoskeleton organization ; GO:0007010
  27. mitochondrion organization ; GO:0007005
  28. extracellular matrix organization ; GO:0030198
  29. cell junction organization ; GO:0034330
  30. pigmentation ; GO:0043473
  31. reproduction ; GO:0000003
  32. transposition ; GO:0032196
  33. immune system process ; GO:0002376
  34. locomotion ; GO:0040011
  35. biosynthetic process ; GO:0009058
  36. catabolic process ; GO:0009056
  37. DNA metabolic process ; GO:0006259
  38. transcription ; GO:0006350
  39. generation of precursor metabolites and energy ; GO:0006091
  40. cellular amino acid and derivative metabolic process ; GO:0006519
  41. cellular nucleobase, nucleoside and nucleotide metabolic process ; GO:0034655
  42. cofactor metabolic process ; GO:0051186
  43. photosynthesis ; GO:0015979
  44. small molecule metabolic process ; GO:0044281
  45. sulfur metabolic process ; GO:0006790
  46. secondary metabolic process ; GO:0019748
  47. vitamin metabolic process ; GO:0006766 * (need this?)
  48. carbohydrate metabolic process ; GO:0005975
  49. lipid metabolic process ; GO:0006629
  50. translation ; GO:0006412
  51. protein folding ; GO:0006457
  52. protein modification process ; GO:0006464
  53. protein maturation ; GO:0051604
  54. symbiosis, encompassing mutualism through parasitism ; GO:0044403
  55. developmental maturation ; GO:0021700 (might be able to remove this one)
  56. anatomical structure development ; GO:0048856
  57. embryonic development ; GO:0009790
  58. circulatory system process ; GO:0003013
  59. neurological system process ; GO:0050877
  60. renal system process ; GO:0003014
  61. muscle system process ; GO:0003012
    • (note: we'll probably need some other system processes here)
  62. response to stress ; GO:0006950
  63. signaling ; GO:0023052
  64. nitrogen cycle metabolic process ; GO:0071941

I added the following terms to get complete coverage for pombe

  1. GO:0006605 protein targeting
  2. GO:0006399 tRNA metabolic process
  3. GO:0006397 mRNA processing
  4. GO:0034641 cellular nitrogen compound metabolic process
  5. GO:0042254 ribosome biogenesis
  6. GO:0007059 chromosome segregation

Meeting Dates:

  • 21 August
  • 8 Sept
  • 7 Oct
  • 25 Nov
  • 21 Jan
  • 2 Feb
  • 11 May

TODO:

Val to translate the list of slim terms to IDs to check bucket terms

(When we choose terms we should also consider IEA annotations, as this will give a better idea of how many gene products are likely to be annotated to a term when "annotation complete".)

  • Need to re-check how annotations are allocated to this slim when regulates is made non-transitive in map2slim (new in GO moose).
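
To see why the re-check matters, the sketch below maps one term to a slim twice, once following regulates edges and once ignoring them. It is a simplification rather than map2slim's actual behaviour: the relation is either followed or skipped entirely. The function name and the toy edges are assumptions, and the edges are abbreviated rather than copied from the ontology file.

```python
def slim_ancestors(term, edges, slim, relations):
    """Return the slim terms reachable from `term` over the allowed relations.

    edges:     dict mapping a term to a list of (relation, parent) pairs
    relations: set of relation names that may be traversed
    """
    seen, stack = set(), [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(
            parent for rel, parent in edges.get(current, ()) if rel in relations
        )
    return seen & slim


# Toy edges (simplified): regulation of cell cycle and its neighbours.
toy_edges = {
    "GO:0051726": [("is_a", "GO:0050789"), ("regulates", "GO:0007049")],
    "GO:0050789": [("is_a", "GO:0065007")],
}
toy_slim = {"GO:0007049", "GO:0065007"}  # cell cycle, biological regulation

with_regulates = slim_ancestors(
    "GO:0051726", toy_edges, toy_slim, {"is_a", "part_of", "regulates"}
)
without_regulates = slim_ancestors(
    "GO:0051726", toy_edges, toy_slim, {"is_a", "part_of"}
)
print(sorted(with_regulates))
print(sorted(without_regulates))
```

In this toy case the annotation lands in both "cell cycle" and "biological regulation" when regulates is followed, but only in "biological regulation" when it is not, so per-term counts for the slim can shift.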

QuickGO query:

!evidence=IEA & !evidence=ND & ancestor=GO:0008150 & !ancestor=GO:0022607, GO:0048646, GO:0007155, GO:0006461, GO:0022618, GO:0071554, GO:0030198, GO:0061024, GO:0051276, GO:0007010, GO:0007067, GO:0007568, GO:0007165, GO:0007267, GO:0007049, GO:0008219, GO:0051301, GO:0040007, GO:0008283, GO:0071554, GO:0030154, GO:0000902, GO:0048870, GO:0042592, GO:0016192, GO:0006913, GO:0055085, GO:0065003, GO:0007009, GO:0051276, GO:0007010, GO:0007005, GO:0030198, GO:0034330, GO:0043473, GO:0000003, GO:0032196, GO:0002376, GO:0040011, GO:0009058, GO:0009056, GO:0006259, GO:0061018, GO:0006091, GO:0006519, GO:0034655, GO:0051186, GO:0015979, GO:0044281, GO:0006790, GO:0019748, GO:0006766, GO:0005975, GO:0006629, GO:0006412, GO:0006457, GO:0006464, GO:0051604, GO:0044403, GO:0021700, GO:0048856, GO:0009790, GO:0003013, GO:0050877, GO:0006950, GO:0006605, GO:0006399, GO:0006397, GO:0034641, GO:0042254, GO:0007059, GO:0065007, GO:0023052, GO:0050896, GO:0006810 (put individual system process in)
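
A query string like the one above is tedious to maintain by hand, and the ID list repeats a few entries. As a minimal sketch, the string could be generated from the slim ID list, dropping repeats while keeping order. The function name `build_query` is hypothetical, and the output only mirrors the syntax shown on this page; it is not checked against the QuickGO service.

```python
def build_query(slim_ids, root="GO:0008150", excluded_evidence=("IEA", "ND")):
    """Build a filter string in the form used on this page.

    slim_ids:          GO ids whose descendants should be excluded
    root:              ontology root to search under
    excluded_evidence: evidence codes to leave out
    """
    unique_ids = list(dict.fromkeys(slim_ids))  # drop repeats, keep order
    parts = [f"!evidence={code}" for code in excluded_evidence]
    parts.append(f"ancestor={root}")
    parts.append("!ancestor=" + ", ".join(unique_ids))
    return " & ".join(parts)


# Short example; GO:0071554 is listed twice on purpose.
print(build_query(["GO:0022607", "GO:0071554", "GO:0048646", "GO:0071554"]))
```

Whether repeated IDs make any difference to the query result is not stated here; removing them just keeps the string easier to compare against the term list.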

Issues to consider

  1. How will this slim be maintained to ensure it keeps in line with ontology rearrangements? Yearly revision? More frequent?
  2. What other slims should we maintain in the GO file: cellular vs. multicellular vs. multi-organism? Euk vs. prok?
  3. Mapping over different relations for map2slim - regulates?