Sequence Ontology 2010


Sequence Ontology December 2010

Staff working on GOC tasks

Karen Eilbeck 0.6

Ontology Development

Number of terms in SO: 1765

  • 87% of terms are defined, and 1760 have at least one synonym.
  • 218 terms are internal cross products.
  • 37 term request tracker items have been submitted since the beginning of the year.
  • The ontology is is_a complete. A release of the ontology is made every 2 months, and daily revisions are checked into CVS.
  • The SO has continued to grow and develop with considerable input from external experts.

Significant changes:

  • Topological relations have been added to SO, as described in the recent Journal of Biomedical Informatics paper.
  • Core promoter elements have been added, and the promoter, transcription factor binding site, and transcription-related regulatory region terms have been improved after work with Karen Christie and David Hill.
  • Variant terms have been improved and extended in collaboration with the EBI and Mike Bada. This work has been published in Genome Biology and presented at the AIMM mutation conference.


  • Royal Society of Chemistry. Colin Batchelor, PhD, continues to work with the SO and has given several presentations on this work, including at the OBO Foundry workshops.
  • RNA Ontology. A paper describing the RNA Ontology (and its interface with SO) has been accepted by the journal Applied Ontology.
  • Genome Variant Format Collaboration. The SO has driven a collaboration between several groups to develop a GFF3-based, ontology-typed file format to describe sequence variant annotation. The other groups involved are the EBI variant pipeline group (Paul Flicek and Fiona Cunningham), the Ontario Institute for Cancer Research (Lincoln Stein), Omicia, a personal genome company (Martin Reese), and the labs of Gabor Marth at Boston College and Mark Yandell at the University of Utah.
  • Formal Ontology development. SO has been working with Mike Bada at the University of Colorado at Denver to formally describe sequence variants with respect to the BFO. A preliminary version of this work was presented at the AIMM workshop at the European Conference on Computational Biology, in Belgium in September.

Papers and Presentations

GOCwiki:Publications, Talks, Posters

Other developments

Resource development

  • GVF: an ontology-typed variant annotation format based on the GMOD model GFF3.
  • 10Gen dataset: SO is providing a set of 10 personal genomes, annotated using GVF and representing data from several different sequencing platforms and different population groups, for software development, testing, and benchmarking the performance of algorithms and tool sets.
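
Because GVF builds on GFF3, a GVF record can be read with the same nine tab-separated columns that GFF3 defines, with the type column carrying an SO term. The following is a minimal sketch of that idea, not an official GVF parser. The record, the source name `example_source`, and the attribute names `Reference_seq` and `Variant_seq` are illustrative assumptions here, and URL-decoding of attribute values is deliberately left out.

```python
from dataclasses import dataclass

# GFF3 defines nine tab-separated columns per feature line.
GFF3_COLUMNS = 9


@dataclass
class Feature:
    seqid: str
    source: str
    type: str        # in an ontology-typed file this is expected to be an SO term
    start: int
    end: int
    score: str
    strand: str
    phase: str
    attributes: dict  # attribute key -> list of values


def parse_gff3_line(line: str) -> Feature:
    """Split one GFF3-style feature line into its nine columns."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != GFF3_COLUMNS:
        raise ValueError(f"expected {GFF3_COLUMNS} columns, got {len(fields)}")
    seqid, source, type_, start, end, score, strand, phase, raw_attrs = fields

    # Column 9 holds semicolon-separated key=value pairs;
    # multiple values for one key are comma-separated.
    attributes = {}
    for pair in raw_attrs.split(";"):
        if not pair:
            continue
        key, _, value = pair.partition("=")
        attributes[key] = value.split(",")

    return Feature(seqid, source, type_, int(start), int(end),
                   score, strand, phase, attributes)


# Hypothetical record, made up for illustration only.
EXAMPLE = ("chr1\texample_source\tSNV\t1000\t1000\t.\t+\t.\t"
           "ID=var_1;Reference_seq=A;Variant_seq=G")

feature = parse_gff3_line(EXAMPLE)
print(feature.type, feature.start, feature.attributes["Variant_seq"])
```

Keeping the SO term in the standard type column means generic GFF3 tooling can still read the file, while variant-specific details travel in the attributes column.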