Ontology Development Progress Report December 2013
GOC Meeting Dec 2013
Prepared and Submitted by Jane Lomax and David Hill
- David Hill (MGI)
- Tanya Berardini (TAIR)
- Heiko Dietze (LBL)
- Harold Drabkin (MGI)
- Becky Foulger (EBI)
- Jane Lomax (EBI)
- Chris Mungall (LBL)
- David Osumi-Sutherland (EBI) (started Sept 2013)
- Paola Roncaglia (EBI)
|Month||SF items opened (SF items closed)|
|Mar 2013||[data lost in tracker transition]|
|Apr 2013||26 (21)|
|May 2013||26 (21)|
|Jun 2013||41 (37)|
|Jul 2013||46 (39)|
|Aug 2013||65 (55)|
|Sept 2013||60 (46)|
|Oct 2013||83 (61)|
|Nov 2013||65 (33)|
|Dec 2013||6 (0)|
|Total 2013||418 (313)|
Total number of GO terms added Jan 2013 to Dec 2013: 1365
Total number of GO terms added manually Jan 2013 to Dec 2013: 344
Total number of GO terms added via TermGenie template Jan 2013 to Dec 2013: 803
Total number of GO terms added via TermGenie freeform Jan 2013 to Dec 2013: 218
Total number of GO terms obsoleted Jan 2013 to Dec 2013: 67
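The totals reported above can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures given in this report; the variable names are illustrative, and March 2013 is left out because its tracker data was lost.

```python
# (opened, closed) SF item counts per month, copied from the table above.
# Mar 2013 is omitted: its data was lost in the tracker transition.
monthly_sf_items = {
    "Apr 2013": (26, 21),
    "May 2013": (26, 21),
    "Jun 2013": (41, 37),
    "Jul 2013": (46, 39),
    "Aug 2013": (65, 55),
    "Sept 2013": (60, 46),
    "Oct 2013": (83, 61),
    "Nov 2013": (65, 33),
    "Dec 2013": (6, 0),
}

total_opened = sum(opened for opened, _ in monthly_sf_items.values())
total_closed = sum(closed for _, closed in monthly_sf_items.values())

# GO terms added Jan 2013 to Dec 2013, broken down by how they were added.
terms_added = {
    "manual": 344,
    "termgenie_template": 803,
    "termgenie_freeform": 218,
}
total_terms_added = sum(terms_added.values())

print(total_opened, total_closed, total_terms_added)  # 418 313 1365
```

Both the tracker totals (418 opened, 313 closed) and the term total (1365) agree with the reported figures.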
Transition to OWL
We have now completed the shift to OWL: all ontology reasoning is performed by OWL-based reasoners, run either during automated jobs or manually by ontology editors. To prepare for the OBO-to-OWL shift, ontology editors were trained to use the Protege editor during several focus group meetings. Most day-to-day ontology editing is still performed in OBO-Edit, and the ontologies are committed in OBO format. When a commit is requested, the GO is automatically converted to OWL, and the main OBO file, the auxiliary cross-product files and the external ontologies are checked for logical errors in the Jenkins environment. If errors are found, an automated e-mail is sent to the ontology editors stating that the ontology build has failed the checks, together with a report indicating where the problem was encountered. Ontology editors can then fix the problem before the 'public' version of the ontology is released. In addition, ontology editors use the OWL version directly in the Protege editor to test new ideas and to troubleshoot build failures.
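The check-and-notify step described above can be pictured roughly as follows. This is a minimal sketch, not the actual Jenkins job: `CheckResult`, `build_failure_report` and the file names are hypothetical, and the real checks are performed by OWL reasoners rather than by this code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CheckResult:
    """Outcome of one logical check (hypothetical structure)."""
    source: str       # file that was checked, e.g. the main OBO file
    ok: bool          # True if no logical errors were found
    detail: str = ""  # where the problem was encountered, if any


def build_failure_report(results: List[CheckResult]) -> Optional[str]:
    """Return the text of a failure e-mail, or None if every check passed."""
    failures = [r for r in results if not r.ok]
    if not failures:
        return None  # build is clean; the public release can proceed
    lines = ["Ontology build FAILED the following checks:"]
    lines += [f"- {r.source}: {r.detail}" for r in failures]
    return "\n".join(lines)


# Illustrative input only; file names and messages are made up.
results = [
    CheckResult("main.obo", ok=True),
    CheckResult("cross-products.obo", ok=False, detail="unsatisfiable class"),
]
report = build_failure_report(results)
if report is not None:
    print(report)  # in the real pipeline this would be e-mailed to editors
```

The point of the design is that the public release is gated on an empty failure list, so editors see the report before any broken version is published.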
As a result of this work, many of the relationships in the ontology are now automatically inferred rather than manually asserted by ontology editors. A report listing added and deleted relationships is delivered each week to the ontology editing group as a 'sanity' check.
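A weekly report of added and deleted relationships amounts to a set difference between two versions of the ontology. The sketch below assumes each relationship is represented as a (subject, predicate, object) triple; the `Relationship` type and the `EX:` identifiers are placeholders, not real GO terms, and this is not the script used to produce the actual report.

```python
from typing import Iterable, List, NamedTuple, Tuple


class Relationship(NamedTuple):
    subject: str
    predicate: str
    object: str


def diff_relationships(
    previous: Iterable[Relationship],
    current: Iterable[Relationship],
) -> Tuple[List[Relationship], List[Relationship]]:
    """Return (added, deleted) relationships between two ontology versions."""
    prev_set, curr_set = set(previous), set(current)
    added = sorted(curr_set - prev_set)    # present now, absent last week
    deleted = sorted(prev_set - curr_set)  # present last week, absent now
    return added, deleted


# Placeholder data: EX:1..EX:4 stand in for ontology terms.
last_week = [
    Relationship("EX:1", "is_a", "EX:2"),
    Relationship("EX:3", "part_of", "EX:2"),
]
this_week = [
    Relationship("EX:1", "is_a", "EX:2"),
    Relationship("EX:3", "is_a", "EX:4"),
]

added, deleted = diff_relationships(last_week, this_week)
for rel in added:
    print("ADDED  ", *rel)
for rel in deleted:
    print("DELETED", *rel)
```

Because inferred relationships can change whenever a logical definition changes, reviewing this difference is a cheap way to catch unintended consequences of an edit.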
Since March 2013 we have added 14 new templates to our template-based term addition tool, TermGenie. These are:
- chemical_transport_from_to (CHEBI)
- plant_development (PO)
- plant_formation (PO)
- plant_maturation (PO)
- plant_morphogenesis (PO)
- plant_structural_organozation (PO)
- cell_apoptosis (CL)
- chemical_homeostasis (CHEBI)
- protein_localization_to (GO BP)
- chemical_import_into (CHEBI)
- cc_assembly_diassembly (GO BP & CC)
- protein_complex_by_activity (GO BP & CC)
- single-multiple-organism_process (GO BP)
- cell_differentiation (CL)
These templates use both classes within GO and classes from external ontologies, shown in parentheses. PO = Plant Ontology, CL = Cell Ontology, CHEBI = Chemical Entities of Biological Interest ontology.
Mappings to other ontologies
To take further advantage of the reasoning capabilities of OWL and to integrate the expertise of distributed resources, we have continued to work with developers from other groups to cross-reference GO terms with external ontologies.
Mapping Chemicals in GO to ChEBI
We have completed the first phase of our integration with the Chemical Entities of Biological Interest (ChEBI) resource. All terms that follow standard patterns and are differentiated by chemical structure are now defined using references to ChEBI. GO ontology editors work together with ChEBI editors to address queries about ontology alignment and to make additions to ChEBI when necessary. This work is fully described in PMID:23895341.
Mapping Reactions in GO to Rhea
This year we have embarked on a process to align GO catalytic reactions with the Rhea database of enzymatic reactions (http://www.ebi.ac.uk/rhea/). The ultimate aim of this project is to derive most GO catalytic reactions automatically from Rhea, eliminating the need to curate this data in both Rhea and GO. So far we have worked on the cross-references to Rhea that are maintained in GO, for example by removing 1:many mappings, and have resolved some of the discrepancies between the two systems. We have also developed a pattern for the logical definitions of GO catalytic reactions in terms of the entities from the CHEBI ontology that participate in them.
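Finding 1:many mappings is essentially a grouping problem: collect the Rhea cross-references per GO term and flag any term with more than one. The sketch below assumes the cross-references are available as (GO ID, Rhea ID) pairs; the function name and the identifiers are placeholders, not real mappings, and this is not the procedure actually used by the editors.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def find_one_to_many(xrefs: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Return GO terms that are cross-referenced to more than one Rhea entry."""
    rhea_by_go = defaultdict(set)
    for go_id, rhea_id in xrefs:
        rhea_by_go[go_id].add(rhea_id)  # a set ignores duplicated pairs
    return {
        go_id: sorted(rhea_ids)
        for go_id, rhea_ids in rhea_by_go.items()
        if len(rhea_ids) > 1
    }


# Placeholder identifiers, for illustration only.
xrefs = [
    ("GO:EXAMPLE_1", "RHEA:EXAMPLE_A"),
    ("GO:EXAMPLE_2", "RHEA:EXAMPLE_B"),
    ("GO:EXAMPLE_2", "RHEA:EXAMPLE_C"),
    ("GO:EXAMPLE_1", "RHEA:EXAMPLE_A"),  # duplicate pair, not a 1:many case
]

for go_id, rhea_ids in find_one_to_many(xrefs).items():
    print(go_id, "->", ", ".join(rhea_ids))
```

Each flagged term still needs an editor's judgement: the fix may be to drop a cross-reference, split the GO term, or request a change in Rhea.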
Mapping Cellular Components in GO to SAO
This year we completed the incorporation of the Neuroscience Information Framework's (NIF) Subcellular Anatomy Ontology (SAO) into GO cellular component. Our paper describing this project was published in October 2013 (PMID:24093723).
Improved Biological Representation
We are writing a signaling curation manual to provide annotation guidelines for the signaling terms. We have held a number of calls to discuss further improvements to the existing signaling structure, particularly how to define and place the intracellular signaling modules. We are standardizing and cleaning up the existing signaling pathway terms, such as interferon signaling. Signaling pathways have been used as examples for the LEGO model of annotation.
In 2013, the Apoptosis GO working group provided curators with detailed annotation guidance, in the shape of an apoptosis curation manual enriched with annotation examples (http://wiki.geneontology.org/index.php/Apoptosis_Curation_Manual). This guidance supported a significant curation effort that focused mostly on human and mouse apoptosis players and regulators. Feedback from curators was channeled into further polishing and adding to the apoptosis branch of the ontology. Ontology work was also done on cell death modalities other than apoptosis, with a focus on necroptosis as this was identified as a growing field of research. This effort included annotation of necroptosis players from recent literature. As a result of all the above, in 2013 over 5,000 manual annotations were added or revised in the branch under GO:0008219 'cell death'.
We are now reaching the end of the ontology changes for the viral processes and cellular components in GO. This work has affected 334 biological process terms and 67 cellular component terms. We are drafting a paper to be submitted to BMC Microbiology.
Metabolic Pathways (Glycolysis)
We have begun work on better modeling of metabolic pathways, starting with glycolysis: a universal pathway that, although straightforward, still presents challenges for GO representation. Taking advantage of the knowledge in the Reactome resource, we have been working with an expert from Reactome to represent the pathway in the GO framework. As a test case for the representation of other metabolic pathways in GO, we have begun creating has_part links between the biological process and the molecular functions that are always part of that process. We are exploring the use of these representations to create individual views that can be used in a modular annotation framework.