Ontology Development Progress Report December 2008

Ontology Development

Metrics

GO term statistics

December 1, 2007

Ontology   Current  Defined  Obsolete  Total
Function      7986     7605       559   8545
Process      14307    14145       460  14767
Component     2046     2046       114   2160
All          24339    24929      1133  25472


November 30, 2008

Ontology   Current  Defined  Obsolete  Total
Function      8439     8137       765   9204
Process      15723    15594       479  16202
Component     2245     2245       117   2362
All          26407    27337      1361  27768


SourceForge statistics (Dec 1, 2007 - Nov. 30, 2008)

  • items opened: 1037
  • items closed: 1062

SourceForge reports (on SF site)

Completed work

Regulation

To improve the representation of biological regulation in the GO, three new relationship types -- regulates, negatively_regulates and positively_regulates -- have been introduced in the Biological Process ontology. Prior to this change, regulation terms were related to other terms via part_of, despite the fact that regulatory processes are not necessarily integral to the processes they regulate. Using reasoning and quality control reports, we have thoroughly reviewed the relationships involving regulatory processes and their targets to ensure internal consistency. If a term 'regulation of process X' exists in the ontology, it must be a valid subtype of 'regulation of biological process', and must have a regulates relationship with 'process X' or be a valid subtype of another regulatory process. Likewise, regulation of molecular function terms have been aligned with the corresponding terms in the molecular function ontology, and we are working to align regulation of biological quality terms with PATO. Approximately 1300 relationships have been added, and 540 deleted, as part of this work; 53 new terms were also added, and many definitions and synonyms reviewed.
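The consistency rule described above can be illustrated with a minimal sketch. This is not the reasoner or the quality control reports actually used by the project; the dictionary layout, the function names and the toy terms are assumptions made for illustration, and the check reads "a valid subtype of another regulatory process" as "has some other 'regulation of ...' term among its is_a ancestors".

```python
# Hypothetical sketch of the regulation consistency rule; not the GO tooling.
ROOT = "regulation of biological process"
PREFIX = "regulation of "


def ancestors(terms, name):
    """Return every term reachable from `name` by following is_a links."""
    seen = set()
    stack = list(terms[name].get("is_a", []))
    while stack:
        parent = stack.pop()
        if parent in seen:
            continue
        seen.add(parent)
        stack.extend(terms.get(parent, {}).get("is_a", []))
    return seen


def check_regulation_terms(terms):
    """List (term, problem) pairs for 'regulation of X' terms that break the rule."""
    problems = []
    for name, links in sorted(terms.items()):
        if not name.startswith(PREFIX) or name == ROOT:
            continue
        target = name[len(PREFIX):]
        supers = ancestors(terms, name)
        # Condition 1: must be a subtype of the regulation root term.
        if ROOT not in supers:
            problems.append((name, "not a subtype of '%s'" % ROOT))
        # Condition 2: must regulate its target, or sit under another
        # regulatory process (assumed reading of the rule).
        regulates_target = target in links.get("regulates", [])
        under_other_regulation = any(
            s.startswith(PREFIX) and s != ROOT for s in supers
        )
        if not (regulates_target or under_other_regulation):
            problems.append((name, "no regulates link to '%s'" % target))
    return problems


# Toy data: the last entry is deliberately inconsistent.
EXAMPLE = {
    "biological process": {},
    "regulation of biological process": {"is_a": ["biological process"]},
    "cell cycle": {"is_a": ["biological process"]},
    "regulation of cell cycle": {
        "is_a": ["regulation of biological process"],
        "regulates": ["cell cycle"],
    },
    "growth": {"is_a": ["biological process"]},
    "regulation of growth": {"is_a": ["biological process"]},
}

if __name__ == "__main__":
    for term, problem in check_regulation_terms(EXAMPLE):
        print(term + ": " + problem)
```

In this toy ontology only 'regulation of growth' is reported, once for each condition it fails. A real check would operate on term identifiers and on the reasoner's inferred links rather than on name prefixes.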

We have made further progress towards creating links between the three GO ontologies, beginning with Molecular Function-Biological Process links. The regulates relationships will be used in the first of these links: in early February 2009 we will make explicit the currently implicit regulatory relationships between 'regulation of molecular function' Biological Process terms and the corresponding Molecular Function terms. Additional Molecular Function-Biological Process links are in preparation, based on single-step processes (e.g. 'protein phosphorylation'), on mining external pathway resources such as Reactome and MetaCyc, and on curated relationships for some metabolic pathways.

Other completed work

  • Cross-products internal to the GO (i.e. those that refer only to terms from one or more of the three branches of the GO) are undergoing curatorial review, and a plan for integrating them into the authoritative version of the GO has been formulated. Cross-products between GO and external ontologies such as the OBO Cell ontology are pending.
  • A content meeting on lung development was held in Boston, MA, on December 5-6, 2007. Participants added 152 new terms describing the anatomical representation of lung development to a branch version of the GO flat file. Experts in the field suggested that the file be expanded using more descriptive process terms. Work on this project is ongoing.
  • Smaller-scale efforts include:

Work in progress

Ontology Quality Control

We have implemented a number of ontology quality control procedures, and more are under discussion. Several checks on text and ontology structure are built into OBO-Edit, and we also use a few custom checks. Reasoner-based checks are run periodically, external to the ontology editing cycle, and generate reports that curators use to correct errors in the ontology. At present the reasoner-based reports identify potential missing relationships, potential misplaced terms or potential term nomenclature issues.

Software and Utilities

OBO-Edit

This was a year of transition for OBO-Edit: the original developer left the project in December 2007, and his permanent replacement, Amina Abdulla, did not begin work until April 2008. (Nomi Harris worked part-time on the project from November 2007 through August 2008.) We have made considerable progress in debugging and improving the tool, and plan to make a public release of OBO-Edit 2.0 early next year. One of the more significant achievements this year was a major refactoring of the code into three parts: general utilities (org.bbop), object model and API (org.obo), and GUI (org.oboedit). This split has enabled these packages to be used in other applications. Below is a summary of recent improvements.

  • OBO-Edit Documentation and Installer
    • Updated user guide and wiki.
    • Installers now keep separate install histories for OE1, OE2, and Phenote, to avoid collisions.
  • OBO-Edit I/O and command-line options
    • Associated scripts (e.g. obomerge) now support both OBO_1_0 and OBO_1_2 formats (with 1_2 format being the default).
    • An OBO file to load can now be specified on the command line.
    • Dangling references are now handled correctly in both input and output. "Allow dangling refs" is now true by default, so you can load ontologies that have dangling references.
    • OBO_1_2 writer now outputs dates in ISO 8601 format. Reader can handle both old and new date formats.
    • Added log4j logging to aid debugging.
  • OBO-Edit Functionality
    • Search is now about five times faster than it was.
    • New search option “Parent” allows searching for terms with a given parent.
    • Added new rule-based reasoner.
    • Added GUI for selecting a reasoner.
    • Fixed saving of multiple cross-products through the Text Editor.
    • Cross Products Tab in the Text Editor shows only user-defined relations while selecting Discriminating relations.
    • Relations section in OTE shows only user-specified relations (no built-ins).
    • Improved Tree Viewer configuration options and display.
    • Fixed autocommit for text edits.
    • Text Editor now allows assignment of Synonym Types specified through the Synonym Type Manager.
    • Updated GO dictionary and added support for user-defined domain-specific dictionaries.
    • Improved Parent Editor speed for adding/deleting parent relationships.
    • Enabled mass dbxref addition (regular or definition) to all selected terms.
    • Implemented local and global selection modes for the Graph Viewer.
    • Enabled save with reasoned results for filters requiring reasoner.
    • Fixed filtering (link and term).
    • Fixed delete/destroy behavior. Pop up appropriate warnings if a term is about to be made obsolete.
    • Fixed sub-selection behavior.
    • Title of search results window is now configurable (can include search string and/or number of matches found) and settings persist.
    • Fixed problem that made it impossible to save ontologies that generated warnings during the checks that are performed before saving.
    • Added support for 'regulates' relationships.
    • OBO-Edit now warns the user if more than 95% of the available memory is in use.
    • Made file browser for load/save open at last used directory (rather than always at your home directory).
    • Now autosave will only save if there are unsaved edits, and it won't grab the focus and pop to the front.
    • Circular relationships (e.g., disjoint_from) no longer disrupt the ontology tree editor. They are shown in gray and cannot be expanded (to prevent infinite loops).
  • OBO-Edit GUI look and feel
    • Improved the look of various components in terms of colors, styles, and layout.
    • Pop-up help windows that appear when you click the ? icon on a panel titlebar are now real windows, which can be dragged around, resized, or destroyed.
    • Changed some confusing component names.
    • Reorganized menus for ease of use.
    • Significantly rearranged configuration options and tabs in the Configuration Manager. The new layout makes the user config directory much more obvious, and is more logical in general.
    • Made TextIcons (black & white icons used in the Ontology Tree Editor for unknown relationships) readable.
    • Font is now based on user's preferred font. (Note: not yet fixed in Graph Editor.)