Merging Ontology Terms

From GO Wiki
 See Ontology_Editors_Daily_Workflow for creating branches and basic Protégé instructions.


Term merge principles

Principles_for_merging_terms

Evaluate the potential consequences of the merge

Before performing a merge, make sure you understand all of its consequences. In particular, look at the child terms and at any other terms that refer to the ‘deprecated’ term. In many cases a simple merge of two terms is not sufficient, because it would leave child terms as equivalent classes. For example, if deprecated term X is going to be merged into target term Y, and both ‘regulation of X’ and ‘regulation of Y’ exist, then you will need to merge the regulation terms as well as the primary terms. You will also need to edit any terms that refer to the deprecated term so that their names and definitions remain consistent.
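As a rough illustration of the check described above, the sketch below scans a set of term labels for derived terms that exist for both the deprecated term and the target term. The `EX:` IDs, the labels, the prefix list and the helper name `find_parallel_terms` are all hypothetical; in practice the Class Usage tab in Protégé is the place to look, and simple string matching like this would only be a first pass.

```python
from typing import Dict, List, Tuple

# Prefixes used to build derived labels; an illustrative, incomplete list.
DERIVED_PREFIXES = (
    "regulation of ",
    "positive regulation of ",
    "negative regulation of ",
)


def find_parallel_terms(
    labels: Dict[str, str], deprecated_label: str, target_label: str
) -> List[Tuple[str, str]]:
    """Return (deprecated-side ID, target-side ID) pairs that would also need merging."""
    by_label = {label: term_id for term_id, label in labels.items()}
    pairs = []
    for prefix in DERIVED_PREFIXES:
        dep_id = by_label.get(prefix + deprecated_label)
        tgt_id = by_label.get(prefix + target_label)
        # Only report a pair when the derived term exists on both sides.
        if dep_id and tgt_id:
            pairs.append((dep_id, tgt_id))
    return pairs


# Hypothetical IDs and labels, for illustration only.
example_labels = {
    "EX:0000001": "X",
    "EX:0000002": "Y",
    "EX:0000003": "regulation of X",
    "EX:0000004": "regulation of Y",
    "EX:0000005": "positive regulation of X",
}
parallel = find_parallel_terms(example_labels, "X", "Y")
```

Here only the ‘regulation of’ pair is reported, because ‘positive regulation of Y’ does not exist in the example data.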

Choosing the Winning Term

The ID of either merged term can become the primary ID. Often, one of the terms will have a better class label, a better definition, or be placed more correctly in the ontology; the ID of that term should be favored as the primary ID. You can also consider how long each term has existed (the older term is given preference) or the number of annotations associated with each term (the term with more annotations is given preference).
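The two measurable criteria above (age and annotation count) could be sketched as a small helper. This is only an illustration: the `Candidate` record and the function `suggest_primary` are hypothetical, and the order in which the criteria are applied (annotation count first, age as tie-breaker) is an assumption, since the text does not rank them. Label quality, definition quality and placement still need an editor's judgment.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Candidate:
    term_id: str
    created: date
    annotation_count: int


def suggest_primary(a: Candidate, b: Candidate) -> Candidate:
    """Suggest a primary term: more annotations first, then the older term.

    The ordering of the two criteria is an assumption made for this sketch.
    """
    if a.annotation_count != b.annotation_count:
        return a if a.annotation_count > b.annotation_count else b
    return a if a.created <= b.created else b


# Hypothetical candidates, for illustration only.
older = Candidate("EX:0000001", date(2001, 1, 1), 10)
newer = Candidate("EX:0000002", date(2015, 6, 1), 10)
busier = Candidate("EX:0000003", date(2018, 1, 1), 50)
```

With these values, `suggest_primary(older, newer)` returns the older term (equal annotation counts), while `suggest_primary(older, busier)` returns the term with more annotations.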

Procedure for merging terms

  1. Find the ID of the term into which the deprecated term will be merged and navigate to that ‘winning’ term using the Search box. Copy the ID of the winning term somewhere, so you know which ID to keep.
  2. For the term being deprecated, click on the Class Usage tab to see all uses of the term throughout the ontology. If the term is used in other classes, e.g. as part of an equivalence axiom, you must first deal with those other classes (for example by merging them, renaming them, or moving parents).
  3. Remove annotations from the deprecated term
    • Navigate to the term to be deprecated.
    • Remove the logical definition by clicking on the x on the right.
    • Remove any remaining subclasses by clicking on the x on the right.
    • Look at the definition; if it does not seem relevant, remove it by clicking on the x on the right; otherwise copy/paste it somewhere to refer to when reviewing the definition for the winning term.
    • Note down the created_by and created_date (there can only be one value per term for each of these fields; this will be useful if you need to pick one after the merge is done).
    • Check the existing synonyms to see whether any need to be moved to the winning term; delete the rest by clicking on the x on the right.
    • Change the ID of the term to be deprecated to the winning term’s ID
      • In the term to be deprecated, click on ‘Refactor > Rename entity’ in the Protégé menu (shortcut: command-U).
      • Paste in the ID of the winning term (copied in Step 1).
      • Be sure to use the underscore _ in the identifier instead of the colon :, for example: GO_1234567.
      • Make sure that the ‘change all entities with this URI’ box is checked.
    • Make the deprecated ID an ‘alternative ID’
      • Navigate to the winning term. In the Annotations box, locate the ID of the deprecated term. Click the o to change the ID type.
      • In the resulting pop-up window, make sure the ‘Literal’ tab is selected in the top right side box, then select has_alternative_id from the list on the left side. Double check that the entry corresponds to the GO ID of the deprecated term.
      • Click ‘OK’. The deprecated term identifier should now have the label has_alternative_id instead of id.
  4. Change the deprecated term’s label to a synonym
    • In the Annotations box of the winning term there are now two ‘rdfs:label’ annotations. Click the o next to the deprecated term’s label to change its type.
    • In the resulting pop-up window, select the appropriate synonym type from the list on the left:
      • has_broad_synonym
      • has_exact_synonym
      • has_narrow_synonym
      • has_related_synonym (if unsure, this is the safest choice)
  5. Fix synonyms: In the annotations box of the winning term, check the list of synonyms to see if they are all still appropriate.
  6. If needed, fix the definition, using information from the deprecated term as appropriate.
  7. Synchronize the reasoner and make sure the merge has not produced any terms with identical definitions (equivalent classes). These are displayed with an ‘equivalent’ sign ≡ in the class hierarchy view in the left-hand panel.
  8. Save changes.
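To make the intended end state of the procedure concrete, the sketch below models it on a simple in-memory record: after the merge, the winning term keeps its own ID and label, carries the deprecated ID as an alternative ID, and carries the deprecated label as a synonym. The `Term` class and both function names are hypothetical and are not part of Protégé; the second ID and the labels are placeholders. The first helper only shows the colon-to-underscore form mentioned in Step 3.

```python
from dataclasses import dataclass, field
from typing import Dict, List


def curie_to_fragment(curie: str) -> str:
    """Convert 'GO:1234567' to the underscore form 'GO_1234567' used when renaming."""
    prefix, local_id = curie.split(":", 1)
    return f"{prefix}_{local_id}"


@dataclass
class Term:
    term_id: str
    label: str
    alt_ids: List[str] = field(default_factory=list)
    synonyms: Dict[str, List[str]] = field(default_factory=dict)


def merge_into(winner: Term, deprecated: Term, scope: str = "has_related_synonym") -> Term:
    """Fold the deprecated term's ID and label into the winning term."""
    # The deprecated ID becomes an alternative ID of the winner.
    winner.alt_ids.append(deprecated.term_id)
    # The deprecated label becomes a synonym of the chosen type.
    winner.synonyms.setdefault(scope, []).append(deprecated.label)
    return winner


# Placeholder terms, for illustration only.
winner = Term("GO:1234567", "target term Y")
deprecated = Term("GO:7654321", "deprecated term X")
merged = merge_into(winner, deprecated)
```

The default synonym type mirrors the note in Step 4 that has_related_synonym is the safest choice when unsure; an editor would pass a different `scope` where a more specific type applies.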

Troubleshooting: Travis/Jenkins errors

  • Merging a term that is used as ‘replaced by’ for an obsolete term:
    :: ERROR: ID-mentioned-twice:: GO:0048126 
      GO:0030722 :: ERROR: has-definition: missing definition for id

The cause of this error is that Term A (GO:0048126) was obsoleted with Term B (GO:0030722) as its ‘replaced by’ term. The GO editor then tried to merge Term B into a third term, Term C (GO:0007312). The Jenkins check failed because the ‘replaced by’ value of Term A now pointed to an alternative ID rather than to a primary ID. Solution: in the ontology, go to the obsolete Term A and change its ‘replaced by’ value from Term B to Term C, so that it points to a primary ID.
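The situation above can be checked for before committing. The following is a minimal sketch, assuming the ‘replaced by’ values and the alternative IDs have already been extracted into plain dictionaries; the function name `find_stale_replacements` and both dictionary names are hypothetical, and this is not how the Travis/Jenkins check itself is implemented.

```python
from typing import Dict, List, Tuple


def find_stale_replacements(
    replaced_by: Dict[str, str], alt_id_to_primary: Dict[str, str]
) -> List[Tuple[str, str, str]]:
    """Return (obsolete ID, current 'replaced by' ID, primary ID it should point to)."""
    stale = []
    for obsolete_id, target in sorted(replaced_by.items()):
        # A 'replaced by' value is stale if it is now only an alternative ID.
        primary = alt_id_to_primary.get(target)
        if primary is not None:
            stale.append((obsolete_id, target, primary))
    return stale


# IDs from the example above: Term B was merged into Term C.
replaced_by = {"GO:0048126": "GO:0030722"}
alt_id_to_primary = {"GO:0030722": "GO:0007312"}
stale = find_stale_replacements(replaced_by, alt_id_to_primary)
```

Each reported tuple names the obsolete term to edit and the primary ID that its ‘replaced by’ value should be changed to.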

 See Ontology_Editors_Daily_Workflow for commit, push and merge instructions.

Review Status

Last reviewed: Aug 6, 2018
