Adding Taxon Restrictions

 See Ontology_Editors_Daily_Workflow for creating branches and basic Protégé instructions.


  • Taxon restrictions are managed outside of the main ontology file, in tab-separated (TSV) text files.
  • There are two TSVs: one storing 'only in taxon' assertions and one storing 'never in taxon' assertions. They both have the same structure (see below).
  • Each TSV file can be edited to either add or remove taxon constraints for a given GO term.
  • Edited TSV files are then used by the build system to generate an OWL ontology containing taxon restriction axioms. This OWL ontology, imports/go_taxon_constraints.owl, is then imported into GO.

Editing taxon constraints

  • Each TSV file has five tab-separated columns. Each line contains a GO class ID and label, an NCBITaxon ID and label, and an optional, pipe-delimited list of source xrefs:
 defined_class	defined_class_label	taxon	taxon_label	source
 GO:0000131	incipient cellular bud site	NCBITaxon:4895	Schizosaccharomyces
 GO:0000795	synaptonemal complex	NCBITaxon:4896	Schizosaccharomyces pombe
 GO:0019819	P1 peroxisome	NCBITaxon:4952	Yarrowia lipolytica	PMID:10629216|PMID:14504266
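
A quick column check can catch rows that were separated with spaces instead of tabs. The following is a minimal sketch, not part of the GO build system: check_taxon_tsv is a hypothetical helper name, and it assumes that a valid data row has four or five tab-separated fields (the source column being optional) and that the first line is the header.

```shell
# Hypothetical helper (an assumption, not part of the GO repository):
# report data rows that do not have 4 or 5 tab-separated columns.
check_taxon_tsv() {
  awk -F'\t' '
    NR == 1 { next }                 # skip the header row
    NF < 4 || NF > 5 {               # blank lines (NF == 0) are reported too
      printf "line %d: expected 4 or 5 tab-separated columns, found %d\n", NR, NF
      bad = 1
    }
    END { exit bad + 0 }             # non-zero exit status if any row was reported
  ' "$1"
}
```

For example, check_taxon_tsv src/taxon_constraints/only_in_taxon.tsv prints nothing and returns 0 when every row is well formed.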

Adding a taxon constraint

  • Add a line to either src/taxon_constraints/only_in_taxon.tsv or src/taxon_constraints/never_in_taxon.tsv.
  • You must use a tab character to separate the columns. This is best done in a plain text editor on a new local branch. Make sure you do not add an extra blank line at the end of the file.
    • Note that to insert tabs correctly between the columns when editing the file in GitHub (GH), you will need to change the dropdown at the top right of the editing window from spaces to tabs.
  • You can do this work on your local branch, or edit the file directly in GH and create a new branch upon commit.
  • Once you've created a new branch, you'll be taken to a pull request page in GH.
  • At this point, return to your terminal window and proceed as below.
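
When working locally, one way to be sure the separators are real tab characters is to let printf write the row. This is only a sketch under stated assumptions: append_taxon_constraint is a hypothetical helper, not an existing GO script, and when the source is omitted it leaves the fifth column empty (a trailing tab), which is assumed to be acceptable to the build.

```shell
# Hypothetical helper (an assumption, not part of the GO repository):
# append one constraint row with literal tabs between the columns.
# Usage: append_taxon_constraint FILE GO_ID GO_LABEL TAXON_ID TAXON_LABEL [SOURCE]
append_taxon_constraint() {
  file=$1; shift
  # If the last line has no terminating newline, add one so the new row
  # starts on its own line. No blank line is introduced.
  if [ -s "$file" ] && [ -n "$(tail -c 1 "$file")" ]; then
    printf '\n' >> "$file"
  fi
  # ${5-} expands to an empty source column when no source is given.
  printf '%s\t%s\t%s\t%s\t%s\n' "$1" "$2" "$3" "$4" "${5-}" >> "$file"
}
```

Using the example row from the format description: append_taxon_constraint src/taxon_constraints/only_in_taxon.tsv "GO:0019819" "P1 peroxisome" "NCBITaxon:4952" "Yarrowia lipolytica" "PMID:10629216|PMID:14504266".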

Removing a taxon constraint

  • To remove a constraint, just delete the line, and regenerate the go_taxon_constraints.owl file (see below).
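
Deleting the line in a text editor is usually simplest. As an alternative for local work, the sketch below drops the row whose GO ID (first column) and taxon ID (third column) both match exactly. remove_taxon_constraint is a hypothetical helper, not part of the GO tooling, and it assumes the IDs in the file carry no stray leading or trailing spaces.

```shell
# Hypothetical helper (an assumption, not part of the GO repository):
# delete rows whose first column is GO_ID and third column is TAXON_ID.
# Usage: remove_taxon_constraint FILE GO_ID TAXON_ID
remove_taxon_constraint() {
  rtc_file=$1
  rtc_tmp=$(mktemp) || return 1
  # Keep every line except exact matches on columns 1 and 3, then write
  # the result back over the original file.
  awk -F'\t' -v go="$2" -v taxon="$3" '!($1 == go && $3 == taxon)' "$rtc_file" > "$rtc_tmp" &&
    cat "$rtc_tmp" > "$rtc_file"
  rtc_status=$?
  rm -f "$rtc_tmp"
  return "$rtc_status"
}
```

For example, remove_taxon_constraint src/taxon_constraints/only_in_taxon.tsv "GO:0000131" "NCBITaxon:4895" removes that one constraint and leaves all other rows, including the header, untouched. The go_taxon_constraints.owl file still has to be regenerated afterwards.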

Regenerating go_taxon_constraints.owl manually

  • After editing one of the taxon constraint TSVs, you need to regenerate go_taxon_constraints.owl so that the new axioms appear in GO.
  • If you originally edited the TSV file on GH, change to the ontology directory, do a git pull from master on your local machine, and then switch to the branch that has your requested taxon constraint.
  • Note that at this point, you won't see the changes you made to the TSV file on GH if you type 'git diff'. This is because the edits made on GH are already committed, so there is no difference between the branch on your local machine and the branch in GH.

~/repos/go-ontology/src/ontology(master) $ git pull
remote: Enumerating objects: 5, done.
remote: Counting objects: 100% (5/5), done.
remote: Compressing objects: 100% (5/5), done.
remote: Total 5 (delta 0), reused 0 (delta 0), pack-reused 0
Unpacking objects: 100% (5/5), done.
f20f612..5a26f66 ukemi-patch-2 -> origin/ukemi-patch-2
~/repos/go-ontology/src/ontology(master) $ git checkout ukemi-patch-2
Branch ukemi-patch-2 set up to track remote branch ukemi-patch-2 from origin.
Switched to a new branch 'ukemi-patch-2'

  • Start Docker
  • Then, from within the ontology directory, run: ./ make imports/go_taxon_constraints.owl
  • The script will update the following files: src/ontology/imports/go_taxon_constraints.owl and/or src/taxon_constraints/only_in_taxon.ofn and/or src/taxon_constraints/never_in_taxon.ofn, depending on which file(s) were edited.
  • You should see stanzas showing the input and output files used to generate the new constraints. For example:
 real	0m0.015s
user	0m0.000s
sys	0m0.000s
JAVA_OPTS=-Xmx8G dosdp-tools generate --infile=../taxon_constraints/only_in_taxon.tsv --obo-prefixes=true --template=../taxon_constraints/only_in_taxon.yaml --outfile=../taxon_constraints/only_in_taxon.ofn
real	0m6.932s
sys	0m1.480s
ROBOT_JAVA_ARGS=-Xmx8G robot merge -i ../taxon_constraints/only_in_taxon.ofn -i ../taxon_constraints/never_in_taxon.ofn annotate -O -V`date +%Y-%m-%d`/imports/go_taxon_constraints.owl -o imports/go_taxon_constraints.owl
  • Type git diff in the terminal window to see the changes made to the .owl and the corresponding .ofn files.
  • Commit each file, either by using its full path name or by first switching to the directory in which the file resides. For example:
    • src/ontology/imports/go_taxon_constraints.owl
    • src/taxon_constraints/only_in_taxon.ofn
 See Ontology_Editors_Daily_Workflow for commit, push and merge instructions.

Checking existing taxon constraints

When a new taxon constraint is requested, you can use the reasoner to check whether the term can already be inferred to have that constraint (either through constraints asserted in GO or via imported ontologies such as Uberon). In Protégé, turn on the reasoner (ELK) and go to the DL Query tab. For a requested 'only in taxon' constraint (e.g. Eukaryota) and GO term (e.g. 'CAM photosynthesis'), run the following DL query and look under "Equivalent classes":

 'CAM photosynthesis' and 'in taxon' some Eukaryota

Since this term already has the constraint 'Viridiplantae', you will see that this expression is equivalent to 'CAM photosynthesis' itself. So there is no need to add an only in taxon constraint to 'Eukaryota'.

On the other hand, if someone requested that this be restricted to 'Embryophyta', the equivalent class query for 'CAM photosynthesis' and 'in taxon' some Embryophyta would have no results in the DL query panel. So adding that restriction to 'CAM photosynthesis' would be new information.

Note that your DL query must use 'in taxon', not 'only in taxon'.

Querying the classification for a 'never in taxon' inference uses exactly the same kind of query. However, instead of looking for equivalence to the term itself, you check whether the expression is equivalent to owl:Nothing. For a 'never in taxon' request for 'adiponectin binding' in Schizosaccharomyces, do this DL query:

 'adiponectin binding' and 'in taxon' some Schizosaccharomyces

Under 'Equivalent classes' you will see that the expression is equivalent to owl:Nothing. This means that the restriction is already in the ontology (it follows from an asserted 'never in taxon' Fungi constraint). If owl:Nothing were not listed as an equivalent class, adding the 'never in taxon' constraint would add new information.

How taxon constraints are represented in OWL

Review Status

Last reviewed: January 8, 2021
