Adding Taxon Restrictions
Revision as of 10:25, 24 November 2022
See Ontology_Editors_Daily_Workflow for creating branches and basic Protégé instructions.
Overview
- Taxon restrictions are managed outside of the main ontology file, in tab-separated (TSV) text files.
- There are two TSVs: one storing 'only in taxon' assertions and one storing 'never in taxon' assertions. They both have the same structure (see below).
- Each TSV file can be edited to either add or remove taxon constraints for a given GO term.
- Edited TSV files are then used by the build system to generate an OWL ontology containing taxon restriction axioms. This OWL ontology, imports/go_taxon_constraints.owl, is then imported into GO.
Editing taxon constraints
- Each TSV file has five columns. Each line contains a GO class ID and label, an NCBITaxon ID and label, and an optional, pipe-delimited list of source xrefs:
defined_class | defined_class_label | taxon | taxon_label | source |
---|---|---|---|---|
GO:0000131 | incipient cellular bud site | NCBITaxon:4895 | Schizosaccharomyces | |
GO:0000795 | synaptonemal complex | NCBITaxon:4896 | Schizosaccharomyces pombe | |
GO:0019819 | P1 peroxisome | NCBITaxon:4952 | Yarrowia lipolytica | PMID:10629216 |
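The row layout above can be sanity-checked before committing. The following is a minimal sketch, not part of the GO build system: the function name `check_rows` is hypothetical, and it assumes that the file starts with the header row shown above and that every row carries all five tab-separated fields (the last one possibly empty).

```python
import csv
import io

# Assumption: the TSV begins with this header row, as in the table above.
EXPECTED_HEADER = ["defined_class", "defined_class_label", "taxon", "taxon_label", "source"]


def check_rows(tsv_text):
    """Return a list of (line_number, message) problems found in the TSV text."""
    problems = []
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE)
    for line_number, row in enumerate(reader, start=1):
        if line_number == 1:
            if row != EXPECTED_HEADER:
                problems.append((line_number, "unexpected header"))
            continue
        # Rows typed with spaces instead of tabs, and blank lines, fail this check.
        if len(row) != 5:
            problems.append((line_number, f"expected 5 columns, found {len(row)}"))
            continue
        go_id, _go_label, taxon_id, _taxon_label, _source = row
        if not go_id.startswith("GO:"):
            problems.append((line_number, f"not a GO ID: {go_id!r}"))
        if not taxon_id.startswith("NCBITaxon:"):
            problems.append((line_number, f"not an NCBITaxon ID: {taxon_id!r}"))
    return problems
```

Calling `check_rows` on the contents of a TSV returns an empty list when every row has the expected shape; any entry in the list points at a line worth inspecting by hand.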
Adding a taxon constraint
- Add a line to either src/taxon_constraints/only_in_taxon.tsv or src/taxon_constraints/never_in_taxon.tsv.
- You must use a tab character to separate the columns; this is best done in a plain text editor on a new branch locally. Make sure you do not add an extra line at the end of the file.
- To insert tabs correctly between columns when editing the file in GH, change the dropdown at the top right of the editing window from spaces to tabs.
- You can do this work on your local branch, or edit the file directly in GH and create a new branch upon commit.
- Once you've created a new branch, you'll be taken to a pull request page in GH.
- At this point, return to your terminal window and proceed as below.
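For editors who prefer to script the addition rather than type tabs by hand, the step can be sketched as below. This is an illustration only: `append_constraint` is a hypothetical helper, and it assumes that the file should end with exactly one newline and no blank line after the last row.

```python
def append_constraint(tsv_text, go_id, go_label, taxon_id, taxon_label, sources=()):
    """Return tsv_text with one new tab-separated row appended.

    `sources` is an optional sequence of xrefs, joined with '|' as in the
    fifth column described above.
    """
    row = "\t".join([go_id, go_label, taxon_id, taxon_label, "|".join(sources)])
    # Strip any trailing newlines so that no blank line precedes or follows the new row.
    body = tsv_text.rstrip("\n")
    return body + "\n" + row + "\n"


# Example using the P1 peroxisome row from the table above.
header = "defined_class\tdefined_class_label\ttaxon\ttaxon_label\tsource\n"
updated = append_constraint(
    header,
    "GO:0019819",
    "P1 peroxisome",
    "NCBITaxon:4952",
    "Yarrowia lipolytica",
    ["PMID:10629216"],
)
```

The function works on text rather than on a file path so that the result can be inspected before anything is written to disk.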
Removing a taxon constraint
- To remove a constraint, delete its line and regenerate the go_taxon_constraints.owl file (see below).
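Deleting a line can likewise be sketched in a few lines of code. The helper below is hypothetical and assumes a constraint is identified by the GO ID in the first column together with the NCBITaxon ID in the third column.

```python
def remove_constraint(tsv_text, go_id, taxon_id):
    """Return tsv_text without the row(s) matching the given GO ID and taxon ID."""
    kept = []
    for line in tsv_text.splitlines():
        fields = line.split("\t")
        # Column 1 holds the GO class ID and column 3 the NCBITaxon ID.
        if len(fields) >= 3 and fields[0] == go_id and fields[2] == taxon_id:
            continue
        kept.append(line)
    return "\n".join(kept) + "\n"
```

All other rows, including the header, are passed through unchanged.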
Regenerating go_taxon_constraints.owl manually
- After editing one of the taxon constraint TSVs, you need to regenerate go_taxon_constraints.owl so that the new axioms appear in GO.
- If you originally edited the TSV file on GH, change to the ontology directory, do a git pull from master on your local machine, and then switch to the branch that has your requested taxon constraint.
- Note that at this point, you won't see the changes you made to the tsv file on GH if you type 'git diff'. This is because the edits on GH are essentially a commit and so there is no difference between what is on your local machine and what is in GH.
~/repos/go-ontology/src/ontology(master) $ git pull
remote: Enumerating objects: 5, done.
remote: Counting objects: 100% (5/5), done.
remote: Compressing objects: 100% (5/5), done.
remote: Total 5 (delta 0), reused 0 (delta 0), pack-reused 0
Unpacking objects: 100% (5/5), done.
f20f612..5a26f66 ukemi-patch-2 -> origin/ukemi-patch-2
~/repos/go-ontology/src/ontology(master) $ git checkout ukemi-patch-2
Branch ukemi-patch-2 set up to track remote branch ukemi-patch-2 from origin.
Switched to a new branch 'ukemi-patch-2'
- Start Docker
- Then, from within the ontology directory, run: ./run.sh make imports/go_taxon_constraints.owl
- The script will update the following files: src/ontology/imports/go_taxon_constraints.owl and/or src/taxon_constraints/only_in_taxon.ofn and/or src/taxon_constraints/never_in_taxon.ofn, depending on which file(s) were edited.
- You should see stanzas showing the input and output files for generating the new constraints. For example:
real 0m0.015s
user 0m0.000s
sys 0m0.000s
JAVA_OPTS=-Xmx8G dosdp-tools generate --infile=../taxon_constraints/only_in_taxon.tsv --obo-prefixes=true --template=../taxon_constraints/only_in_taxon.yaml --outfile=../taxon_constraints/only_in_taxon.ofn
real 0m6.932s
user 0m10.940s
sys 0m1.480s
ROBOT_JAVA_ARGS=-Xmx8G robot merge -i ../taxon_constraints/only_in_taxon.ofn -i ../taxon_constraints/never_in_taxon.ofn annotate -O http://purl.obolibrary.org/obo/go/imports/go_taxon_constraints.owl -V http://purl.obolibrary.org/obo/go/releases/`date +%Y-%m-%d`/imports/go_taxon_constraints.owl -o imports/go_taxon_constraints.owl
- Type git diff in the terminal window to see the changes made to the .owl and the corresponding .ofn files.
- Commit each file using either the full path name or by switching to the appropriate directory in which each file resides and then committing. For example:
src/ontology/imports/go_taxon_constraints.owl
src/taxon_constraints/only_in_taxon.ofn
See Ontology_Editors_Daily_Workflow for commit, push and merge instructions.
Checking existing taxon constraints
If a taxon constraint is requested to be added to the ontology, you can use the reasoner to check whether the term can already be inferred to have that constraint (either through constraints asserted in GO or via imported ontologies such as Uberon). In Protégé, turn on the reasoner (ELK) and go to the DL Query tab. For a requested 'only in taxon' constraint (e.g. Eukaryota) and GO term (e.g. 'CAM photosynthesis'), do a DL query for "Equivalent classes":
'CAM photosynthesis' and 'in taxon' some Eukaryota
Since this term already has the constraint 'Viridiplantae', you will see that this expression is equivalent to 'CAM photosynthesis' itself. So there is no need to add an only in taxon constraint to 'Eukaryota'.
On the other hand, if someone requested that this be restricted to 'Embryophyta', the equivalent class query for
'CAM photosynthesis' and 'in taxon' some Embryophyta
would have no results in the DL query panel. So adding that restriction to 'CAM photosynthesis' would be new information.
Note that your DL query must use 'in taxon', not 'only in taxon'.
Querying the classification for 'never in taxon' inferences uses the exact same kind of query. However, instead of looking for equivalence to the term, you check that the expression is equivalent to owl:Nothing. For a 'never in taxon' request for 'adiponectin binding' in Schizosaccharomyces, do this DL query:
'adiponectin binding' and 'in taxon' some Schizosaccharomyces
Under 'Equivalent classes' you will see that the expression is equivalent to owl:Nothing. This means that this restriction is already in the ontology (it comes from an asserted 'never in taxon' Fungi constraint). If no equivalent class were found, then adding the never in taxon constraint would add new information.
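The logic behind both checks can be illustrated with a toy model. The sketch below is not how ELK or Protégé works internally; it only mimics the redundancy reasoning on a hand-written, heavily simplified slice of the taxonomy (intermediate ranks are omitted), and the function names are hypothetical. It also ignores constraints inherited from parent GO terms or imported ontologies, which the real reasoner does take into account.

```python
# Toy child -> parent map; a simplified slice of the NCBI taxonomy for illustration.
PARENT = {
    "Embryophyta": "Viridiplantae",
    "Viridiplantae": "Eukaryota",
    "Schizosaccharomyces": "Fungi",
    "Fungi": "Eukaryota",
}


def ancestors_or_self(taxon):
    """Return the taxon followed by all of its ancestors in the toy map."""
    chain = [taxon]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def only_in_is_redundant(existing_only_in, requested):
    """A requested 'only in taxon' adds nothing if the existing constraint
    is already the requested taxon or one of its descendants."""
    return requested in ancestors_or_self(existing_only_in)


def never_in_is_redundant(existing_never_in, requested):
    """A requested 'never in taxon' adds nothing if the requested taxon is
    the already-excluded taxon or one of its descendants."""
    return existing_never_in in ancestors_or_self(requested)


# 'CAM photosynthesis' is only in Viridiplantae:
print(only_in_is_redundant("Viridiplantae", "Eukaryota"))    # broader taxon
print(only_in_is_redundant("Viridiplantae", "Embryophyta"))  # narrower taxon
# 'adiponectin binding' is never in Fungi:
print(never_in_is_redundant("Fungi", "Schizosaccharomyces"))
```

In this model a broader 'only in taxon' request is redundant while a narrower one is new information, and a 'never in taxon' request is redundant when it targets a descendant of an already-excluded taxon, which matches the DL query outcomes described above.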
How taxon constraints are represented in OWL
Review Status
Last reviewed: January 8, 2021