Ontology meeting 2015-04-23
Attendees: Paola, David OS, Harold, Heiko, Tanya, Chris
Minutes: Paola
Please, Please, Please could we propagate from EC to subclass axioms on public release
We still need to redundantly repeat everything we say in Equivalent Class axioms in subclassing axioms in order to get things to appear in GO releases. While editing in Protege, it is almost inevitable that these are sometimes missed. This is a waste of everyone's time. E.g. see https://sourceforge.net/p/geneontology/ontology-requests/11646/
If we need to keep some types of relationships out of the release file, can we please control this via a filtering mechanism at release? Relying on editors to police this via what goes in subclassing axioms is incompatible with sensible use of logic for autoclassification (sometimes subclassing axioms are needed for classification, but equivalent class axioms would be wrong).
Response from Chris and Heiko: We were surprised that this was the behavior, but it's not surprising that we were surprised, as our build pipeline has become over-complex. Heiko and I spent the day untangling things. The basic summary is:
- The assert-inferences operation is now run with every commit in build-go, but it does *NOT* write back to the editors file, only to downstream products (validated.obo and its derivatives such as go.obo).
- The http://build.berkeleybop.org/job/build-go-assert-inferences/ job and its report counterpart will continue to run as normal. This is the one that writes back to the editors file.
- One upshot of this is that users of GO may see inferences *in advance* of editors working in OE mode. I think this is fine; we actually have an extra level of checking now.
As can be seen, build-go is failing due to the isa/partof issue I mentioned previously. I will make a temporary fix for this and we can discuss the temporary fix in the meeting. Longer term we aim to simplify things. I would like to consolidate on a single consistent series of operations, where these are run in dry mode for every commit, and in release mode with every release.
So, done, see Chris’ notes above. Didn’t look at redundancies at this stage.
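The propagation requested in this item can be sketched roughly as follows. This is a minimal illustration, not the actual build-go code: the `EquivalentClass` type, the `subclass_axioms` function and the example term are all hypothetical, and a real pipeline would use a reasoner, which (as noted elsewhere in these minutes) often finds a better parent than the asserted genus.

```python
from typing import NamedTuple


class EquivalentClass(NamedTuple):
    """Hypothetical, simplified view of a genus-differentia EC axiom."""
    defined: str        # the class being defined
    genus: str          # the named class in the intersection
    differentia: tuple  # (relation, filler) pairs, each read as "relation some filler"


def subclass_axioms(ec, excluded_relations=frozenset()):
    """Return the subclass axioms directly entailed by one EC axiom.

    A = G and (R some F) entails A is_a G and A SubClassOf (R some F).
    Relations listed in excluded_relations are filtered out, which stands in
    for the release-time filtering mechanism requested above.
    """
    axioms = [(ec.defined, "is_a", ec.genus)]
    for relation, filler in ec.differentia:
        if relation not in excluded_relations:
            axioms.append((ec.defined, relation, filler))
    return axioms


# Illustrative term only; the genus and relation shown are assumptions.
example = EquivalentClass(
    defined="regulation of synaptic transmission",
    genus="biological regulation",
    differentia=(("regulates", "synaptic transmission"),),
)

for axiom in subclass_axioms(example):
    print(axiom)
```

Keeping the exclusion list as a parameter means editors state the logic once in the EC axiom and the decision about what reaches the release file is made in one place.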
ORE 2015
We should enter GO for this year's OWL reasoner evaluation competition. DOS: I'll enter go-plus.owl, but would also like to enter this combined with an annotation set expressed as OWL along with some queries across data. Can we get this ready for May 1st?
Yes. Chris: you may use gaf2owl. Test unexpanded file (wrt never in taxon). AI: Heiko and Chris will point DOS to the best files to use.
Any ideas for queries across data? Something involving e.g. annotation extensions with CL? AI: Chris will share files with DOS.
Regulation of homeostasis
Stemming from terms in the TG queue e.g. 'regulation of triglyceride homeostasis'.
'homeostatic process' is_a 'regulation of biological quality'. I was under the impression that we wouldn't want to create further regulation terms for homeostasis (reg of reg?). But I see that we have 'regulation of homeostatic process' with children and descendants. I'm sure we've discussed this before...? Should 'regulation of biological process' and 'regulation of biological quality' be disjoint??
(dph: the danger here is that regulating a process can regulate a quality. For example, 'modification of morphology or physiology of other organism' has an is_a child 'modulation of blood coagulation in other organism'. If you do the experiment in Protege, you get a lot of nothing.)
And looking at direct children of 'biological regulation', why are 'regulation of exosomal secretion' and 'regulation of synaptic transmission' sitting straight there, rather than under 'regulation of biological process'?
(dph- These are asserted based on the need to assert a genus when editing in OE. They probably should have been RoBP. If you remove them the reasoner finds much more sensible parents. This is related to the first topic today. The genus of the cross-product definition is not necessarily the best genus for a term, the reasoner finds that. DOS: Best way to find a reasonable parent to assert is to check what the reasoner finds. Then add and tag it as entailed. Will cease to be a problem if we can drop the pseudo is_a complete rule that ignores genus in equivalent class expressions)
Homeostasis here represents an ensemble of processes that we may want to disentangle at some point. What we mean by its ‘regulation’ has to do with levels of substances (in the bloodstream, in the case of this TG request). We’d like to express this ‘regulation of homeostasis’ better than we do currently. We may want to refer to a textbook case of homeostasis in the blood (e.g. sugar) and see how we could better represent the whole process.
DOS: I think this comes down to a question of what is the clearest way to represent and distinguish: 'regulation of X level' (insulin and glucagon regulate blood sugar levels) vs 'regulation of homeostatic X level' <- Regulation of the levels of X that homeostatic mechanisms aim for. The former can be part of a homeostatic mechanism. Is the best way to represent the latter as 'regulation of homeostasis'?
AI: Paola will leave terms in queue for now, discuss with submitter (Rachael), then get back to group.
AI: David OS will look into placement of 'regulation of exosomal secretion' and 'regulation of synaptic transmission'.
GitHub Latest (bumped to next week)
Discuss: https://docs.google.com/document/d/1iyVY8kDBJIEydoWFLG9j5BoO4VmdIXES95fVhIkFXfk/edit#
Chris went over this briefly, and some of us already commented on the doc, but we bumped discussion to next week when more people are on the call. DOS: Two potential issues we need to be careful of:
- File sizes are not yet a problem, but only because we still have an OBO master.
- We need to come up with a set of labels for priority, close status etc., as GH doesn't have independent fields for these.
Follow-up: Prefixing of obsolete to label of all obsoletes
Last week we discussed and resolved:
We should really do this for consistency. Or alternately remove from the release version since they are primarily there for the edit cycle.
Consensus is we should be consistent and just prepend all the obsolete terms with obsolete-. The tag will still remain.
Are we going to retrofit and do this (automatically) for existing obsolete terms too? (We should, but can't remember.) Currently, obsolete term names come in 4 flavors, we need consistency:
- term name (nearly all cases)
- term name prefixed with 'obsolete' e.g. 'obsolete ATP catabolic process' (27 examples as of today)
- term name prefixed with 'OBSOLETE' (only one example)
- term name prefixed with 'OBSOLETE:' (only one example)
For new (future) obsoletions: Protege will do this automatically, OE won’t. So, AI for all, a reminder to make sure that the prefix ‘obsolete’ is added.
AI: For pre-existing terms, Chris will automate this, and use the form ‘obsolete’.
AI: Heiko will look into making things consistent in TG.
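As a rough sketch of what the automated retrofit might look like, the function below maps the four label flavors listed above onto a single form. It assumes the target form is the lowercase word 'obsolete' followed by a space (as in 'obsolete ATP catabolic process') and that the caller only passes labels of terms already tagged obsolete. The function name and regular expression are illustrative, not taken from any existing GO tooling.

```python
import re

# Matches an existing 'obsolete' prefix in any case, optionally followed by
# ':' or '-' and whitespace (e.g. 'OBSOLETE: ', 'obsolete ', 'OBSOLETE ').
_OBSOLETE_PREFIX = re.compile(r"^obsolete\b[:\-\s]*", re.IGNORECASE)


def normalize_obsolete_label(label: str) -> str:
    """Return the label with exactly one lowercase 'obsolete ' prefix.

    Assumes the term is already known to be obsolete; this function does not
    check the obsolete tag itself.
    """
    bare = _OBSOLETE_PREFIX.sub("", label.strip(), count=1)
    return f"obsolete {bare}".strip()


for raw in (
    "ATP catabolic process",
    "obsolete ATP catabolic process",
    "OBSOLETE ATP catabolic process",
    "OBSOLETE: ATP catabolic process",
):
    print(normalize_obsolete_label(raw))
```

Because an existing prefix is stripped before the new one is added, running the function twice on the same label gives the same result, so it is safe to apply across the whole file.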
Axiomatizing Regulation of Biological Quality
Any objections to Chris' suggestion of going ahead and adding?
DOS: Mostly looks good. Not sure about location pattern though. occurs_in definition has domain process (although this is not formalised).
DPH: inheres_in?
Details on email thread with same title on ontology list:
David H and I have gone through the proposed logical defs for RoBQs
http://wiki.geneontology.org/index.php/Extensions/x-attribute
(the wiki is v out of date, but these extension pages can soon disappear once everything is part of the normal edit cycle)
These make use of OBA, which you can find out more about here:
https://github.com/obophenotype/bio-attribute-ontology/
including links to the OBA TG etc
I think they are now all valid (completeness can wait). I propose just going ahead and adding; raise any objections on the Thursday call.
(Transmembrane) transport templates in TG
See email thread "Transmembrane transport again..." started by Paola on April 10th. Please comment.
Generic pattern in ontology:
import/export
and ('has target start location' some A) and ('has target end location' some B) and (transports some C) and (results_in_transport_across some D)
Textual def or comment should mention what doesn't count as transmembrane.
AI: DOS to make design pattern doc + JIRA ticket for this. Point Heiko to both. [Update: here's the Jira ticket: https://www.ebi.ac.uk/panda/jira/browse/GO-348]
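To make the generic pattern concrete, here is a small sketch of how a template could fill the A, B, C and D slots and emit a Manchester-syntax-style class expression. The function names and the example fillers are hypothetical and chosen only for illustration; the relation names are the ones given in the pattern above.

```python
def quote(name: str) -> str:
    """Wrap a label in single quotes when it contains a space."""
    return f"'{name}'" if " " in name else name


def transport_expression(genus, start, end, cargo, membrane):
    """Fill the generic transmembrane transport pattern.

    genus    -> the import/export (or other transport) parent class
    start    -> A, the target start location
    end      -> B, the target end location
    cargo    -> C, the thing transported
    membrane -> D, the membrane crossed
    """
    parts = [
        quote(genus),
        f"({quote('has target start location')} some {quote(start)})",
        f"({quote('has target end location')} some {quote(end)})",
        f"(transports some {quote(cargo)})",
        f"(results_in_transport_across some {quote(membrane)})",
    ]
    return " and ".join(parts)


# Example fillers are assumptions, not terms agreed in the meeting.
print(transport_expression(
    genus="transport",
    start="extracellular region",
    end="cytosol",
    cargo="glucose",
    membrane="plasma membrane",
))
```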
New TG templates
Punted from two weeks ago:
The job of editors on SF and TG duties would benefit considerably if we could implement some TG functionalities we've already sort of agreed upon. We may want to revisit the related requests, prioritize, see what's missing.
1) TG: create MF-BP links when appropriate https://www.ebi.ac.uk/panda/jira/browse/GO-199
2) Create TermGenie templates with UBERON https://www.ebi.ac.uk/panda/jira/browse/GO-168 - follow-up
Last week we wrote:
May be done already. Chris and Heiko will look into this. Something is still missing. There was a problem with propagation over 'part_of' in the anatomy ontology and that extending into the GO ontology, development -- morphogenesis. Maybe not urgent at this time. Wait for requests to come in and then roll out the template. Code is mostly in place but not switched on. Need to deal with the part_of thing before rolling out.
A non-development request to add a part_of link that could have been inferred automatically: https://sourceforge.net/p/geneontology/ontology-requests/11635/
3) TG template for 'cellular component organization' https://www.ebi.ac.uk/panda/jira/browse/GO-327
4) TG template for 'cellular component binding' https://www.ebi.ac.uk/panda/jira/browse/GO-326
5) Create term genie template for response to organism https://www.ebi.ac.uk/panda/jira/browse/GO-212
GO-UBERON issues
Can we discuss Stan's hindgut issue and the issue of multi-species fuzziness in general?
https://github.com/obophenotype/uberon/issues/689
Potential solutions:
1. Split into ectodermal (arthro) vs endodermal
2. Redefine to ectodermal and re-annotate relatively small number of vertebrate annotations - but to where?
AI: punt decision to next week when we can get DH's input.
GO-SO issues
There are various problems with our use of SO, some of which require co-ordination with SO dev:
- We need a bridge from SO transcript terms -> ChEBI:RNA. In the absence of this, lots of inference is missing. Will the long planned SO molecular save us, or do we need our own bridge axioms?
- We use the SO terms mRNA, ncRNA and its children as if they refer to both mature and immature states of transcripts. In fact, according to SO they refer to the mature state. To align with SO properly we would need to review usage and use alternative SO terms where available. SO has an additional set of terms for primary transcripts, but no terms for immature. Primary transcript refers only to the state before splicing, so there are no terms for the intermediate state after splicing and before other modifications involved in maturation, such as capping and polyadenylation for mRNA. Need to discuss possibilities of adding these with SO.