Ontology meeting 2013-10-17
Attendees: Paola, Jane, Becky, Chris, David H, David OS, Harold, Tanya, Judy
Minutes: Paola
Technical Infrastructure Updates
- oboformat parser being directly added to owlapi - means can load obo directly from P4 in future
owlapi will save in obo format as well. multiple formats are not applicable yet, we're still limited by OE, but there may be workarounds.
- x-taxon-importer refactored to use ncbitaxon_import - makes go-plus more tractable. What do we do about docs? - http://beta.geneontology.org/page/download-ontology
Shall we write more about go-plus for documentation for users, to put in the new GO site? Not many external users know about taxon stuff, but we may still want to say something.
- new beta of lego plug with MinimalModelGeneration feature - temporary link for beta - http://geneontology.org//experimental/lego/tmp/
People are welcome to try it, David H says it's really cool.
Long term plan for GOCHE terms
What's the plan for the GOCHE terms that are xps between ChEBI chemicals and ChEBI roles?
ChEBI will always have a separate role hierarchy, but we can't say that the input is_a role... We don't always want to infer that the structure always has that role. Do we keep those externally in our ontology? Shall we ask ChEBI - would they be keen on having cross-products to GO terms? Jane says that Janna might be interested.
Judy mentions the 'exposure' project/people - might be worth looking into how/what they've done?
Chris suggests to come back to this while having a Protege session.
Regulation of timing
As per email thread started by Chris.
Regulation of timing is tricky - here's an example:
    [Term]
    id: GO:0002085
    name: inhibition of neuroepithelial cell differentiation
    namespace: biological_process
    def: "Any process that prevents the activation of neuroepithelial cell differentiation. Neuroepithelial cell differentiation is the process in which epiblast cells acquire specialized features of neuroepithelial cells." [GOC:dph, PMID:16678814]
    synonym: "negative regulation of neural plate formation" NARROW [GOC:dph, GOC:tb]
    synonym: "repression of premature neural plate formation" NARROW [GOC:dph, GOC:tairtb]
    is_a: GO:0022603 ! regulation of anatomical structure morphogenesis
    is_a: GO:0030857 {is_inferred="true"} ! negative regulation of epithelial cell differentiation
    is_a: GO:0045995 ! regulation of embryonic development
    ***is_a: GO:0048505 ! regulation of timing of cell differentiation
    intersection_of: GO:0065007 ! biological regulation
    intersection_of: negatively_regulates GO:0060563 ! neuroepithelial cell differentiation
    relationship: negatively_regulates GO:0060563 ! neuroepithelial cell differentiation
Here is the hierarchy:
    is_a GO:0050793 ! regulation of developmental process
      is_a GO:0040034 ! regulation of development, heterochronic ***
        is_a GO:0045961 ! negative regulation of development, heterochronic
        is_a GO:0045962 ! positive regulation of development, heterochronic
        is_a GO:0048504 ! regulation of timing of organ formation
        is_a GO:0048505 ! regulation of timing of cell differentiation
        is_a GO:0048506 ! regulation of timing of meristematic phase transition
        is_a GO:0090444 ! regulation of nematode larval development, heterochronic
How should these be defined? At first it sounds like qualities but this doesn't seem to be the case (not in the RoBQ branch).
    GO:0040034 ! regulation of development, heterochronic ***
    [DEF: "Any process that modulates the consistent predetermined time point at which an integrated living unit or organism progresses from an initial condition to a later condition and the rate at which this time point is reached."]
Seems like this is regulating the process that precedes the development of the structure?
The child terms follow a different textual definition pattern:
    GO:0048505 ! regulation of timing of cell differentiation ***
    [DEF: "The process controlling the activation and/or rate at which relatively unspecialized cells acquire specialized features. Any process that modulates the rate, frequency or extent of the XXX at a consistent predetermined time point during its development."]
(the XXX is left over from some scripting in the distant past?)
It would be good to have some examples here, perhaps with a pathway diagram.
The biology underlying these terms is that the differentiation occurs as it should, just at the wrong time.
David OS says there are related processes in invertebrates, but they wouldn't be described by the same terms. We could keep a generic term and then add specific children as needed (might be useful for Drosophila).
The regulation is not always done on the timing... So how do we define timing issues?
We need to tease out the various bits contributing to the timing.
David H suggests to create a new term for 'neg. reg. of neuroepithelial cell differentiation'.
EXISTS, TAGGED-INFERRED, NOT-ENTAILED
We need a strategy to deal/work through these. See Jane's comment in email thread "build-go-assert-inferences - Build # 51 - Successful!":
"are the EXISTS, TAGGED-INFERRED, NOT-ENTAILED now being deleted automatically? I think they should be. If we want to check the implications of edits looking through these reports a week later probably isn't a great way to do it - we should use Protege or perhaps we could have a per-commit report in Jenkins?"
Chris opposes taking them out altogether because we risk eliminating 'good stuff' as well. Jane points out that for some of the cell cycle terms, removing those links might actually cause loss of the only is_a parent until we fix the phases terms.
Bottom line is: we should always have the rule that if we remove an intersection, we should first check all the implications.
Shall we have a check in Jenkins before it removes the links? E.g. to give an error/warning if a term would be left without an is_a parent? But that's just one category of possible negative implications.
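A minimal sketch of the kind of pre-removal check raised here, assuming the links due for removal are available as (child, parent) pairs. The function names, the sample stanzas and the X: ids are hypothetical; this is not the actual Jenkins job, and it only covers the "no is_a parent left" case, not the other possible negative implications.

```python
def parse_is_a(obo_text):
    """Map each [Term] id to the list of its asserted is_a parent ids."""
    parents = {}
    current = None
    in_term = False
    for raw in obo_text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # A new stanza starts; only [Term] stanzas are of interest.
            in_term = line == "[Term]"
            current = None
        elif in_term and line.startswith("id:"):
            current = line[len("id:"):].strip()
            parents.setdefault(current, [])
        elif in_term and current and line.startswith("is_a:"):
            # First token after the tag is the parent id; trailing
            # qualifiers ({...}) and comments (! ...) are ignored.
            tokens = line[len("is_a:"):].split()
            if tokens:
                parents[current].append(tokens[0])
    return parents


def orphaned_by_removal(parents, removals):
    """Ids of terms that have is_a parents now but none after the removals."""
    orphans = []
    for term, current_parents in parents.items():
        remaining = [p for p in current_parents if (term, p) not in removals]
        if current_parents and not remaining:
            orphans.append(term)
    return sorted(orphans)


# Hypothetical sample data, for illustration only.
SAMPLE = """
[Term]
id: X:1
is_a: X:2 {is_inferred="true"} ! b
is_a: X:3 ! c

[Term]
id: X:4
is_a: X:2 {is_inferred="true"} ! b

[Typedef]
id: part_of
"""
REMOVALS = {("X:1", "X:2"), ("X:4", "X:2")}

for term_id in orphaned_by_removal(parse_is_a(SAMPLE), REMOVALS):
    print("WARNING: %s would be left without an is_a parent" % term_id)
```

In this sample X:1 keeps its second parent, while X:4 is flagged because its only is_a link is among the removals.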
DavidOS: in the future, we need to rely more on the reasoner.
Decision: Classifications tagged as inferred that are no longer entailed are removed automatically, but a report is sent out listing which are being deleted. This happens at each commit (?). It is the responsibility of the person doing the commit to check that the removed classifications are expected. Note: removing at each user commit may be technically tricky in that it will require running scripts to generate a new file and then committing this.
Creating a megafile
We should separate the issue of what is needed for easy editing from what is needed by various end users. User facing files can be generated using filters if we need to remove some relations.
Decision: As a first step towards one editors' file, take the cross products in the external x-product file that refer only to GO terms (for example, 'capable_of') and move them into the main file. This needs agreement and announcement of new relations in the release file, OR the addition of filters to remove any relationship types we don't want users to see yet.
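A rough sketch of what such a filter could look like for an obo file, assuming we only need to drop plain relationship: lines for a given set of relation ids. The function name and the sample stanza are hypothetical; intersection_of lines are deliberately left alone, since removing one half of a logical definition would need separate handling.

```python
def filter_relationships(obo_text, hidden):
    """Return obo_text without relationship: lines whose relation is in hidden."""
    kept = []
    for raw in obo_text.splitlines():
        parts = raw.strip().split()
        # A relationship line looks like: relationship: <relation> <target> ! comment
        if len(parts) >= 2 and parts[0] == "relationship:" and parts[1] in hidden:
            continue
        kept.append(raw)
    return "\n".join(kept)


# Hypothetical stanza, for illustration only.
SAMPLE = """[Term]
id: X:1
is_a: X:5 ! e
relationship: capable_of X:2 ! b
relationship: part_of X:3 ! c
"""

print(filter_relationships(SAMPLE, {"capable_of"}))
```

Keeping the filter as a post-processing step on the editors' file means the editors still see every relation while the release file only carries the announced ones.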
Moving relationships to external classes into the main editors' file requires imports, otherwise the editors' file has massive numbers of dangling relations - causing serious problems for display.
Chris suggests we should be able to handle this using a filtered save:
I think there's a better way to do this where we use imports. I have set up a test in the editors directory. For now do this:
    cd editors
    ln -s .. go
Then open go_write_test.obo. Make sure you have save filters set up as in the screenshot. This will only save GO terms. A few more tweaks required. Major problem is losing imports and multiple ontology tags.
David OS to experiment & report back to Chris.
single v/s multi-organism processes
We could do with xps for these. Can we say in OWL biological process has_participant exactly 1? Or has_participant >1 ?
(This was discussed - not recorded - on a previous call, but no final resolution had been made, and it was agreed to come back to it.)
- we will use PATO, I believe. See googledoc
Follow-up from last week: TG TEMPLATE FOR ORGANELLE PART (LUMEN, MEMBRANE)
Here's the discussion from last week:
"We've scheduled to have this done by the end of Q4/13.
The Jira ticket is https://www.ebi.ac.uk/panda/jira/browse/GO-185
Looking at Chris' comments in its sub-task "Verify that all xps for existing organelle part (lumen, membrane) terms are in place" (https://www.ebi.ac.uk/panda/jira/browse/GO-186), what do we need to do here?
Need to be sure that all inferences work using part_of"
Where are we with this?
Deferred to next meeting. In the meantime, Chris and DavidOS need to discuss potential patterns for this and present them at the next meeting.