Ontology meeting 2014-11-20
Attendees: Paola, David H, David OS, Harold, Heiko, Tanya, Chris, Paul T
Rearranged order of tickets to reflect current work. We’ll repeat this alignment exercise in a couple of weeks (already added to agenda). Moved ‘Biochemical Pathways’ a bit further down - we’ll want to have Alan Bridge on a call when we work on that.
annotation extension relations file
How do we add new relations from RO? Is the failure of attempts to do so a problem with propagation scripts?
David OS: production of go-rel.owl is failing somehow. AI: Chris will take a look.
Update: Chris writes: (thread "fixed go-rel")
"Note that documentation is now included in a description field in the ontology header:
This ontology combines RO together with GO-specific relations in the GOREL namespace, used for annotation extensions. See:
Huntley, R. P., Harris, M. A., Alam-Faruque, Y., et al. (2014). A method for increasing expressivity of Gene Ontology annotations using a compositional approach. BMC Bioinformatics, 15(1), 155. doi:10.1186/1471-2105-15-155
Notes on release edition (gorel.owl):
This ontology is created by merging the relevant subset of RO together with GO-specific annotations of RO relations, and GOREL relations
Notes on editors version (gorel-edit.owl): The editors version imports ro.owl, and allows addition of new relations (GOREL ID space) and annotation of existing RO relations. Note for editors: if an RO relation is to be made visible in the final gorel product, then make the relation a SubProperty of 'go annotation extension relation'. Anything with a logical axiom is included in the final module. Consult Makefile for details"
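A minimal sketch of the visibility rule in the note for editors, for illustration only. The real selection is done by the Makefile, which is not reproduced here. The sketch assumes that indirect (transitive) sub-properties of 'go annotation extension relation' also count as visible, and the relation names `rel_a`, `rel_b` and `rel_c` are hypothetical.

```python
ROOT = "go annotation extension relation"


def visible_relations(sub_property_of, root=ROOT):
    """Return the relations that are direct or indirect sub-properties of root.

    sub_property_of maps each relation to a list of its direct super-properties.
    Counting indirect sub-properties is an assumption of this sketch.
    """
    def reaches_root(relation, seen):
        if relation in seen:  # guard against cycles in the input
            return False
        seen.add(relation)
        for parent in sub_property_of.get(relation, []):
            if parent == root or reaches_root(parent, seen):
                return True
        return False

    return sorted(r for r in sub_property_of if reaches_root(r, set()))


# Hypothetical relations: rel_a sits directly under the root, rel_b under
# rel_a, and rel_c under an unrelated property.
example = {
    "rel_a": [ROOT],
    "rel_b": ["rel_a"],
    "rel_c": ["some other relation"],
}
selected = visible_relations(example)
```

Here `selected` contains `rel_a` and `rel_b` but not `rel_c`. The second rule in the note (anything with a logical axiom is included) is not modelled.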
Jane/Paola to draw up proposal & update trello https://trello.com/c/CXcsUGs3/137-extracellular-vesicle-terms
SF ticket here: https://sourceforge.net/p/geneontology/ontology-requests/11330/
'tissue' and 'epithelium' in Dicty
Does Dicty have tissues? Maybe, sort of. See http://www.ncbi.nlm.nih.gov/pubmed/22930590. Whatever the ‘precise’ answer is, from a PAINT/LEGO perspective, the key question is - is it homologous? Paul T: for now, it makes sense to define it as Petra suggests. AI: Chris and David H will work on this.
Downloads page on GO website
Some time ago (before the Barcelona meeting), we (esp. Chris) mentioned that we may want to improve/increase documentation on this page - or is it ok as is now? http://geneontology.org/page/download-ontology
Related: Susan says that "FlyBase are finally switching from gene_ontology.obo to go-basic.obo. Would it be an idea to mention gene_ontology.obo in the legacy section in case folks are confused about this change. I haven't managed to find an official announcement about these name changes in gofriends emails though I could have missed it." and "Searching the new site with gene_ontology.obo gives no hits which isn't ideal."
Chris’ answer: yes it would make sense to mention that. AI for him.
What's in the cell and what's out
Chris to report - he set up a Google doc to discuss this some time ago. Is this resolved now?
(Chris, could you post the link to the Google doc here if needed, thanks.)
AI: Chris and David OS will discuss this when they see each other in a couple of weeks or so. We also started a discussion on extracellular matrix; we may need to revise its placement. See https://sourceforge.net/p/geneontology/ontology-requests/9076/
Propagation of intersections to relationships
Why isn't this part of the pipeline? Is it just because we have a tradition of manually saying everything twice (once a logical def and once in a relationship)?
See https://sourceforge.net/p/geneontology/ontology-requests/11311/ for example.
If we do automatically propagate, should it be only in the release version? Should it be limited to only some relations? (e.g. excluding GOREL relations?)
Heiko: difficulty is that currently we have checks that prevent generation of redundant relations. David OS: we may not want to clutter the editors file unnecessarily.
AI: Chris and Heiko will work on this.
Related: AI: Heiko will start a google doc about what we’d want the pipeline to look like for automated assertions.
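To make the question concrete, here is a rough sketch of what propagating a logical definition to asserted relationships could look like. It is not the existing pipeline. The dictionary layout, the `EX:` term IDs and the `GOREL:example` relation are hypothetical, and the redundancy check only skips statements that are already asserted verbatim, which is much weaker than the checks Heiko mentions.

```python
def propagate_intersections(term, excluded_relations=frozenset()):
    """Derive is_a parents and relationships implied by a logical definition.

    term is a dict with optional keys:
      "intersection_of": list of genus IDs (str) or (relation, filler) tuples
      "is_a":            list of already asserted parent IDs
      "relationship":    list of already asserted (relation, filler) tuples
    Returns (new_is_a, new_relationships), i.e. only statements not yet asserted.
    """
    asserted_is_a = term.get("is_a", [])
    asserted_rels = term.get("relationship", [])
    new_is_a, new_rels = [], []
    for entry in term.get("intersection_of", []):
        if isinstance(entry, str):
            # Genus: the term is a subclass of it.
            if entry not in asserted_is_a and entry not in new_is_a:
                new_is_a.append(entry)
        else:
            relation, _filler = entry
            if relation in excluded_relations:
                continue  # e.g. leave out GOREL relations
            if entry not in asserted_rels and entry not in new_rels:
                new_rels.append(entry)
    return new_is_a, new_rels


# Hypothetical term: the genus is already asserted, the differentia are not.
example_term = {
    "id": "EX:0000001",
    "is_a": ["EX:0000002"],
    "intersection_of": [
        "EX:0000002",
        ("part_of", "EX:0000003"),
        ("GOREL:example", "EX:0000004"),
    ],
    "relationship": [],
}
added_is_a, added_rels = propagate_intersections(
    example_term, excluded_relations={"GOREL:example"}
)
```

Running this only at release time would keep the editors file free of the derived statements, which is the concern David OS raises.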
Follow-up: Relationships (Annotation extensions)
Where are we with this: "We will have the has_input meeting next Monday [i.e. 3 days ago]. Once we are done, we need to go back to the annotators and report about which will be valid for use. Will the RO definitions be of use for annotators? We need a person from the annotation group and the ontology group to get together to make sure both are the same, David OS and Ruth."
Update, sent to whole GOC:
"The ontology group have been working through the list of AE relations as discussed at the GOC meeting. We'll now have follow-up meetings with some of the annotators and work together to produce some definitive documentation.
In the meantime though, there are a set of relations we are recommending be deprecated, we suggest you avoid using these:
X in_absence_of (Use instead inhibited_by or negatively_regulates or don't capture)
X in_presence_of (Use instead activated_by or positively_regulates or don't capture)"
Next meeting will be Dec 8.
AI: Paul T will link LEGO examples, but Noctua is down right now. AI: Heiko and Chris to look into this.
Ontology term for protein family / protein domain (family) (bumped from last week)
Tony S is redoing imports for QuickGO and trying to be as ontology compliant as possible in how things are typed for import (JIRA ticket here: https://www.ebi.ac.uk/panda/jira/browse/GO-322)
I've persuaded him that in most cases, he should be able to use an existing foundry ontology class. But there are still a few cases where we don't have an obvious term. In particular, what do Panther IDs and InterPro IDs refer to?
It is not particularly useful to have them refer to proteins and protein domains respectively - as Tony wants to distinguish from a protein in a specific species. If we say that Panther IDs refer to classes of protein, then we would type using an ID for protein. But perhaps it is better to treat protein family as some kind of information artefact - or in OWL terms as an individual with members rather than a class?
AI from last week: Judy, Chris and Cathy will talk about this and coordinate once Chris gets back.
AI: David OS, Chris, Heiko and Tony will have a separate call to look at this.
Links to documentation (bumped from last week)
[Copying from last week - should we follow up on decisions below now, or bump to 1-2 weeks in the future?]
Following old discussion and recent email thread, in search of a more robust and logic-oriented approach than the one currently adopted in QuickGO.
See email thread 'Rough notes for OG from PAINT call'. Summary:
"The goal is to have a procedure that runs as part of the release process that propagates the URLs (or any annotation property) down the hierarchy based on a structured query (OPPL, SPARQL etc)
DavidOS - do you think this is possible based on your axiomatization for membrane proteins?
Others - can we axiomatize the production terms to avoid dependence on regexs?"
DECISIONS:
- Propagation rules will go in an OWL file, using DL or SPARQL or some combination (rather than hacking something with xref_abbs).
- We should investigate using the OWL image annotation system developed at VFB for OWL formalisation.
- We *could* use super-relations in order to do 'unions' (e.g. IPR).
- Chris notes: this may also be an option for linking to LEGO exemplars for some terms.
- Propagation would then happen at build time, so the public would see the propagated URLs in dbxrefs.
AI: David OS will build a couple of tests using the membrane protein examples.
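As a rough illustration of build-time propagation, the sketch below pushes a set of URLs from each term to all of its descendants. It is only a stand-in: the decision above is to express the rules in OWL (DL or SPARQL), whereas this walks a plain is_a graph in Python. The term names and the `example.org` URLs are hypothetical.

```python
from collections import deque


def propagate_down(parents, asserted):
    """Propagate annotation values (e.g. URLs) down an is_a hierarchy.

    parents:  dict mapping a term to the list of its direct parents
    asserted: dict mapping a term to the set of values asserted on it
    Returns a dict mapping each term to its own and all inherited values.
    """
    # Invert the parent map so we can walk from a term to its children.
    children = {}
    for child, term_parents in parents.items():
        for parent in term_parents:
            children.setdefault(parent, []).append(child)

    result = {term: set(values) for term, values in asserted.items()}
    queue = deque(asserted)
    while queue:
        term = queue.popleft()
        for child in children.get(term, []):
            inherited = result.setdefault(child, set())
            size_before = len(inherited)
            inherited |= result[term]
            if len(inherited) > size_before:
                queue.append(child)  # the child gained values, revisit its children
    return result


# Hypothetical hierarchy: B and D are children of A, C is a child of B.
example_parents = {"B": ["A"], "C": ["B"], "D": ["A"]}
example_asserted = {
    "A": {"http://example.org/a"},
    "B": {"http://example.org/b"},
}
propagated = propagate_down(example_parents, example_asserted)
```

In this example C ends up with both URLs and D with only the one asserted on A. A rule-based version would replace "all is_a descendants" with whatever the structured query selects.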
Action items for ontology team from GOC meeting
(FYI, minutes are at https://docs.google.com/document/d/1NonH97s8xEpDdx6DfonKPKI_RdHbbG-yft85UZtUmF0/edit#)
There are several AI for Chris (sometimes along with others outside the editors group). Wrt AIs for the whole team, most have already been put down in Trello or SF. Remaining ones afaics:
Action: Heiko: TG form Change ‘add template’ to ‘Use template’ - DONE
Action: Ontology group look into creating intermediate templates for complexes
- GO annotations for IntAct complexes:
Action: Harold: GO, IntAct, and PRO need to coordinate better so that they are aware of the new and planned developments on each project, to ensure there is no duplicated effort.
- Readability of term labels:
Chris’s 3 proposals based on David’s slides:
- If TG detects the term is overly composed (automatically detected based on a profile of ‘overly composed’), the requester is forced to enter an example of usage in the comment (i.e. a free-text description of their annotation).
Pascale: The automated check, based on term string length (100 characters?) and other characteristics, would trigger a mandatory comment. In practice we recommend ALL new GO terms should have at least ONE example of a correct annotation (PubMed ID + figure included). That would help understand what the term means, and perhaps help people annotate to an already existing term.
- A sub-editor can tag a term as being overly composed; such a term will be auto-unfolded and shown in a decomposed fashion in AmiGO.
- If a term is auto-detected as being overly composed, the curator is forced to enter a Noctua model ID before the term is generated.
Guidelines: If there is a commonly used name then that is the first choice for a term/class label. The labels of the parental terms may be helpful, but there is no requirement to reiterate the parental labels in the labels of the child classes.
Complicated terms (e.g. positive regulation of X involved in Y during Z) - editors should tell curators to annotate to individual terms and handle it later. Add a new tag called LEGO. Curator should make a LEGO model in Noctua.
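One possible shape for the automated 'overly composed' check, as a sketch only. The 100-character threshold is the tentative figure from Pascale's comment. The list of connective phrases and the limit of two connectives are assumptions standing in for the unspecified "other characteristics", and `validate_request` is a hypothetical helper, not part of TG.

```python
import re

# Assumed markers of composition, taken from the example
# "positive regulation of X involved in Y during Z".
CONNECTIVES = ("regulation of", "involved in", "during")


def count_connectives(label):
    """Count occurrences of the connective phrases in a term label."""
    lowered = label.lower()
    return sum(
        len(re.findall(r"\b" + re.escape(phrase) + r"\b", lowered))
        for phrase in CONNECTIVES
    )


def is_overly_composed(label, max_length=100, max_connectives=2):
    """Flag a label that is too long or chains too many connectives."""
    return len(label) > max_length or count_connectives(label) > max_connectives


def validate_request(label, comment):
    """Return an error message if a usage example is required but missing."""
    if is_overly_composed(label) and not comment.strip():
        return "Overly composed term: please give an example of usage in the comment."
    return None


flagged = is_overly_composed("positive regulation of X involved in Y during Z")
error = validate_request("positive regulation of X involved in Y during Z", "")
```

With these assumed settings the example label is flagged because it chains three connectives, so a request without a comment would be rejected.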
Could people please read the above and make a note of any AI for themselves. Thanks!