Ontology meeting 2015-03-26
Attendees: Paola, Harold, DavidH, Tanya, Heiko, Chris, PaulT, Judy
- 1 Follow-ups on annotation extensions
- 2 Bad inferences to catabolic process? (DavidH)
- 3 Protein Family binding term creation vs. on the fly binning (DavidH)
- 4 New Protege release
- 5 Follow-up: Acids and bases in GO
- 6 results_in strategy
- 7 long term hosting of ontology project and issue tracker
- 8 Proposed new TG template: channel activity
Follow-ups on annotation extensions
Some were discussed at the annotation call on Tuesday. David H to report.
Good news: great consistency in the general annotations, with some differences in the annotation extensions (BFA1 acts as an inhibitor in the absence of BUB2 and as an activator in the presence of BUB2). Reinstate in_absence_of, in_presence_of? This paper is a good example of the need for these two. Also going to try to model this in Noctua.
Bad inferences to catabolic process? (DavidH)
I 'fixed' all the NAD/glycolysis ones with a total hack. I made the NAD(P) and all of its children participants instead of inputs. So the rationale is that the only thing that can be input into a catabolic process is the thing being catabolized. However, in the long run I think this strategy will end up overloading the participants in a process. At the end of the day, I think we really need to create is_catabolized and is_synthesized as subtypes of input and output.
Create redox processes? Pushes the problem one step up. Has_coenzyme_input, has_coenzyme_output. Doesn't work for ATP/ADP. Need more specific relations for chemical inputs/outputs. Chris will take care of finding the relations in RO (or elsewhere), adding them and doing the search/replace of the problematic relations behind the scenes. DH will open a SF ticket for this task so that it won't slip through the cracks.
Protein Family binding term creation vs. on the fly binning (DavidH)
I have done an experiment looking at 'x family binding' annotations in MGI versus looking at the total number of annotations to 'protein binding' and its children using Panther families as a grouping. There are a significant number of annotations to 'protein binding' that have a value in the 'with' field that would allow them to be grouped in a family.
Selected two protein families:
- neurexin family binding --> PANTHER10127 --> MGI genes in this family
- hedgehog family binding --> PANTHER11889 --> MGI genes in this family
Looked for these ids in the 'with' column.
Could deepen generic 'protein binding' annotations by looking at the 'with' field, looking at the PTHR family they belong to, and then assigning a more specific term.
Next, did VLAD analysis with spiked (deepened) annotations vs. not spiked.
With Term Enrichment analysis, enrichment gets better (more significant) using the calculated, deepened annotations.
Idea: generate the family binding terms on the fly.
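The deepening step described in these notes could be sketched roughly as follows. This is an illustration only: the record layout, the gene-to-family table, the gene ids and the function name `deepen` are hypothetical placeholders, not the actual MGI data or the script used in the experiment. Only the two PANTHER family ids and the term names come from the notes.

```python
# Hypothetical sketch of deepening 'protein binding' annotations via the 'with' field.
PROTEIN_BINDING = "protein binding"

# PANTHER family -> family binding term (the two families picked in the experiment).
FAMILY_TERM = {
    "PANTHER10127": "neurexin family binding",
    "PANTHER11889": "hedgehog family binding",
}

# Placeholder mapping from gene ids (as found in the 'with' field) to PANTHER families.
GENE_FAMILY = {
    "GENE:A": "PANTHER10127",
    "GENE:B": "PANTHER11889",
}


def deepen(annotations, gene_family=GENE_FAMILY, family_term=FAMILY_TERM):
    """Return new annotation records, replacing generic 'protein binding'
    with a family binding term when a 'with' partner belongs to a known family."""
    result = []
    for ann in annotations:
        term = ann["term"]
        if term == PROTEIN_BINDING:
            for partner in ann.get("with", []):
                family = gene_family.get(partner)
                if family in family_term:
                    term = family_term[family]
                    break  # first matching partner decides the deeper term
        result.append({**ann, "term": term})
    return result
```

Annotations to other terms, and 'protein binding' annotations whose 'with' partner is in no listed family, pass through unchanged, so the deepened set can be compared against the original in an enrichment tool.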
New Protege release
Please test and give feedback (or plan to do so by the next call if possible). For details, see Chris' email 'New Build of Protege 5.0.0 Beta available' dated March 11th.
So far, biggest bug is that it crashes. Please try and report the crash if it happens to you.
Follow-up: Acids and bases in GO
Copying from two weeks ago - any action item left for this? Documentation?
The solution we've adopted so far is proving unsatisfactory specifically with respect to NH4+, NH3 and NH2- conjugate-acid-base relations. We need to think of an alternative, with or without the direct involvement of ChEBI.
Does NH2- (azanide ion) exist in biology? Harold: I can't find it; it is really unstable and reacts quickly with water to form NH3 + OH-. Azanides as a class: ChEBI has them, but these aren't like natural metabolites; more like drugs, etc. Option: an exception to the GCI generation rule for azanide (equivalence is transitive, so we have to do this). We should document this somewhere very well for all the times we have to do this.
Update from Heiko (March 10th): "as discussed on the call, we have modified the generator for bio-chebi. It now supports exceptions for the generation of the GCI axioms, i.e. ammonia and it's conj base/acids. The current approach to define these exception is a new subset in the seed file for bio-chebi generation called no_conj_equiv. The bio-chebi.owl update file has been committed today. "
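To illustrate why the exception is needed, here is a schematic sketch and not the actual bio-chebi generator, which works on OWL and a seed file. The pair list, the subset membership and both function names are assumptions made for this example; only the subset name no_conj_equiv comes from the email.

```python
# Schematic only: class labels stand in for ChEBI classes.
# Classes excluded from conjugate equivalence (illustrative membership,
# named after the no_conj_equiv subset mentioned in the email).
NO_CONJ_EQUIV = {"ammonia", "ammonium", "azanide"}

# Illustrative (acid, conjugate base) pairs.
CONJUGATE_PAIRS = [
    ("ammonium", "ammonia"),
    ("ammonia", "azanide"),
    ("acetic acid", "acetate"),
]


def equivalence_pairs(pairs, exceptions):
    """Keep only pairs where neither member is listed as an exception."""
    return [
        (acid, base)
        for acid, base in pairs
        if acid not in exceptions and base not in exceptions
    ]


def merged_groups(pairs):
    """Group classes that end up equivalent once pairs are chained (transitivity)."""
    groups = []
    for acid, base in pairs:
        touching = [g for g in groups if acid in g or base in g]
        merged = {acid, base}.union(*touching)
        groups = [g for g in groups if g not in touching] + [merged]
    return groups
```

Without the exception, `merged_groups(CONJUGATE_PAIRS)` puts ammonium, ammonia and azanide in one group, which is the unwanted transitive collapse; filtering with `equivalence_pairs` first leaves only the acetic acid/acetate group.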
Any actions? Yes, need to remove the asserted links. Paola's SF item to do this.
results_in strategy
See emails from Chris (threads on go-ontology 'results_in outliers' and 'results_in refactor'). Chris asked for feedback on his proposal. Discuss, then reply to email thread.
results_in refactor: Looks OK, but has many very indirect cases.
(Note that, since generation of Chris' list above, we've approved a couple of terms created with the old reg_by_reg TG template)
Have added at least three terms with this relation since last discussion but solution has not been finalized and implemented. Need to revisit the email thread. Punted to next week, Chris will try to remember.
long term hosting of ontology project and issue tracker
- Should we migrate our ticket archive? https://trello.com/c/nf1HDHwv/149-evaluate-feasibility-of-migrating-tickets-from-sourceforge
- If we migrate to GitHub, should we migrate files too, so that files and tracker are co-hosted and can refer to each other?
- here is an example of how this works: https://github.com/obophenotype/uberon/issues/585
Planning for SF to disappear at some point. Makes sense to migrate to GitHub, as it is gaining dominance and people are moving away from SF. There is a way to manually back up the issues; make a rota to do this. All software-related issue tracking is done through GitHub.
Advantages: if the version control system is also in GitHub, then changes in the ontology can be auto-linked to the tracker.
Disadvantages: doesn't handle big files well, so GAFs may be a problem; the VCS (git) is more complex than SVN.
Possibility: ontology in GitHub and other files in SVN.
Proposed new TG template: channel activity
I (Paola) couldn't find it listed in Jira. Existing terms are already axiomatized.
Next annotation call: changes in annotation of channel activity and child terms. When we make more specific terms, this will be necessary. Paola to edit new term definition (SF request) to reflect specificity.