Ontology meeting 2015-03-26

From GO Wiki
Revision as of 12:37, 26 March 2015 by David (talk | contribs) (→‎Follow-ups on annotation extensions)

Attendees: Paola, Harold, DavidH, Tanya, Heiko, Chris, PaulT, Judy

Minutes: Tanya


Follow-ups on annotation extensions

Some were discussed at the annotation call on Tuesday. David H to report.

For background, see here: http://wiki.geneontology.org/index.php/Ontology_meeting_2014-11-20#Follow-up:_Relationships_.28Annotation_extensions.29

 Good news: great consistency in the general annotations; some differences in the annotation extensions
 (BFA1 acts as an inhibitor in the absence of BUB2, acts as an activator in the presence of BUB2)
 Reinstate in_absence_of, in_presence_of?  This paper is a good example of the need for these two. Also going to try to model this in Noctua.

Bad inferences to catabolic process? (DavidH)

I 'fixed' all the NAD/glycolysis ones with a total hack. I made the NAD(P) and all of its children participants instead of inputs. So the rationale is that the only thing that can be input into a catabolic process is the thing being catabolized. However, in the long run I think this strategy will end up overloading the participants in a process. At the end of the day, I think we really need to create is_catabolized and is_synthesized as subtypes of input and output.

 Create redox processes? Pushes the problem one step up. Has_coenzyme_input, has_coenzyme_output. Doesn't work for ATP/ADP.
 Need more specific relations for chemical inputs/outputs. Chris will take care of finding the relations in RO (or elsewhere), adding them 
 and doing the search/replace of the problematic relations behind the scenes. DH will open a SF ticket for this task so that it won't 
 slip through the cracks. 
 

Protein Family binding term creation vs. on the fly binning (DavidH)

I have done an experiment looking at 'x family binding' annotations in MGI versus looking at the total number of annotations to 'protein binding' and its children using Panther families as a grouping. There are a significant number of annotations to 'protein binding' that have a value in the 'with' field that would allow them to be grouped in a family.

 selected two protein families
 neurexin family binding --> PANTHER10127 ---> MGI genes in this family
 hedgehog family binding --> PANTHER11889 ---> MGI genes in this family
 looked for these ids in the with column
 could deepen generic 'protein binding' annotations by looking at the with field, looking at the PTHR family they belong to, and then 
 assigning a more specific term
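
The deepening step described above can be sketched roughly as follows. This is a minimal illustration only, not the script used for the experiment: the gene IDs, the family membership sets, the record layout, and the `deepen` helper are all hypothetical; only the two family term names and PANTHER IDs come from the notes above.

```python
# Sketch of "deepening" generic 'protein binding' annotations via the 'with' field.
# All gene IDs and family memberships below are made up for illustration.
PROTEIN_BINDING = "protein binding"

# PANTHER family ID -> (family binding term, member gene IDs). Memberships are hypothetical.
FAMILIES = {
    "PANTHER10127": ("neurexin family binding", {"MGI:0000001", "MGI:0000002"}),
    "PANTHER11889": ("hedgehog family binding", {"MGI:0000003"}),
}


def deepen(annotation, families=FAMILIES):
    """Return a more specific annotation if the 'with' gene belongs to a known family.

    Annotations to any term other than 'protein binding', or whose 'with'
    value is not in a listed family, are returned unchanged.
    """
    if annotation["term"] != PROTEIN_BINDING:
        return annotation
    for family_id, (family_term, members) in families.items():
        if annotation["with"] in members:
            # Copy the record, swap in the family term, and record which family justified it.
            return {**annotation, "term": family_term, "inferred_from": family_id}
    return annotation


annotations = [
    {"gene": "MGI:0000010", "term": "protein binding", "with": "MGI:0000001"},
    {"gene": "MGI:0000011", "term": "protein binding", "with": "MGI:0000099"},
    {"gene": "MGI:0000012", "term": "kinase activity", "with": "MGI:0000003"},
]
deepened = [deepen(a) for a in annotations]
```

Because the mapping is computed from the 'with' field each time, the family terms would not need to exist in the ontology, which is the "on the fly" option.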

 next, did VLAD analysis with spiked (deepened) annotations vs. not spiked
 with Term Enrichment analysis, enrichment gets better (more significant) using the calculated, deepened annotations
 idea: generate the family binding terms on the fly
 Chris: Chris' awesome proposal for new Term Enrichment tool
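
To illustrate why deepened annotations can make enrichment more significant, here is a small sketch using a one-sided hypergeometric test, which is one common choice for term enrichment. Whether this matches the statistic VLAD computes is an assumption, and all counts are hypothetical, not taken from the MGI experiment.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k): chance of seeing at least k annotated genes in a study set.

    N: genes in the background, K: background genes annotated to the term,
    n: genes in the study set, k: study-set genes annotated to the term.
    """
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


# Hypothetical counts: background of 1000 genes, study set of 20 genes.
# Before deepening, few genes carry the family binding term directly.
p_before = hypergeom_pvalue(k=1, n=20, K=5, N=1000)
# After deepening, more 'protein binding' annotations count toward the family term.
p_after = hypergeom_pvalue(k=6, n=20, K=15, N=1000)
```

With these made-up counts, `p_after` is smaller than `p_before`, the same direction as the effect reported above.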

New Protege release

Please test and give feedback (or plan to do so by the next call if possible). For details, see Chris' email 'New Build of Protege 5.0.0 Beta available' dated March 11th.

 So far, biggest bug is that it crashes.  Please try and report the crash if it happens to you.

Follow-up: Acids and bases in GO

Copying from two weeks ago - any action item left for this? Documentation?

Stemming from https://sourceforge.net/p/geneontology/ontology-requests/11524/

The solution we've adopted so far is proving unsatisfactory specifically with respect to NH4+, NH3 and NH2- conjugate-acid-base relations. We need to think of an alternative, with or without the direct involvement of ChEBI.

Does NH2- (the azanide ion) exist in biology? Harold: I can't find evidence that it does; it is really unstable and
reacts quickly with water to form NH3 + OH-. ChEBI has azanides as a class, but these aren't like natural metabolites;
they are more like drugs, etc.
Option: make an exception to the GCI generation rule for azanide (equivalence is transitive, so we have to do this).
We should document this very well somewhere, for all the times we have to do this.

Update from Heiko (March 10th): "as discussed on the call, we have modified the generator for bio-chebi. It now supports exceptions for the generation of the GCI axioms, i.e. ammonia and it's conj base/acids. The current approach to define these exception is a new subset in the seed file for bio-chebi generation called no_conj_equiv. The bio-chebi.owl update file has been committed today. "

  Any actions? Yes, need to remove the asserted links.  Paola's SF item to do this.
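
For the documentation, the exception mechanism could be illustrated along these lines. This is only a sketch of the idea, not the bio-chebi generator itself: the function, the pair list, and the rule that a pair is skipped when either member is in the subset are assumptions. Only the subset name no_conj_equiv and the ammonia case come from Heiko's update.

```python
def conjugate_equivalences(pairs, no_conj_equiv):
    """Yield the (acid, base) pairs that should get an equivalence GCI.

    Any pair involving a class in the no_conj_equiv subset is skipped
    (assumed rule, for illustration only).
    """
    for acid, base in pairs:
        if acid in no_conj_equiv or base in no_conj_equiv:
            continue
        yield acid, base


# Hypothetical conjugate acid/base pairs, named by label for readability.
pairs = [
    ("ammonium", "ammonia"),
    ("ammonia", "azanide"),
    ("acetic acid", "acetate"),
]
# Without the exception, ammonium = ammonia and ammonia = azanide would, by
# transitivity, also make ammonium equivalent to azanide.
no_conj_equiv = {"ammonia"}

kept = list(conjugate_equivalences(pairs, no_conj_equiv))
```

Here only the acetic acid/acetate pair survives, so no equivalence chain runs through ammonia.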

results_in strategy

See emails from Chris (threads on go-ontology 'results_in outliers' and 'results_in refactor'). Chris asked for feedback on his proposal. Discuss, then reply to email thread.

results_in refactor: Looks OK, but has many very indirect cases.

(Note that, since generation of Chris' list above, we've approved a couple of terms created with the old reg_by_reg TG template)

 Have added at least three terms with this relation since last discussion but solution has not been finalized and implemented.
 Need to revisit the email thread. Punted to next week, Chris will try to remember.

long term hosting of ontology project and issue tracker

 Planning for SF to disappear at some point.  Makes sense to migrate to github, as it is gaining dominance and people are moving
 away from SF.  There is a way to manually back up the issues; make a rota to do this. All software-related issue tracking is done
 through github.
 Advantages: if version control system is also in github, then changes in the ontology can be auto-linked to the tracker
 Disadvantages: github doesn't handle big files well, so GAFs may be a problem. Its VCS is more complex than SVN.
 Possibility: Ontology in github and other files in svn.

Proposed new TG template: channel activity

See e.g. https://sourceforge.net/p/geneontology/ontology-requests/11604/

I (Paola) couldn't find it listed in Jira. Existing terms are already axiomatized.

 Next annotation call: changes in annotation of channel activity and child terms. When we make more specific terms, this will 
 be necessary.
 Paola to edit new term definition (SF request) to reflect specificity.