Ontology meeting 2012-05-02

Latest revision as of 14:38, 1 July 2014

MINUTES: PAOLA

30-minute call

Attending: Paola, Becky, Jane, David, Tanya, Chris, Heiko, Karen, Harold


DISCUSSION ITEM I: Follow up on chemical xps

See previous discussion here: http://wiki.geneontology.org/index.php/Ontology_meeting_2012-04-11

This was followed up by an email thread on the go-editors list. Could we please discuss Jane's questions in the latest message in the thread:

"I think we should definitely do that [i.e. in response to Chris' suggestion "just switch TG to point to UCHEBI and request further changes in CHEBI directly"]. So then new chemicals would only need to go into CHEBI, and we would add xps by hand to x-chemical-edit.obo?

And then we wait to see if Heiko can figure a way to write the xps from TG1.

  • Update from Heiko and Chris: TG is almost ready to write to the file directly

The downside, I guess, is that we can't leverage the CHEBI structure to place new terms by using TG0 and cut/pasting?"

We keep having frequent requests for chemical terms in SF, so it would be helpful to know how to proceed as effectively as possible in the meantime. But if it's just a matter of waiting less than a month, we can create the terms ourselves. We (EBI Eds.) assume that we do not need to add to the goche file, but we still need to add to go_xp_chebi until TG is ready, is that correct?

Jane asks: There may be some xps missing from go_xp_chebi for new terms where we forgot to add them. Is there a way we can get a list of all terms that look like they should be in go_xp_chebi but aren't?

Chris (by email): You can either load x-chemical-importer in Protege and look for all the classes in the metabolic branch of GO that don't have a "=" in their circle.

Or you can load go + go_xp_chebi + chebi in OE and look for the non-bold terms. (Tanya: for the chebi file, you can use http://purl.obolibrary.org/obo/chebi.obo)

In OE you can also do string queries like "matches 'metabolic'" and look for the non-bold in the result, or do a boolean query for string matches and no logical definition.
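Outside OE, the same check (label matches a string, no logical definition) could be approximated with a small script. The following is only a minimal sketch, not an existing GO tool: the function names are made up for illustration, and it assumes that stanzas are separated by blank lines and that a logical definition shows up as one or more intersection_of lines, either in the main ontology file or in a bridge file such as go_xp_chebi.obo.

```python
import re


def parse_terms(obo_text):
    """Yield (term_id, name, has_logical_definition) for each [Term] stanza.

    Assumes stanzas are separated by blank lines (hypothetical helper).
    """
    for block in re.split(r"\n\s*\n", obo_text):
        lines = [ln.strip() for ln in block.strip().splitlines()]
        if not lines or lines[0] != "[Term]":
            continue
        term_id = name = None
        has_xp = False
        for ln in lines[1:]:
            if ln.startswith("id: "):
                # Drop any trailing " ! label" comment after the ID.
                term_id = ln[4:].split(" ! ")[0].strip()
            elif ln.startswith("name: "):
                name = ln[6:].strip()
            elif ln.startswith("intersection_of: "):
                has_xp = True
        if term_id:
            yield term_id, name, has_xp


def terms_missing_xp(obo_texts, pattern):
    """List (id, name) of terms whose name matches `pattern` and that have
    no intersection_of line in any of the given OBO texts."""
    names = {}
    defined = set()
    for text in obo_texts:
        for term_id, name, has_xp in parse_terms(text):
            if name is not None:
                names.setdefault(term_id, name)
            if has_xp:
                defined.add(term_id)
    rx = re.compile(pattern)
    return sorted(
        (tid, nm) for tid, nm in names.items()
        if rx.search(nm) and tid not in defined
    )
```

For example, passing the contents of the GO file and of go_xp_chebi.obo together with the pattern "metabolic" would list the metabolic terms that have no xp in either file, which is roughly the set of non-bold terms in the OE search result.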

  Problem:
  Load Error, line 10881 40 unrecognized parent terms:      
    line 14151: GOCHE:0000191 of file:/Users/tanyaberardini/go_svn/ontology/editors/go_xp_chebi.obo 
    line 4906: CHEBI:707696 of file:/Users/tanyaberardini/go_svn/ontology/editors/go_xp_chebi.obo 
    line 5611: CHEBI:23008 of file:/Users/tanyaberardini/go_svn/ontology/editors/go_xp_chebi.obo 
    line 5961: CHEBI:126971 of file:/Users/tanyaberardini/go_svn/ontology/editors/go_xp_chebi.obo 
    line 2921: CHEBI:23008 of file:/Users/tanyaberardini/go_svn/ontology/editors/go_xp_chebi.obo 
    line 3761: CHEBI:4431 of file:/Users/tanyaberardini/go_svn/ontology/editors/go_xp_chebi.obo 
    line 10601: GOCHE:0000148 of file:/Users/tanyaberardini/go_svn/ontology/editors/go_xp_chebi.obo 
    line 14186: GOCHE:0000305 of file:/Users/tanyaberardini/go_svn/ontology/editors/go_xp_chebi.obo 
    line 13831: GOCHE:0000351 of file:/Users/tanyaberardini/go_svn/ontology/editors/go_xp_chebi.obo 
    line 14601: CHEBI:100436 of file:/Users/tanyaberardini/go_svn/ontology/editors/go_xp_chebi.obo 
    line 6861: CHEBI:5254 of file:/Users/tanyaberardini/go_svn/ontology/editors/go_xp_chebi.obo 
    line 3996: GOCHE:0000244 of file:/Users/tanyaberardini/go_svn/ontology/editors/go_xp_chebi.obo 
    line 11016: GOCHE:0000192 of file:/Users/tanyaberardini/go_svn/ontology/editors/go_xp_chebi.obo 
    line 7886: CHEBI:4431 of file:/Users/tanyaberardini/go_svn/ontology/editors/go_xp_chebi.obo 
    line 4206: CHEBI:26163 of file:/Users/tanyaberardini/go_svn/ontology/editors/go_xp_chebi.obo 
    line 7866: CHEBI:23008 of file:/Users/tanyaberardini/go_svn/ontology/editors/go_xp_chebi.obo 
    line 15136: CHEBI:2846 of file:/Users/tanyaberardini/go_svn/ontology/editors/go_xp_chebi.obo 
    line 14301: GOCHE:0000304 of file:/Users/tanyaberardini/go_svn/ontology/editors/go_xp_chebi.obo 
    line 4371: GOCHE:0000235 of file:/Users/tanyaberardini/go_svn/ontology/editors/go_xp_chebi.obo 
    line 10881: GOCHE:0000190 of file:/Users/tanyaberardini/go_svn/ontology/editors/go_xp_chebi.obo


Short-term strategy to add chemical terms in GO (this is an interim solution, but it may be good for a while):

We will use axioms sitting outside the ontology; these are invisible but we can look at them in Protege (the file already exists; it is in the extensions directory and is called bio-chebi.owl; we can also look at x-chemical-importer.owl in P4). For our purposes, under this strategy acids and conjugate bases will be equivalent and indistinguishable, e.g. citric acid = citrate. This operating procedure is possibly a bit dodgy and biologically weird, but it will work for editing and for adding terms through TermGenie. If later on we need to formally distinguish between the acid and the base, we'll be able to accommodate that. We may want to do that for at least some specific areas of the ontology, such as transport across a membrane, where only one of the two compounds is really being transported and a distinction between acid and base may be necessary. In the future, we might even just use the conjugate base, as we said originally.

As a long-term plan, we need to add the missing links. The acid-conjugate base relationship is still the most urgent for us, so we'll keep pushing ChEBI for it. Also, David and Tanya had requested new terms from ChEBI, and they'll follow up with them to see what's still missing.

The new TermGenie template for metabolism terms may be ready soon (possibly in less than two weeks). In the meantime, when adding new metabolism terms in GO, we do NOT need to add the chemical to goche.obo. If the chemical is not in ChEBI, however, request it from them. We DO still need to add the xps to go_xp_chebi. (Alternatively, we could leave any pending SF requests for metabolic terms until TG is ready, and use them for testing.) David and Tanya will look into any missing xps once the error listed above is solved (they have prepared a list of questions for ChEBI for this purpose). The EBI editors can then add them to the file.


DISCUSSION ITEM II: EC numbers in GO

The Enzyme Commission has been updating EC numbers at a very high rate lately.

If we wish to continue to maintain the EC xrefs, we need a strategy to make this viable, possibly by automating part of the work.

  • Can we formally collaborate with them? What is their development process? In a sane world we would have a single ontology that we both work on, and new reactions are added with logical defs to CHEBI. In fact, they should add EC numbers using TG. The hierarchical IDs would pose a challenge but this is doable. --cjm

Jane suggests forming a working group, including someone from Rhea to give us a hand, and doing a first pass to see what EC numbers we may have missed so far.

(IN PROGRESS - Waiting for some key people to come back from travels.)


DISCUSSION ITEM III: Follow-up on papers: ChEBI and TermGenie

ChEBI paper:

We need to modify the section on acids and bases to reflect our interim strategy above. We need to update the methods and include reference to the new TG template. AI for Chris: he will update the methods accordingly, then send the paper back to all authors for review, then we're ready to submit it (to Nature Chemical Biology).

TG paper: answer the question:

"TO DO find rate of new term requests. How many of these fit into a template form?"

See http://gocwiki.geneontology.org/index.php/Template-able_Requests_Prior_To_TG

Action Item for Becky, Jane and Paola. DONE - half of all new term requests (51.8% to be precise) between Jan. and June 2010 could have been dealt with, at least in part, using TG templates.

DISCUSSION ITEM IV: Update on legacy terms without logical definition

There are still lots of terms with labels like

  • X involved in Y
  • X by Y

that lack a logical definition. Note that this means TG will give incomplete inferences for these templates, as the above terms are effectively invisible to it.

What's the plan here? Is everyone aware of the bold / not bold highlighting in OE - this makes it easy to search for strings that lack logical definitions. We need to stay on top of these. There was a tracker item about the involved in ones a while ago, but it just dropped off the radar. Maybe we need to have a person/role whose task is to stay on top of this. In theory it should just be a retrospective task if new terms come in via TG -- cjm

AI: David and Tanya will find the orphans and fix them. -- dph (To find orphans in OE, search for terms that match 'involved in' AND do not have 'Is intersection'.)

As a more general SOP for all editors, we should all add missing logical definitions when we spot them.