Chemical terms in GO


Goal: make terms that refer to chemicals internally consistent in GO prior to alignment with ChEBI

SF items to check as part of this project

Meeting March 5-6, 2010

Chris generated an ontology of the chemicals named in GO terms, with ChEBI IDs: GOCHE. We are using this ontology as the framework to generate the chemical representation that already exists in GO. This file needed review because:

  • GO contains chemicals that could not be mapped to ChEBI. These terms will be missing from the file.
  • In some cases the mappings to ChEBI were not correct because GO and ChEBI use terms differently. For example, in some cases GO uses a singular chemical name to represent a class of chemicals that contain the same chemical skeleton, and in other cases GO uses the 'chemical and derivative' nomenclature. ChEBI uses plurals.
  • In some cases, it was not clear which ChEBI term was being mapped to a GO term. In those cases there is a term in GOCHE that we cannot seem to find in GO.

All work is archived in:

Go through GOCHE; for each chemical 'X' see if terms exist for

  • X metabolic process
  • X biosynthetic process
  • X catabolic process
  • X transport
  • X transporter
  • X binding
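
The check described above can be sketched in a few lines of Python. This is only an illustration: the name templates are copied from the list above (real GO term names may differ, for example in how transporter terms are worded), and `term_coverage` and the sample set of GO names are hypothetical.

```python
# Name templates taken from the list above; {x} is the chemical name.
TEMPLATES = [
    "{x} metabolic process",
    "{x} biosynthetic process",
    "{x} catabolic process",
    "{x} transport",
    "{x} transporter",
    "{x} binding",
]


def term_coverage(chemical, go_term_names):
    """Map each expected term name to True/False depending on whether GO has it."""
    known = {name.lower() for name in go_term_names}
    expected = [template.format(x=chemical) for template in TEMPLATES]
    return {name: name.lower() in known for name in expected}


# Hypothetical sample of GO term names, for illustration only.
go_names = {
    "lipid metabolic process",
    "lipid biosynthetic process",
    "lipid transport",
    "lipid binding",
}
coverage = term_coverage("lipid", go_names)
missing = [name for name, present in coverage.items() if not present]
print(missing)
```

Running the same check for every chemical in GOCHE would give a per-chemical list of the term types that GO lacks.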

Adjust parentage in GOCHE based on GO paths; add GOCHE terms as needed for chemicals that were not found or mapped in ChEBI. When this happened and there was a ChEBI ID for the term, we added a new GOCHE term and added the ChEBI ID as a term dbxref. The set of GOCHE terms will identify those that we need to discuss with ChEBI.

GOCHE represents the union of the M,C,B,T,R and I paths in GO for a given chemical. There are other paths in less comprehensive areas of the ontology like 'response to chemical' and children.

March 5th: got through lipids and CDP-diacylglycerol
resume at monoglyceride

March 6th: modified approach:

  • Skip noting which types of GO terms exist in GOCHE file. We can mine this later if we need this information. We already have enough mapped to illustrate the point that GO is internally inconsistent.
  • Mark all terms we've looked at with 'GOC:chem_mtg' definition dbxref.
  • Note all parents from GO paths as before.
  • Filled in more chemicals that GO has but ChEBI doesn't; we don't know what they are!

Concentrated on high-level terms by working from the top down, one level at a time, because that's where most problems crop up; easier to sort out more specific terms without meeting face-to-face. Ended Saturday session after getting down to second level of 'heterocycle compound'.

Notes:

  • at present, GO doesn't have paths from nucleoside/tide/base terms to 'aromatic compound' terms, but we will want to add back in GOCHE
  • The 'response to chemical substance' branch isn't consistent with the rest of GOCHE at all. We need to decide if we are going to retain this representation.
  • Note that we have made a few fixes, but have not consistently fixed all problems we've spotted.


GOCHE changes done Sat 6th:

  • merged 'organic alcohol' into 'alcohol'
  • merged hydroxyproline into 4-hydroxyproline; made L-hydroxyproline is_a hydroxyproline
  • removed redundant links
  • changed namespace for new terms to GOCHE from GO
  • resolved multiple definition dbxrefs (GOC:dphtb, GOC:goche, GOC:chem_mtg) to a single reference = GOC:chem_mtg
  • added all versions of files worked on for the project into cvs for version control and easy access for all parties
  • at the end of the two days of work (and including the 2 hrs of prep work done by D and T), a total of 406 terms had been reviewed, of which 55 were new terms added to the ontology. Of the 55 new terms added to the ontology, 12 were also found in the ChEBI ontology.

Rules for GOCHE

  • for X biosynthesis or X catabolism terms, only follow is_a paths up the graph via X metabolism.
    • use MF-BP links to capture links between (e.g.) pyruvate and glucose, or create specific subchildren of a term like 'pyruvate metabolic process involved in gluconeogenesis'. The way we have it now is incorrect.
  • use ChEBI '-ic acid' ID for '-ate' GO terms - it's the ionized form that's biologically relevant. Note that in this case we will not be univocal with ChEBI. (I think this is not correct. We want to use the anion version ID, not the 'acid' ID. TB May 2011)
  • a part_of link in GO should not translate into an is_a link in GOCHE
  • ignore modification terms for building GOCHE paths
  • every chemical with a path to 'small molecule', 'drug', 'neurotransmitter' or any other role MUST also have a path classifying it based on structure
  • if a GO term is not plural, but the GOCHE term points to the ID for a ChEBI plural, then GO should use an 'and derivative' term. Eventually we will use the definition of the ChEBI plural term for the 'and derivative' terms in GO. We like the idea of the skeletal framework being the criterion for these.
  • As we have edited, we have made all of the chemical names univocal with GO. So when we align the GOCHE terms with GO, we will match the strings from GOCHE with term names in GO.
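
As a rough sketch of how the first and third rules could be applied mechanically, the Python below derives chemical is_a links only from is_a links between 'X metabolic process' terms, and skips part_of links and links from biosynthesis or catabolism terms. The function names and the sample links are made up for illustration; this is not the procedure actually used to build GOCHE.

```python
SUFFIX = " metabolic process"


def chemical_of(term_name):
    """Return the chemical named in an 'X metabolic process' term, else None."""
    if term_name.endswith(SUFFIX):
        return term_name[: -len(SUFFIX)]
    return None


def goche_is_a_links(go_links):
    """Derive chemical is_a links from (child, relation, parent) GO links."""
    derived = set()
    for child, relation, parent in go_links:
        if relation != "is_a":
            continue  # a part_of link in GO must not become is_a in GOCHE
        child_chem = chemical_of(child)
        parent_chem = chemical_of(parent)
        # Biosynthesis/catabolism terms yield None here, so their direct
        # links to other chemicals' metabolism terms are ignored.
        if child_chem and parent_chem:
            derived.add((child_chem, parent_chem))
    return derived


# Made-up links, for illustration only.
links = [
    ("monosaccharide metabolic process", "is_a", "carbohydrate metabolic process"),
    ("glucose biosynthetic process", "is_a", "pyruvate metabolic process"),
    ("example metabolic process", "part_of", "carbohydrate metabolic process"),
]
derived = goche_is_a_links(links)
print(derived)
```

Only the first sample link survives: the second starts from a biosynthesis term and the third is a part_of link.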

Steps:

Editing schedule:

Tuesdays

Jane or Midori: till 5 pm UK time

David or Harold: 12 to 4 pm East coast time

Tanya: after David and Harold are done

Tips:

Set default namespace to GOCHE.

Add GOC:chem_mtg as a definition dbxref to all terms you look at, either by adding an 'x.' definition to the term or by filling in a missing definition dbxref. You can then use the Term Renderer to see all the terms that have been reviewed and what is still left to do. Definition dbxref = GOC:chem_mtg, foreground color = color of your choice
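
Outside the editor, the same "what is left to do" view could be approximated with a small filter. The sketch below assumes the terms have already been loaded into a dictionary of term name to definition dbxrefs; the `unreviewed` helper and the sample data are hypothetical.

```python
REVIEW_TAG = "GOC:chem_mtg"


def unreviewed(terms):
    """Return the names of terms whose definition dbxrefs lack the review tag."""
    return sorted(name for name, dbxrefs in terms.items() if REVIEW_TAG not in dbxrefs)


# Hypothetical sample: term name -> list of definition dbxrefs.
terms = {
    "alcohol": ["GOC:chem_mtg"],
    "monoglyceride": [],
    "lipid": ["GOC:chem_mtg"],
}
todo = unreviewed(terms)
print(todo)
```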

  1. Finish reviewing GOCHE, adding missing compounds
    • During this process, submit issues/new terms to ChEBI
    • When this process is finished, every GOCHE term will have a ChEBI ID
  2. Refine relationships and upper level structure of GOCHE
  3. Create xps using GOCHE terms
    • Clean-up GO chem terms
    • Automatically fill in missing chemicals
  4. Align GOCHE with ChEBI

Prose description of plan

We started to add letters after the GOCHE terms to see how internally inconsistent GO was with its representation of chemicals. Tanya and David were already aware that we had many inconsistencies from doing the F-P links. That was why we thought that this project should now take priority. We have stopped adding the letters now, but we were convinced that at the higher levels, GO is inconsistent. We can't just rely on the automatically generated cross products because GO has chemicals that are not represented in CHEBI and in some cases, we need to have a good look to decide if we agree on the mappings between GO and CHEBI; for example, is 'benzenes' equivalent to 'benzene and derivative'? We have not always been consistent in how we group things. Only by looking the way we are can we spot these inconsistencies. For now, we are keeping both the GOCHE term and the CHEBI term. We may at some point merge them, or ask CHEBI to create a new CHEBI term.

The plan is to work only on GOCHE for now. We know that there are areas in it that don't make sense. We will fix and discuss those as we go along. Then, in the end, we will see how well GOCHE aligns with ChEBI. Where it doesn't align, we need to find out why. Once we have GOCHE in a state that we like, we will use it in the cross-product definitions of every term with a chemical representation in GO. We can then use the reasoner to be certain that GO remains internally consistent. As we work out the bugs between GOCHE and ChEBI, we will then be aligning with ChEBI.

So GOCHE is being used as a 'metaontology' to show us how we currently represent chemicals in GO. The text strings of chemical names are univocal with GO for now. We will fix the GOCHE metaontology and then use it to clean up the inconsistencies and errors in GO. This way we can keep our own house in order as we work with CHEBI to align. As we are working through GOCHE, we are already sending issues and questions to CHEBI to try to work things out with them.

Progress

  • In the first week of June, Harold and David finished reviewing every term in GOCHE. They should now all reflect the paths that are present in GO in the major areas of the ontology that represent chemicals. We have still not done the 'response to' terms.
    • ACTION ITEM- We still need to go through GO and find other paths in minor areas of the graph. One approach to this will be to search on terms in GO that contain the word chemical and then look at the children.
    • ACTION ITEM- We still need to look for chemicals that we might have missed. Tanya and David will do this using the cross products file that Chris has generated and the term renderer to look for chemical terms that do not have cross-products. The ones we missed in GOCHE pop out.
    • ACTION ITEM- We need to go through all of the terms with 'GOCHE:' IDs in GOCHE and see if the chemicals exist in ChEBI. If they do, we need to change the GOCHE IDs to ChEBI IDs.
    • ACTION ITEM- We need to remove the letters that we appended to the terms the first day of the meeting. DONE

Notes and questions

  • GOCHE doesn't seem to have a term for 'organic compound' (by that or any other name). ChEBI defines 'organic molecular entity' as 'A molecular entity that contains carbon', so we can say organic compound = carbon compound in GO (and rearrange a few existing links accordingly). (-midori 2010-03-08)
  • Added new root 'organic substance' GOCHE:0000001 (CHEBI:50860). Many, many terms must be moved under it in both GO and GOCHE, but I haven't done that yet in either ontology. I can start on both in parallel any time.
  • I encountered a case where different GO terms use different names for the same chemical! It's CHEBI:1427. Neither Jane nor I could recall that situation cropping up during the meeting, so we crafted a strategy on the spot:
    • used the ChEBI name in GOCHE
    • added exact synonyms in GOCHE for each variant occurring in GO
    • used GO IDs as synonym dbxrefs
  • selenium metabolic process is_a sulfur metabolic process WRONG!!!
  • Need to sort out all of the symporters once we have the chemicals worked out.
  • at present, GO doesn't have paths from nucleoside/tide/base terms to 'aromatic compound' terms, but we will want to add back in GOCHE
  • The 'response to chemical substance' branch isn't consistent with the rest of GOCHE at all. We need to decide if we are going to retain this representation.
  • Example: 'asparagine biosynthetic process from oxaloacetate' is_a oxaloacetate metabolic process and is_a dicarboxylic acid process; therefore, in GOCHE, Asn is_a oxaloacetate and is_a dicarboxylic acid. This is wrong. How will we fix this?
  • In GO, ethylene biosynthetic process is_a methionine metabolic process. Is this right?
  • We need to examine all of the detoxification terms. The hierarchy here doesn't hold because these chemicals are not always toxins.
  • We need to examine the racemization terms.
  • We need to look at and fix the 'regulation of xxx' terms as well, where xxx contains the name of a chemical.

Action items:

  • divide up remaining GOCHE terms
    • do straightforward ones individually; save up queries/problems for teamwork
  • find chemicals named in GO missing from GOCHE
    • look at children of GO terms that do have corresponding GOCHE entries
  • organic alcohol = alcohol, so fix transport term name - DONE.
  • clean up 'heterocycle' vs 'heterocyclic compound' in both GO & GOCHE
  • clean up hyphenation inconsistency
  • DNA binding & RNA binding
    • separate out hierarchies based on chemistry from those based on SO terms
    • don't forget ncRNA
  • missing synonyms for *acylglycerol B & C
  • merge 'histidine family' terms into 'histidine' terms, since there are no other children
  • rename 'quinone cofactor' terms to just 'quinone'


Phone Conference Meeting May 6, 2010

Attendees: Chris, Jane, Harold, David, Tanya

Meeting Start: 12:05 pm /3:05 pm /8:05 pm

Meeting End: 12:42 pm/ 3:42 pm/8:42 pm

  • Update where we stand with GOChe
  David reviewed procedure for reviewing GOChe - dph and hjd work alphabetically, dbxref NOT 
  contains chem_mtg, look at representation in GOChe and in GO and look for parallels and 
  missing info, add missing info.  For each term, look at parents AND chemicals. There are 
  some chemicals in GO that weren't mined from Chebi in the first place, for example 'acid' - 
  GOChe only had 'organic acid'.   Another new term, 'signal' added - some chemicals only had 
  parents that are signal, no chemical parent currently.
  At the current rate, the whole graph will be reviewed by late May to early June. GOAL: every
  chemical referenced in GO will be present in GOChe
  then will merge GOChe terms with appropriate Chebi terms or create new Chebi terms
  Renaming GOChe terms to match the GO string as we go along?  
  Yes.  Renaming things to be what they are in GO. Rules on what to do with GO using two 
  forms of name (see Midori 2010-03-09)
  Chris suggested loading up CHebi crossproducts at the same time as GOChe, can render 
  things that have a logical definition in bold, then browse the ontology and see which ones 
  are 'stragglers'
  Jane is keeping live GO and GOChe in line, Tanya working alphabetically as well, but 
  slower progress.
  • Discuss a possible face-to-face meeting to implement GOChe in GO and to work with representatives from ChEBI
  Judy could have meeting funds for flying out ChEBI person and Jane from UK and Harold
  and David from ME. Meeting time mid to late July in CA.  Location could be split between
  Carnegie and Berkeley.
  List of attendees:
  GO:Chris, David, Harold, Jane, Tanya
  ChEBI/other:Janna Hastings, Mike Bada, Colin Batchelor
  Need to determine budget. David will email Judy.
  Jane will serve as liaison with ChEBI to find person to come to meeting.
  David will give update of project at next GO managers' meeting.
  • Discuss the game plan for how we will proceed (I think most of these will be done or started at the face2face meeting when we can discuss things)
    • Rearrange GOche to make it correct (curators)
    • Write definitions for GOChe terms (curators)
    • Create GOChe-GO cross-products (Chris and curators)
    • Implement the GOche structure in GO (Chris)
   First pass before meeting with Chebi, check for internal consistency.
    • Find chemicals in GO that are missing from GOChe (Chris and curators)
    • Extend GOChe to MF definitions (eventual integration with BP metabolic processes and external pathways) (Chris and curators)
    • Meet with ChEBI representatives about
  • Write up as paper. Jane will open a document in Google Docs. Note interesting examples.
  • Other business:

Automated alignment with Rhea? Not yet.

Align chemicals in definitions with Chebi, then we can link substrates and products with starts and ends of metabolic/biosynthetic/catabolic processes.

Chemical names also occur in these types of terms:

  • response to
  • catalysis
  • secretion
  • molecular function terms

Phone Conference Meeting July 15, 2010

Attendees: Chris, Jane, Harold, David, Tanya

Meeting Start: 7:00 am /10:00 am /3:00 pm

Meeting End: 7:40 am /10:40 am /3:40 pm

  • Objectives of meeting
    • Create action items and responsible parties to work with GOCHE for the next week
    • Determine what we need to send to chebi
  • Actions
    • Find the terms in GOCHE that only specify roles. Chris has provided a list of those taken from CHEBI. (HJD, DPH 7/15/10) DONE
      • Make sure that the chemical children are bona-fide chemicals
      • Destroy the terms that are not chemicals, but only define roles. (We will deal with the roles at a later date)
    • Have a first pass at GOCHE clean-up. (HJD, DPH, JL, TZB 7/20/10) Same editing schedule as in our original work.
      • Begin with the top-level terms and make sure they are consistent with the top level of CHEBI
        • Incorporate/create definitions where possible
      • Use Chris' list of non-is_a parallelisms to identify problematic terms. Chris will create 3 lists and distribute them to each group.
        • file is here: [1], there are 4 sets: carbo_slim (299), hydroxides_slim (304), ion_slim (184), the_rest (651)
        • Work through the list fixing the structure of GOCHE as we see fit.
        • In some cases, the problems may have already been solved by the group/person that edited before you.
    • Send a list of the legitimate GOCHE terms that are not in CHEBI to CHEBI and Mike. Ask them to determine if they think the terms are in fact in CHEBI. (Jane 7/15/10)
      • For terms that are in CHEBI change the IDs in GOCHE to CHEBI IDs (before or at meeting)
      • For terms that are not in CHEBI, create term requests for addition to CHEBI. These should include good defs and relationships (no rush on this item)
    • Send GOCHE and the remaining discrepancy file items to CHEBI and Mike (Jane 7/21/10)
      • Chris will run the discrepancy script after Tanya finishes her edits on 7/20/10
      • Jane will ask CHEBI to look at the discrepancies that still exist and give them some pre-thought before the meeting.

Meeting with CHEBI, Carnegie July 22-25, 2010

  1. GO-CHEBI pre-meeting
  2. GO-CHEBI meeting

Phone Conference Meeting Oct 22, 2010

Attendees: David, Tanya, Harold, Chris, Midori, Jane

Meeting Start: 7:04 am /10:04 am /3:04 pm

Meeting End: 7:47am /10:47 am/ 3:47 pm

Agenda

  1. Discuss the status of the paper
    • Chris will write a systems biology paragraph
    • Tanya will format the references
    • David will address other comments
    • Jane will add a bit about ontologies in the intro
    • David will remind Chris about generating reports for term misalignments etc.
  2. Make a strategy about what we need to do next:
    • Update on the submission of terms to ChEBI
      • About 50 have gone in so far. They are going in with the ChEBI direct submission tool. Harold will take on this task with Jane. Jane will train Harold.
    • Change the 'and derivative' and 'compound' terms in the current ontology. Midori- David and Tanya on call.
    • Group- We need to start editing GOChe. Do we have a list of rules/conventions?
      • Chris will generate 2 files: 1) A file of inferred is_a links for GOChe based on is_a paths that exist in ChEBI, but don't exist in GOChe. 2) A file of is_a relationships that exist in GOChe, but don't exist in ChEBI. We will work with these lists to correct GOChe.
    • Can we get an update on whether ChEBI has begun the rearrangements necessary to make the alignment happen? Jane will ask about this.
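
The two comparison files described above amount to set differences over is_a links. The Python below is a minimal sketch under stated assumptions: links are (child, parent) pairs, "inferred" means the transitive closure of is_a, and only terms present in both ontologies are compared. The function names and the single-letter term labels are placeholders, not real GOChe or ChEBI content, and this is not Chris's actual script.

```python
def is_a_closure(links):
    """Return all (descendant, ancestor) pairs implied by (child, parent) links."""
    parents = {}
    for child, parent in links:
        parents.setdefault(child, set()).add(parent)
    closed = set()
    for start in parents:
        stack = list(parents[start])
        seen = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            closed.add((start, node))
            stack.extend(parents.get(node, ()))
    return closed


def discrepancy_reports(goche_links, chebi_links):
    """Build the two reports: links only in ChEBI, and links only in GOChe."""
    goche_closed = is_a_closure(goche_links)
    chebi_closed = is_a_closure(chebi_links)
    goche_terms = {term for pair in goche_links for term in pair}
    chebi_terms = {term for pair in chebi_links for term in pair}
    shared = goche_terms & chebi_terms

    def both_shared(pair):
        return pair[0] in shared and pair[1] in shared

    # 1) is_a paths that exist in ChEBI but not in GOChe
    missing_from_goche = sorted(p for p in chebi_closed - goche_closed if both_shared(p))
    # 2) asserted GOChe is_a links that ChEBI does not support, even by inference
    missing_from_chebi = sorted(p for p in set(goche_links) - chebi_closed if both_shared(p))
    return missing_from_goche, missing_from_chebi


# Placeholder terms A-D, for illustration only.
goche = [("B", "A"), ("C", "A"), ("D", "C")]
chebi = [("C", "B"), ("B", "A"), ("D", "A")]
only_chebi, only_goche = discrepancy_reports(goche, chebi)
print(only_chebi)
print(only_goche)
```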

Phone Conference Meeting Jan 7, 2011

Attendees: David, Tanya, Harold, Midori

Meeting Start: 1PM EST

Meeting End: 1:40 EST

Agenda

  1. Discuss the status of the paper
  2. Harold will present a progress report on where he is in the process of submitting new terms to ChEBI.
    • Harold will split up the remainder of the 'yes' file and send it to Midori, Tanya and David
    • T,M & D will get submission accounts from CHEBI.
    • We will begin submitting terms as we have time.
    • Midori has fixed almost all of the 'X and Derivative' terms. Those correspond to things that are plural in ChEBI.
  3. Develop a timeline between now and the Boston meeting to be sure we are on track.
    • What absolutely needs to be done?
      • Chris needs to run the cross-products on the current version of GOCHE. He needs to generate a report about what changes will happen in GO and the misalignments that we have with CHEBI.
    • If we don't finish the term submission, what is our contingency plan?
      • Create cross-products with GOCHE IDs.
    • How will we communicate misalignments that ChEBI still needs to work on?
      • We will use Sourceforge.

Phone Conference Meeting Jan 24, 2011

Attendees: Harold, Midori, David, Tanya

Meeting Start: noon EST

Meeting End: 12:35

Next meeting Feb 7 noon EST

Agenda

  1. Discuss the status of the paper: Chris wrote his paragraph; Tanya will convert it to Word, reformat it, and send it back to me.
  2. Discuss how we are doing submitting terms to ChEBI
    • Harold- 10 terms left to go. Waiting on some SF queries and then should be good to go. Mostly ones where he needs generics.
    • Tanya- She has 5 to go. 3 are SF items that are in progress. Discuss 'terpenoid indole alkaloid' in Boston.
    • David- Many of them are peptides. Decided not to submit them at this time.
    • Midori- Midori sent queries out for nine items.
      • Two (heteropolysaccharide, lysophospholipid) have been deleted in ChEBI; SF items submitted to know the reason why they were deleted.
        • Both reanimated in ChEBI; GOCHE updated, so DONE
      • For peptidyl-amino acids, only one was in GOCHE because others didn't match the initial M/B/C search. Decided to delete peptidyl-citrulline from GOCHE, and not request any peptidyl-aa terms from CHEBI, for now (can revisit later to see if CHEBI wants 'em).
      • Sorted out plans for the rest of the queries - some GO/GOCHE edits, some CHEBI requests. - DONE
  3. Discuss the things on the to-do list from the CA meeting
    • sort out the lipoproteins from the lipoprotein particles in GO (David DONE)
    • sort out what GO means by phosphate and request appropriate terms from CHEBI
    • rename phenol M,B,C to be phenol-containing compound M, B, C - DONE
    • figure out which cobalamin we are talking about? There are several. (David CHEBI:23334)
    • rename toluene M,B,C to be toluene-containing compound M, B, C, revisit definition to be clear- DONE
      • DONE 2011-01-24; new terms for toluene-containing compound M, B, C; definitions consistent for toluene-containing compound and toluene
    • add some vitamins from CHEBI to GO (will do it once we need them)
    • move isoflavonoid xx to be a sibling of flavonoid xx (Tanya DONE)
    • Change the GOCHE ID for 'benzene and derivative' to the ChEBI ID for benzenes (uncurated) - DONE 2011-01-24
    • GO to rename the GOCHE 'x and derivative' terms to be 'x-containing compound' throughout - DONE 2011-01-24
    • provide CHEBI with a list of the 'x-containing compound' terms for them to add
      • Found these remaining in GOCHE on 7/25
        • DONE 2011-01-24; didn't have to request any new CHEBI terms; specific notes below
        • CHEBI:26373 ! pteridine and derivative
          • now 'pteridine-containing compound' in GO & GOCHE
        • CHEBI:26401 ! purine and derivatives
          • now 'purine-containing compound' in GO & GOCHE
        • CHEBI:33709 ! amino acid and derivative
          • now 'amino acid' in GO & GOCHE
        • CHEBI:39447 ! pyrimidine and derivatives
          • now 'pyrimidine-containing compound' in GO & GOCHE
        • GOCHE:0090038 ! ethanolamine and derivative
          • now CHEBI:23981; 'ethanolamine-containing compound' in GO & GOCHE
        • GOCHE:0090044 ! indole and derivative
          • now CHEBI:24828; 'indole-containing compound' in GO & GOCHE
        • GOCHE:0090047 ! thiamine and derivative
          • now CHEBI:26948; 'thiamine' in GO & GOCHE, with exact synonym 'thiamine-containing compound'; still need to identify a "canonical thiamine"
        • GOCHE:0090050 ! folic acid and derivative
          • now CHEBI:37445; 'folic acid-containing compound' in GO & GOCHE
        • GOCHE:0090055 ! riboflavin and derivative
          • now CHEBI:305277; 'flavin' in GO & GOCHE
        • benzene and derivative < removed since last? can't find
          • was GOCHE:0090042; now CHEBI:22712; 'benzene-containing compound' in GO & GOCHE
        • phenol and derivatives < removed since last? can't find
          • now CHEBI:33853; 'phenol-containing compound' in GO & GOCHE
    • check through the remaining hormone terms to retain only the peptide hormones, not proteins (did this, we have a couple that are 191 or 198 aa, are these peptides?) (David DONE decided just not to request anything encoded by the genome)
    • when CHEBI has fixed arrangement of nucleoside phosphate/nucleotide, need to fix GOCHE accordingly (can we write formal definitions for this problem and use this example for the paper?)
    • Move hypoxanthine up to be a child of 'purine-containing compound' (when we have that term) - DONE
    • GOCHE needs to change NAD to NAD+ (Harold)
      • what I did was change 'NAD+ or NADH' and 'NADP+ or NADPH' in both GOCHE and GO to NAD and NADP (a single binding term for each) and moved the old name to a synonym. In each case the single term in GOCHE was mapped to the appropriate CHEBI NAD and NADP terms
    • Generate a list of missing conjugate bases for submission to CHEBI. CHEBI will check if the entry already exists, and if it does make the appropriate is_conjugate_base relationship and if not, add it.
    • deal with the molybdopterin cofactor issue on the wiki (Midori)
      • ChEBI SF tracker item open [2]

Phone Conference Meeting February 7, 2011


Phone Conference Meeting With CHEBI, September 14, 2011

Attendees:

Meeting Start: 7:00 am /10:00 am /3:00 pm

Meeting End: 8:00 am /11:00 am/ 4:00 pm

Agenda

  • David to present overview of where the alignment of chemicals in GO is up to.
  • Discuss what changes still need to be made in the ChEBI ontology.


Phone Conference Meeting With CHEBI, October 19, 2011

Attendees:

Meeting Start: 7:00 am /10:00 am /3:00 pm

Meeting End: 8:00 am /11:00 am/ 4:00 pm

Agenda

  • Discuss CHEBI's strategy for handling cases where CHEBI currently uses is_a to functional groups.
  • Discuss agenda for London meeting (after the GOC meeting on the morning of Wed 9th Nov at University College London).


Meeting With CHEBI, University College London, November 9, 2011

Attendees:

Meeting Start: about 10:30 am

Meeting End: about 2:00 pm

Minutes: http://wiki.geneontology.org/index.php/GOCHE_meeting_2011-11-09

GOCHE id terms remaining

Please update with ones you are working on. If they are done, you can delete them from the list. Thanks!

  1. GOCHE:0000324 1-carbon compound
    • (Midori) obsolete - metabolic pathway grouping
  2. GOCHE:0000245 3-keto-sphinganine
    • (Harold) in file I just checked out this is not in there; it is CHEBI:46966; there is no GOCHE:0000245
  3. GOCHE:0000067 achromobactin
    • (Tanya), added CHEBI:61346
  4. GOCHE:0000066 acyl
    • (Midori) upgraded to CHEBI:22221 acyl group (exact synonym added)
  5. GOCHE:2000000 adenine-type cytokinin
    • not on yes/maybe/no list
    • (Midori) obsolete because cytokinin is a role in ChEBI; added 6-isopentenylaminopurine CHEBI:38643 to group kids
  6. GOCHE:0000156 arsenic
    • (Midori) obsolete - superseded by existing ChEBI terms
  7. GOCHE:0000374 bacillithiol
    • (Midori) submitted - CHEBI:61338
  8. GOCHE:0000153 boron
    • (Midori) obsolete - superseded by existing ChEBI terms
  9. GOCHE:0000086 C4-dicarboxylate
    • (Midori) submitted - CHEBI:61336
  10. GOCHE:0000177 carbon catabolites
    • not on yes/maybe/no list
    • (Midori) obsolete - gene expression regulation pathway grouping
  11. GOCHE:0000099 chemical
    • (Midori) upgraded to CHEBI:24431 chemical entity (exact synonym added)
  12. GOCHE:0000068 chrysobactin
    • done hjd
  13. GOCHE:0000072 ferric-enterobactin
    • CHEBI:28199
    • on 'yes' list (whose?)
    • done hjd
  14. GOCHE:0000074 ferric-vibriobactin
    • on 'yes' list (whose?)
    • CHEBI:61375 submitted
  15. GOCHE:0000069 ferrichrome
    • on 'yes' list (whose?) CHEBI:61414
  16. GOCHE:0000042 globoside
    • hjd done; had to enter as chemical entity harold1
  17. GOCHE:0000002 inositol or phosphatidylinositol
    • not on yes/maybe/no list
    • I suggest this term has to be obsoleted because of its parentage; hjd
    • obsoleted
  18. GOCHE:0000079 iron chelate
    • (Midori) obsolete - role
  19. GOCHE:0000089 NAD or NADH
    • not on yes/maybe/no list
    • (Midori) merged into CHEBI:13389; more specific CHEBI:15846 added
  20. GOCHE:0000090 NADP or NADPH
    • not on yes/maybe/no list
    • (Midori) merged into CHEBI:25523; more specific CHEBI:18009 & CHEBI:16474 added
  21. GOCHE:0000178 nitrogen catabolites
    • not on yes/maybe/no list
    • (Midori) obsolete - gene expression regulation pathway grouping
  22. GOCHE:0000022 nucleobase, nucleoside and nucleotide
    • not on yes/maybe/no list (GOCHE:0090015, nucleobase, nucleoside, nucleotide and nucleic acid was)
    • renamed 'nucleobase-containing small molecule', removed from GOCHE, will not map to a CHEBI term
  23. GOCHE:0090022 O3-(N-acetylglucosamine-1-phosphoryl)-L-serine
    • fixed; now CHEBI:21977 (it was in there both as a ChEBI term and as a GOCHE term) (hjd)
  24. GOCHE:0000141 oligogalacturonide
    • (Tanya) SF item open
  25. GOCHE:0000215 organic nitrogen
    • (Midori) upgraded to CHEBI:35352 organonitrogen compound (exact synonym added)
  26. GOCHE:0000224 oxygen and reactive oxygen species
    • (Midori) GOCHE term obsolete; GO terms scheduled for the axe (waiting out notification period, then bye-bye!)
  27. GOCHE:0000248 phosphatidyl-N-monomethylethanolamine
    • I've got it CHEBI:15958 hjd
  28. GOCHE:0000001 sulfolipid
    • not on yes/maybe/no list
    • (Tanya) Added CHEBI:61384
  29. GOCHE:0000106 TDP
    • hjd done requested CHEBI:61377
    • since the def as well as the parentage indicate ribosyl, I wonder if I should just leave the name as is; could put rTDP as a synonym
  30. GOCHE:0000244 terpenoid indole alkaloid
    • (Tanya) SF item open

Boston meeting

Logistics

   Boston University
   Life Sciences & Engineering Building
   24 Cummington Street, Room 704
   Boston MA 02215
   Google Map: http://tinyurl.com/4ctlk3f
   We have the room from 9-5 on Saturday and Sunday, February 26-27, 2011.

Topics

  • How will we deal with GO terms that contain 'chemical' names that are not in CHEBI but are interspersed between GO terms that do have CHEBI terms in them? Example: 'raffinose family oligosaccharide biosynthetic process'. We have removed this from GOCHE, there is no term in CHEBI.
    • Option 1: remove this term from GO
    • Option 2: work around it (how?) These terms are grouped by the process, therefore I don't think they should get chemical cross-products. For now, I think they will simply have to be maintained by hand.
  • Chris will try to get a list of ChEBI submissions that were made by David, Tanya, Midori and Harold
  • Align GO with ChEBI. Make a list of issues that need to go to ChEBI.
  • We will decide at the meeting how we are going to incorporate this into GO. Will it live as a separate cross-product file, or will it be incorporated directly?
    • Load 4 files to view GOCHE and GO together. Saving out is tricky, if not impossible. Try to move to having GOCHE in the editors file and going
  • Make list of 'chemical'-sounding terms in GO that don't have ChEBI cross-products and figure out how to make logical definitions for them. These will include the 'roles', peptides, 'amino acid family' and other such terms.
  • Should the lipoprotein metabolism terms be lipoprotein particles? Are bacterial lipoproteins particles?

Day 1

  • Review the cross-product terms. Remove those that are not correct.
    • Chris will filter out x-products that are based on obsolete GOCHE terms
    • Looked at all terms to make sure that xps to GOCHE were good.
      • Problem example: thiamine-containing compound in GO mapped to CHEBI:id, thiamine, which is defined as 'thiamine-containing' compound
      • CHEBI to-do: change name of CHEBI:26948 thiamine to thiamine-containing molecular entity
      • CHEBI to-do: add new CHEBI term for the 'real' thiamine
      • GO to-do: look at all exact synonyms in GOCHE file with this meeting's id (GOC:gochemicals_2010226) and resolve those naming inconsistencies in GO (for example, rename 'thiamine pyrophosphate' terms to be 'xx diphosphate' terms, and add xxx pyrophosphate terms)
      • GO to-do: tetrahydrobiopterin binding needs to be renamed to tetrahydrobiopterin-containing compound binding
      • CHEBI to-do: request tetrahydrobiopterin-containing compound or something like that as a CHEBI term to link to for the xps
      • GO to-do: change 'organomercurial transport' to 'organomercury compound transport'.
      • CHEBI to-do: CHEBI:52684 butanediol is not a child of CHEBI:22944 butanediols
      • CHEBI to-do: request butane-2,3-diol, [2,3-butanediol (not enantiomer-specific)], and add links between new term and existing, CHEBI:new is_a CHEBI:52684, CHEBI:52684 is_a CHEBI:22944, CHEBI:16982 is_a CHEBI:new, CHEBI:16812 is_a CHEBI:new
      • GOCHE to-do: add new term as stand in for above request
      • GO to-do: systematically look at the definitions for the 'acid' terms and make sure that the definitions specify the protonation state at physiological pH.
      • Problem: For polyprotic acids, GO wants 'xxx acid' as a group of structures (citric acid, citrate 1-, citrate 2-, citrate 3-: all of them together) and not as a role
      • Solution: For all acids, follow the ChEBI is_a hierarchy. Find the correct state of the polyprotic acid at physiological pH, use that specific anion for making the ChEBI xp, and follow the ChEBI is_a hierarchy for that case.
    • PM: mop up acids: get the ChEBI id of the acid and see if the conjugate base exists in GOCHE; if yes, merge the acid into the base; if not, change the acid that's currently in GOCHE into the conjugate base
      • GOCHE to-do: clean up remaining fatty acids, may need to request fatty acid anion from CHEBI
      • GO to-do: make all 'gibberellin xxx' terms is_a monocarboxylic acid XXX.
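
The acid mop-up step in the PM item above can be sketched roughly as follows. This is only an illustration: the term table layout, the function name mop_up_acid, and the "EX:" ids are all hypothetical, and the choice of citrate(3-) as the physiological form is used purely as an example.

```python
def mop_up_acid(goche, acid_id, base_id, base_name):
    """Fold an acid term into its conjugate base in a GOCHE-like term table.

    goche maps a term id to {"name": str, "synonyms": set, "alt_ids": set}.
    This layout is an assumption for illustration, not the real file format.
    """
    acid = goche.pop(acid_id)
    if base_id in goche:
        # Conjugate base already exists: merge the acid into it.
        base = goche[base_id]
    else:
        # No base term yet: turn the acid term into the conjugate base.
        base = {"name": base_name, "synonyms": set(), "alt_ids": set()}
        goche[base_id] = base
    # Either way, keep the acid's name and id findable on the base term.
    base["synonyms"] |= acid["synonyms"] | {acid["name"]}
    base["alt_ids"] |= acid["alt_ids"] | {acid_id}
    return goche


# Toy table with placeholder ids (not real ChEBI or GOCHE ids).
goche = {
    "EX:lipoic_acid": {"name": "lipoic acid", "synonyms": set(), "alt_ids": set()},
    "EX:lipoate": {"name": "lipoate", "synonyms": set(), "alt_ids": set()},
    "EX:citric_acid": {"name": "citric acid", "synonyms": set(), "alt_ids": set()},
}
# Base exists: the acid is merged into it.
mop_up_acid(goche, "EX:lipoic_acid", "EX:lipoate", "lipoate")
# Base missing: the acid term is converted into the base.
mop_up_acid(goche, "EX:citric_acid", "EX:citrate_3", "citrate(3-)")
```

In both branches the acid name survives as a synonym on the base term, so GO terms still named 'xxx acid' can be traced to the anion used for the cross-product.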

We went through the abduced links from the union of CHEBI and GOCHE (i.e. links in the GO chemical hierarchy that can be explained if we add a link to GOCHE, but cannot be explained based on current GOCHE/CHEBI). Based on this we added 41 links to GOCHE, and made nnn fixes to GO.
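
As a rough sketch of how such candidate links can be found: a link in the GO chemical hierarchy is "unexplained" when the parent chemical is not reachable from the child through the is_a edges of CHEBI and GOCHE combined. The function names and the toy edges below are assumptions for illustration; they do not reproduce the actual analysis or the real state of either ontology.

```python
def ancestors(term, is_a):
    """Return every term reachable from `term` by following is_a edges."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in is_a.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def unexplained_links(go_links, is_a):
    """Keep the GO (child, parent) chemical links that is_a cannot explain."""
    return [(child, parent) for child, parent in go_links
            if parent not in ancestors(child, is_a)]


# Toy is_a edges standing in for the union of CHEBI and GOCHE (illustrative only).
is_a = {
    "butane-2,3-diol": {"butanediol"},
    "butanediol": {"butanediols"},
}
# Toy chemical links implied by the GO hierarchy (illustrative only).
go_links = [
    ("butane-2,3-diol", "butanediols"),  # explained via two is_a steps
    ("polyprenol", "prenol"),            # not explained by is_a alone
]
unexplained = unexplained_links(go_links, is_a)
```

Each unexplained pair is a candidate for a new GOCHE link, or for a different relation such as has_part, which is how the list below is annotated.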

The remaining proposed additions to GOCHE have not been acted on. Many of these will require has_part links added to CHEBI:

#CHEBI:15625-S-adenosylmethioninamine   CHEBI:16389-ubiquinone NO
#CHEBI:16038-phosphatidylethanolamine   CHEBI:16000-ethanolamine OK
#CHEBI:16057-dimethylallyl diphosphate  CHEBI:16019-prenol
#CHEBI:17460-lipoamide  CHEBI:30313-lipoate
#CHEBI:22860-betaine    CHEBI:29067-carboxylic acid HP

#CHEBI:26199-polyprenol CHEBI:16019-prenol HP

#CHEBI:26435-pyrimidine deoxyribonucleoside triphosphate        CHEBI:19255-pyrimidine deoxyribonucleoside NO, HP

#CHEBI:26562-ribose phosphate   CHEBI:16988-D-ribose NO, HP
#CHEBI:30990-isopanose  CHEBI:28191-6-alpha-maltosylglucose
#CHEBI:35782-diphosphate        CHEBI:26020-phosphate NO, HP
#CHEBI:59640-N-acetylglucosamine        CHEBI:18133-hexose NO, HP?
#CHEBI:60190-homogalacturonan   CHEBI:17309-pectin is pectin as role? - pectin is a mix of complex polysaccharides including homogalacturonan, RGI, RGII and substituted galacturonan
#CHEBI:60191-rhamnogalacturonan II      CHEBI:17309-pectin -as above
#CHEBI:60193-queuosine  CHEBI:20794-7-methylguanosine
#CHEBI:60973-suberin    CHEBI:35366-fatty acid NO, HP


Day 2

  • acid-conjugate base mapping completed
  • Files reviewed:
    • acid-conjugate base mapping
    • relationship in GO, not in GOCHE or CHEBI (see report at end of Day 1)
    • relationship in GO and in CHEBI, not GOCHE (27)
      • reviewed and added some relationships that were good, some others were flagged for CHEBI/GO review
#CHEBI:15346-coenzyme A CHEBI:26399-purine ribonucleoside
#CHEBI:15713-UTP        CHEBI:39446-pyrimidine ribonucleoside NO
#CHEBI:15978-glycerol-3-phosphate       CHEBI:26020-phosphate NO, HP
#CHEBI:16695-UMP        CHEBI:39446-pyrimidine ribonucleoside NO, HP
#CHEBI:17345-GMP        CHEBI:26394-purine nucleoside NO, HP
#CHEBI:17361-CMP        CHEBI:39446-pyrimidine ribonucleoside NO, HP
#CHEBI:17552-GDP        CHEBI:26394-purine nucleoside NO, HP
#CHEBI:17659-UDP        CHEBI:39446-pyrimidine ribonucleoside NO, HP
#CHEBI:17677-CTP        CHEBI:39446-pyrimidine ribonucleoside NO, HP
#CHEBI:24040-flavin adenine dinucleotide        CHEBI:36976-nucleotide NO, HP
#CHEBI:49183-phosphatidylcholine        CHEBI:26469-quaternary ammonium group NO, HP?
    • relationship in GOCHE not in GO for all cases (may be there for transport, not for metabolism)
      • Pellet analysis has many suggested new is_a links for GO based on GOCHE.
      • EQUIVALENT results show those GO terms that have the same cross-product definition.
      • merged 'butyrate' and 'butanoic acid' GO terms and 'lipoate' and 'lipoic acid' GO terms.
    • relationship in GO only justified by traversing has_functional_parent in CHEBI
      • 19 to review; either fixed GO or fixed GOCHE to correct errors. Most appear to arise because the relationship in CHEBI should be has_part and not has_functional_parent
    • relationship in GO only justified by traversing has_part in CHEBI
      • 2 to review: hypochlorous acid metabolic process is_a halogen metabolic process (GET RID OF!)
      • vitamin A transport is_a terpenoid transport (Obsoleted vitamin A in GOCHE.)
  • Discussion about whether anything in GOCHE should be (e.g.) has_part amine
    • David says no, should be is_a
    • Midori agrees:
      • amine has_part amino group; alcohol has_part hydroxyl group; etc.
      • these are the chemical definitions, and are ChEBI's share of this work
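
The EQUIVALENT results mentioned under the files reviewed can be pictured as grouping GO terms by their cross-product definition. The sketch below is a simplification under stated assumptions: each definition is reduced to a hypothetical (process, chemical) pair, the term names are illustrative, and no reasoner such as Pellet is involved.

```python
from collections import defaultdict


def equivalent_groups(xp_defs):
    """Group terms that share an identical cross-product definition."""
    groups = defaultdict(list)
    for term, definition in xp_defs.items():
        groups[definition].append(term)
    # Only groups with more than one member are merge candidates.
    return [sorted(terms) for terms in groups.values() if len(terms) > 1]


# Illustrative definitions: after acids are mapped to their conjugate bases,
# the 'butanoic acid' and 'butyrate' terms end up with the same definition.
xp_defs = {
    "butyrate metabolic process": ("metabolic process", "butyrate"),
    "butanoic acid metabolic process": ("metabolic process", "butyrate"),
    "lipoate metabolic process": ("metabolic process", "lipoate"),
}
groups = equivalent_groups(xp_defs)
```

Terms that land in the same group are candidates for merging, as was done for the 'butyrate' and 'butanoic acid' terms.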

Post-meeting notes

  • Ask ChEBI about nucleotides: do they have 1-3 phosphates only? What about guanosine tetraphosphate and pentaphosphate?
  • What do we mean by 'thymidine': deoxy or not? Example: 'thymidine kinase activity'.
  • ChEBI: hemoglobin is NOT a protein