Taxon-GO Implementation April 2009 onwards

1st April 2009

This work has been on hold for a while because we were waiting for more sophisticated filtering in OBO-Edit, which would let all editors edit the file during normal live file editing. However, Chris has asked that we push on without the filtering for now.


Loaded the files in OE2 and found some file problems. The UnionTerm stanzas were a bit messed up, but were fixed by adding a Typedef for the union_of relationship and removing the top term.

Other file errors found and fixed:

Two entire taxa had gone missing from the taxon slim, and we had also lost a union_of term from the UnionTerms file. I do not know how that happened but I have replaced them.

Midori had added a new only_in_taxon link to a union_of term that did not exist. I have made the union_of stanza and added it to the file. I should check with Midori that she knows all the ins and outs of the editing of these files. I did not know she had started.
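
A check like the one below could catch this kind of dangling link before the files are committed. This is only a minimal sketch: it assumes the links are written as "relationship: only_in_taxon <target>" lines inside OBO stanzas, and that the source files have been concatenated into one string. The function name, the sample IDs and the sample text are all made up for illustration.

```python
def dangling_targets(obo_text, rel="only_in_taxon"):
    """Return (source_id, target_id) pairs whose target has no stanza.

    Assumes `obo_text` holds every stanza that a link may point at,
    e.g. the source files concatenated together (an assumption).
    """
    defined, used = set(), []
    in_stanza, current = False, None
    for raw in obo_text.splitlines():
        # Drop trailing "! comment" text, then surrounding whitespace.
        line = raw.split("!", 1)[0].strip()
        if line.startswith("["):
            # A new stanza header such as [Term] or [Typedef].
            in_stanza, current = True, None
        elif not in_stanza:
            continue  # still in the file header
        elif line.startswith("id:"):
            current = line[3:].strip()
            defined.add(current)
        elif current and line.startswith("relationship:"):
            parts = line[len("relationship:"):].split()
            if len(parts) >= 2 and parts[0] == rel:
                used.append((current, parts[1]))
    return [(src, tgt) for src, tgt in used if tgt not in defined]


# Hypothetical sample data; these IDs do not come from the real files.
SAMPLE = """\
format-version: 1.2

[Term]
id: EX:0000001
name: example term
relationship: only_in_taxon UNION:0000001 ! stanza exists below
relationship: only_in_taxon UNION:0000002 ! no stanza anywhere

[Term]
id: UNION:0000001
name: example union term
"""

for src, tgt in dangling_targets(SAMPLE):
    print(src, "points at missing", tgt)
```

The check is deliberately line-based rather than a full OBO parser, so it would miss escaped "!" characters and other edge cases.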

I have recommitted the edited source files and also saved out and recommitted the all_files_mid_edit.obo file. I did not make any edits so the perl scripts do not need to be run.

Fixed this malformed tag 'name: synonym: "synonym: "synonym: "' in both the edit file and the source.

Set up a WinXP laptop to run the perl scripts that generate the tab-delimited file the users need in order to act on these links.

TODO: There are a bunch of terms that currently have two only_in_taxon links, and they mess up the conversion to tab-delimited. Need to resolve these relationships. Have deleted them from the file for now. This is the list:

GO:0048494	chromatophore ribulose bisphosphate carboxylase complex
GO:0030075	plasma membrane-derived thylakoid
GO:0030094	plasma membrane-derived photosystem I
GO:0030096	plasma membrane-derived thylakoid photosystem II
GO:0031676	plasma membrane-derived thylakoid membrane
GO:0031979	plasma membrane-derived thylakoid lumen
GO:0048493	plasma membrane-derived thylakoid ribulose bisphosphate carboxylase complex
GO:0009521	photosystem
GO:0030077	plasma membrane light-harvesting complex
GO:0042716	chromatophore
GO:0009760	C4 photosynthesis
GO:0009761	CAM photosynthesis
GO:0016168	chlorophyll binding
GO:0030093	chloroplast photosystem I
GO:0030095	chloroplast photosystem II
GO:0030089	phycobilisome
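
To find terms like these before running the conversion, something along the following lines might help. It is a rough sketch, not the perl scripts mentioned above: it assumes the links appear as "relationship: only_in_taxon <target>" lines inside [Term] stanzas, and the function name and sample IDs are invented for illustration.

```python
from collections import defaultdict


def terms_with_multiple_links(obo_text, rel="only_in_taxon"):
    """Return {term_id: [targets]} for [Term] stanzas with more than one `rel` link."""
    links = defaultdict(list)
    in_term, term_id = False, None
    for raw in obo_text.splitlines():
        # Drop trailing "! comment" text, then surrounding whitespace.
        line = raw.split("!", 1)[0].strip()
        if line.startswith("["):
            # Only [Term] stanzas are inspected; others are skipped.
            in_term, term_id = (line == "[Term]"), None
        elif in_term and line.startswith("id:"):
            term_id = line[3:].strip()
        elif in_term and term_id and line.startswith("relationship:"):
            parts = line[len("relationship:"):].split()
            if len(parts) >= 2 and parts[0] == rel:
                links[term_id].append(parts[1])
    return {t: targets for t, targets in links.items() if len(targets) > 1}


# Hypothetical sample data; these IDs do not come from the real files.
SAMPLE = """\
[Term]
id: EX:0000001
name: term with two links
relationship: only_in_taxon TAXON:A ! first taxon
relationship: only_in_taxon TAXON:B ! second taxon

[Term]
id: EX:0000002
name: term with one link
relationship: only_in_taxon TAXON:A
"""

for term, targets in terms_with_multiple_links(SAMPLE).items():
    # One tab-delimited row per offending term, targets joined by commas.
    print(term + "\t" + ",".join(targets))
```

Running this on the edit file before the conversion would give a list to work through, instead of deleting the links by hand.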

I am wondering what the violations.txt file in CVS is. There is a readme, but the URLs in it lead to a dead end.

9th April 2009

Looked through the rest of Michelle's taxon spreadsheet and sent a few questions to her before doing the final edits:

> 2.  regarding "flagellin-based flagella...." terms - I left these as prok,
> but Archaeal flagella have "flagellin-based" flagella too, but the
> flagellins are different than bacterial flagellins - and also they have
> different overall flagellar structures.  Therefore, the "flagellin-based"
> terms do not provide a way to distinguish between the Arch and Bact types.
> These will likely need to be revised.

Oh that's good to know. So for now I should make these prokaryote,
but later we might like to make child terms then?

> 3.  Archaea do not have peptidoglycan, but they have other cell wall
> materials that are similar - we don't have terms to cover those.

Good to know. So I'll leave these as Bacterial for now and we can add Archaeal terms later.

> 4.  I notice that there are terms like "plant-type cell wall" and
> "fungal-type cell wall".  Can't we then have "Bacterial-type cell wall" and
> "Archaeal-type cell wall"?  Or even "bacterial-type flagella" and
> "archaeal-type flagella"?  That would be more clear I think.

Yes that would be fine. Which ones would you like me to add? If you
put a column in the attached spreadsheet and write the names in I can do the edits.

27th/28th April 2009

The prokaryotic terms are now finished and a list of questions has been sent to SGD for the fungal terms.

A plan has been formed to finish the sensu terms and release the file in time to submit a paper to this conference:

IEEE International Conference
on Bioinformatics and Biomedicine (BIBM09)
Washington DC, USA, Nov. 1-4, 2009
Electronic submission of full papers: July 10, 2009