Biological process xp self ProgressNotesNov2008

From GO Wiki
Revision as of 07:36, 9 April 2009 by Jdeegan (talk | contribs)

11th November

Jennifer Deegan

In a meeting today at the Editorial Office we figured out who will be responsible for which cross products file. I am responsible for the 'biological_process_xp_self' file.

I am to read through the file and check the biological content of the intersection tags. Any edits are to be made by hand in the file, though OBO-Edit2 can currently be used as a browser. Where specific domain knowledge is needed I can send a list of terms to people who have the relevant expertise.

The hope is that these cross products should be in the editors file by the beginning of January and in the public domain by the end of March.

12th November

I have started trying to work with the files. I loaded up the files list in my import file and made the following notes about what happened.

File warnings:

When I load the files I get two orphans at the top. They are
ID = obol:culmination and
ID = interphase_by_interphase_microtubule_organizing_center

Verification manager warnings on load:

culmination during sorocarp development (GO:0031154) generated 2 warnings:
 The term culmination during sorocarp development (GO:0031154) links to the dangling identifier obol:culmination
 The cross product definition of culmination during sorocarp development (GO:0031154) refers to a dangling parent /obol:culmination\.

derived_into (OBO_REL:derived_into) generated 1 warning:
 The term derived_into (OBO_REL:derived_into) links to the secondary identifier OBO_REL:derives_from

has_improper_part (OBO_REL:has_improper_part) generated 1 warning:
 The term has_improper_part (OBO_REL:has_improper_part) links to the obsolete term improper_part_of (OBO_REL:improper_part_of)

improper_part_of (OBO_REL:improper_part_of) generated 1 warning:
 The term improper_part_of (OBO_REL:improper_part_of) links to the obsolete term has_improper_part (OBO_REL:has_improper_part)

interphase microtubule nucleation by interphase microtubule organizing center (GO:0051415) generated 2 warnings:
 The term interphase microtubule nucleation by interphase microtubule organizing center (GO:0051415) links to the dangling identifier interphase_by_interphase_microtubule_organizing_center
 The cross product definition of interphase microtubule nucleation by interphase microtubule organizing center (GO:0051415) refers to a dangling parent /interphase_by_interphase_microtubule_organizing_center\.


OBO-Edit issues:

GUI:
If I do a link search for anything that has 'is intersection', the last column of the results is very wide and cannot be made narrower.

Link Search usage:
Selecting things in the link search results panel does not result in them being shown in the OTE or in the Graph viewer. The OTE moves to a new place but there is no way of knowing which term I should be looking at.

Font size:
The tab on the text editor has its name text much smaller than the text on all the other component tabs.

Memory:
OBO-Edit ran out of memory and crashed on my Mac, even with the reasoner off.
Components: 1 OTE, Graphviz Viewer, Graph viewer, text editor, link search + one results panel. The cause of the crash was the graph viewer trying to display a term.
This also happens if I load the files and have a link search + results, a term search (unused), a graph viewer and a parent editor.

Config:
Search doesn't remember its config settings across restarts.

These issues have all been added to the tracker.

Processing XP files to be checked mostly by eye

I have written two scripts to help with checking the XP composition. Currently all we have are term names to go on, and I think that having the defs available alongside term names will save time. The scripts are:

/go/cvs/go/software/utilities/stripGOfile.pl
/go/cvs/go/software/utilities/addDefsToXPfile.pl

The first changes the GO live file to tab-delimited format:

GO:id \t definition
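
As an illustration of this conversion, here is a minimal Python sketch. It is not the contents of stripGOfile.pl; the function name `strip_obo` is hypothetical, and it assumes each term's `id:` line comes before its `def:` line, with each definition on a single line.

```python
def strip_obo(lines):
    """Yield 'GO:id<TAB>definition' for each [Term] stanza that has a def line."""
    current_id = None
    in_term = False
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith("["):
            # A new stanza starts; only [Term] stanzas are of interest here.
            in_term = (line.strip() == "[Term]")
            current_id = None
        elif in_term and line.startswith("id: "):
            # Drop any trailing '! name' comment after the ID.
            current_id = line[4:].split("!")[0].strip()
        elif in_term and line.startswith("def: ") and current_id:
            yield f"{current_id}\t{line[5:].strip()}"
```

Calling `"\n".join(strip_obo(open(path)))` on an OBO file would give the tab-delimited text described above, with the quoted definition and its dbxref list kept together in the second column.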

The second takes that file and the XP file with this format:

[Term]
id: GO:0000022 ! mitotic spindle elongation
intersection_of: GO:0051231 ! spindle elongation
intersection_of: part_of GO:0007067 ! mitosis 

[Term]
id: GO:0000070 ! mitotic sister chromatid segregation
intersection_of: GO:0007059 ! chromosome segregation
intersection_of: part_of GO:0007067 ! mitosis

and where a term id is quoted it adds in the def of the term underneath as follows:

[Term]
id: GO:0000022 ! mitotic spindle elongation
def: "Lengthening of the distance between poles of the mitotic spindle." [GOC:mah] 

intersection_of: GO:0051231 ! spindle elongation
def: "The cell cycle process whereby the distance is lengthened between poles of the spindle." [GOC:ai] 

intersection_of: part_of GO:0007067 ! mitosis
def: "Progression through mitosis, the division of the eukaryotic cell nucleus to produce two daughter nuclei that, usually, contain the identical chromosome complement to their mother." [GOC:ma, ISBN:0198547684]
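
The annotation step can be sketched in Python as follows. This is an illustration under stated assumptions, not the contents of addDefsToXPfile.pl: the names `load_defs` and `add_defs` are hypothetical, GO IDs are assumed to be `GO:` followed by seven digits, and the ID is assumed to sit before the `!` comment on each line.

```python
import re

GO_ID = re.compile(r"GO:\d{7}")

def load_defs(tsv_lines):
    """Build {GO id: definition text} from 'GO:id<TAB>definition' lines."""
    defs = {}
    for line in tsv_lines:
        go_id, sep, definition = line.rstrip("\n").partition("\t")
        if sep:
            defs[go_id] = definition
    return defs

def add_defs(xp_lines, defs):
    """Return the XP lines with a 'def:' line after each id/intersection_of line."""
    out = []
    for raw in xp_lines:
        line = raw.rstrip()
        out.append(line)
        if line.startswith(("id:", "intersection_of:")):
            # Look only at the part before the '!' comment, which holds the ID.
            match = GO_ID.search(line.split("!")[0])
            if match and match.group() in defs:
                out.append(f"def: {defs[match.group()]}")
                out.append("")
    return out
```

Lines whose ID has no definition in the lookup are passed through unchanged, so a missing def shows up as a visible gap rather than stopping the run.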

Processing file

I have checked the copy of my file with defs into scratch/xps and I will delete terms from that file as they are checked and found to be fine. The file is called biological_process_xp_self_with_defs.obo.

13th November

I have read right through the file and picked out the terms that need changes. These terms are now the only ones in the file xps/biological_process_xp_self_with_defs.obo. I have also made notes on the things that I noticed and on the specific changes needed in the file xps/biological_process_xp_self_notes.txt.

Next I will start making the changes that are needed and will report back to Chris the things that I found.

December

Chris says that when OBO-Edit displays a new relationship type as is_a then we should in fact include the typedef in the relationship file. He is going to arrange that.

I have been looking into the problem of disjoints in the Tree Viewer, and there is a proposal to have a single disjoint relationship between pairs of terms, as the relationship is symmetrical anyway. There is opposition to this proposal, and requests that OBO-Edit be changed to handle the pairs of relationships.

I discovered that the Graph Viewer is loading typedefs from the live file, while the OTE is loading typedefs from other files. I am looking into how relationships are handled generally so as to figure out what needs to be done.

11th December

I have tried loading my files and running the reasoner. The reasoner runs in a minute or so, but fatal errors are produced:

M phase of meiotic cell cycle (GO:0051327) generated 1 error:
 M phase of meiotic cell cycle (GO:0051327) is part of a cycle over the property part_of.

M phase of mitotic cell cycle (GO:0000087) generated 1 error:
 M phase of mitotic cell cycle (GO:0000087) is part of a cycle over the property part_of.

maintenance of turgor in appressorium by melanization (GO:0075043) generated 1 error:
 maintenance of turgor in appressorium by melanization (GO:0075043) has disjoint superclasses multi-organism process and cellular process

meiosis (GO:0007126) generated 1 error:
 meiosis (GO:0007126) is part of a cycle over the property part_of.

mitosis (GO:0007067) generated 1 error:
 mitosis (GO:0007067) is part of a cycle over the property part_of.

smooth muscle relaxation of the bladder outlet (GO:0060085) generated 1 error:
 smooth muscle relaxation of the bladder outlet (GO:0060085) is part of a cycle over the property is_a (OBO_REL:is_a).

synaptic transmission involved in micturition (GO:0060084) generated 1 error:
 synaptic transmission involved in micturition (GO:0060084) is part of a cycle over the property is_a (OBO_REL:is_a). 

12th December meeting

We had a meeting to discuss cross products and we all gave progress updates. We discussed the issue of disjoint symmetry and the idea of only putting one disjoint relationship between pairs of terms. I have committed to test all the OBO-Edit components to see how many are broken by this and how many are fixed. I do not completely understand the situation with the OTE, so I need to remember to mention that this is not fully tested.

We discussed the idea of saving from OE2 to gene_ontology_write.obo, and I am to write the first draft of the proposal.

1st April 2009

I have commented out all of the problematic terms in the self file and written to Chris to ask if he has a script to move them to the new unvetted file that I have made. David and I are planning to start looking at the paths to root on the vetted terms soon.

3rd April 2009

Participants: David Hill, Jen Deegan

We started working through the unvetted file and making changes. We just changed the live GO graph and made notes on how the rules in obol might be changed, on the understanding that Chris will reparse to improve the intersection tags. We have not edited the intersection tags at all. I made notes on our changes in the unvetted file, and each note starts with '!!'.

I will write to Chris to tell him what we have done.

Later: Chris wrote back to say that we should also update the intersection file by hand in this file as it is not sensible in this case to regenerate the intersection links by obol.

9th April 2009

I have loaded up the intersection file with the current live GO and am checking the ancestors of the logically defined terms to see that they make sense. Previously I only checked that the names and defs of the defined terms and intersection terms made sense.

1) Terms GO:0031180 to GO:0031200 have been merged into other terms, so I will delete them from the intersection file. The same goes for GO:0031178.
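
A deletion like this could also be scripted rather than done by hand. The sketch below is only an illustration: the function name `drop_stanzas` is hypothetical, stanzas are assumed to be separated by a single blank line, and "GO:0031180 to GO:0031200" is read as an inclusive numeric range.

```python
def drop_stanzas(xp_text, ids_to_drop):
    """Return xp_text without the stanzas whose id is in ids_to_drop."""
    kept = []
    for stanza in xp_text.split("\n\n"):
        stanza_id = None
        for line in stanza.splitlines():
            if line.startswith("id:"):
                # Drop any trailing '! name' comment after the ID.
                stanza_id = line[3:].split("!")[0].strip()
                break
        if stanza_id not in ids_to_drop:
            kept.append(stanza)
    return "\n\n".join(kept)

# Assumption: the merged terms are every ID from GO:0031180 to GO:0031200 inclusive,
# plus GO:0031178. IDs in the range that are not in the file are simply ignored.
merged_ids = {f"GO:{n:07d}" for n in range(31180, 31201)} | {"GO:0031178"}
```

Writing `drop_stanzas(text, merged_ids)` to a new file and comparing it with the original would show exactly which stanzas were removed before anything is committed.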

I have sent two questions to the ontology developers list:

I have just been looking at the ancestors of my cross product terms and I found 
a weird thing that could maybe be tested for systematically in the other stanzas.
In the Graphviz view below you can see that there is a cycle between 'mitosis', 
'mitotic cell cycle' and 'M phase of mitotic cell cycle'.
The cycle is made up of intersection tags and normal relationships, but it is still wonky.
Do you think such cycles could be checked by a script or something?
I don't think we can currently do it in OBO-Edit.

Mitosis.jpg
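
One way such a check might be scripted is sketched below in Python. It is a rough illustration, not an existing GO tool: the names `parse_edges` and `find_cycle` are hypothetical, and the sketch treats every `is_a`, `relationship` and `intersection_of` tag as a plain child-to-parent edge regardless of relation type. That is cruder than what a reasoner does, so a reported cycle is only a candidate to look at by eye.

```python
def parse_edges(lines):
    """Collect child -> parent edges from is_a, relationship and intersection_of tags."""
    edges = {}
    current = None
    for raw in lines:
        line = raw.split("!")[0].strip()  # drop trailing '! name' comments
        if line.startswith("id:"):
            current = line[3:].strip()
            edges.setdefault(current, set())
        elif current and line.startswith(("is_a:", "relationship:", "intersection_of:")):
            # The target ID is the last token, whether or not a relation name precedes it.
            tokens = line.split(":", 1)[1].split()
            if tokens:
                edges[current].add(tokens[-1])
    return edges

def find_cycle(edges):
    """Return one cycle as a list of IDs (first == last), or None if there is none."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {}
    path = []

    def visit(node):
        colour[node] = GREY
        path.append(node)
        for parent in sorted(edges.get(node, ())):
            state = colour.get(parent, WHITE)
            if state == GREY:
                # The parent is already on the current path, so the path loops back to it.
                return path[path.index(parent):] + [parent]
            if state == WHITE:
                found = visit(parent)
                if found:
                    return found
        path.pop()
        colour[node] = BLACK
        return None

    for start in sorted(edges):
        if colour.get(start, WHITE) == WHITE:
            cycle = visit(start)
            if cycle:
                return cycle
    return None
```

Feeding the lines of the live GO file and the cross product file into the same `parse_edges` call merges both sets of links for each term, which is what lets a cycle made up of a mixture of intersection tags and normal relationships be found.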

If I find a graph, like the one attached, where the intersection tags show exactly the same as 
the existing relationships, is it right that I leave both sets in? Would I be right in thinking that 
the normal relationships show what is 'necessary' for the child term, whereas the intersections 
show what is 'necessary and sufficient', and that this is why we keep both?

Relationships.jpg


I am having trouble keeping tabs on which terms I have checked: I was putting !! on the end of the term names in the xp file, but these get removed when I load the file into OBO-Edit.