Editors cross-product implementation plan

From GO Wiki
Latest revision as of 07:17, 12 June 2009

  • Start with internal xps first - do external later. Files distributed as follows:
XP:biological process_xp_self - Jen & David
XP:cellular component_xp_go - Midori
XP:cellular component xp self - Jane
XP:molecular function xp regulators - David & Tanya
XP:biological process xp regulation - David & Tanya
XP:biological process xp multi organism process - Jane
XP:biological process xp cellular component - Midori
  • The first phase will be a transition phase, with files being edited outside of the GO files. Essentially, the person in charge of each file will check that all of the XP definitions are okay, and edit as they see fit. The editing will probably be done by hand in the text file and committed back to cvs.
  • Minimise duration of transition phase: we will aim for XPs being in the editors file at start of Jan 2009. This will be the second phase. Note that this is dependent on OE2 being released unless we use the OE2 beta for editing the editors file.
  • Start the editor training (i.e. editors outside of editorial office) before Jan 2009: one WebEx training session some time before Christmas
  • End of March 2009 - final phase = XPs in GO users file & OE2 released!
See also Chris's email 7-11-2008

Questions we explored in this meeting:

Q/ Annotators have asked if new cross product terms will always need to be instantiated in the ontology or if they might one day be able to do, for example, a binding term annotation by annotating to the GO term 'binding' and the ChEBI term for the substance bound.

A/ The conclusion was that we will always instantiate cross product terms in the ontology. For such simple term requests we could in the future allow annotators to automatically request terms by just quoting the ids for the constituent terms and having software generate the terms overnight. The terms would be checked by curators, but the process would be essentially automatic and almost immediate.

(additional answer: see also Annotation_Cross_Products. Yes, we will always create cross-product terms within the ontology, but I would qualify that and say "within reason". We would not create terms for arbitrary combinations. What is arbitrary is partially subjective and open to discussion and may evolve over time. For arbitrary combinations of terms, the xp can be done at annotation time --CJM)

Q/ What do we need from OE2 that we don't have in OE1?

A/

  • We need the new reasoner's output to be checked to see that it doesn't have any built in assumptions that are biologically wrong. This means that we need to check through the intersection links that it is producing and see if they are correct. We have assigned the files to various of the curators so that this process can begin.
  • We hoped to have the filtered save mechanism sophisticated enough to save out the files in the form that they were loaded in. However, this is some way off and we think it would be much more sensible to have the cross product files filtered out between the editor version and the public files using a script.
  • The cross product tab in the text editor component of OE2 should do the same things that the same component did in OE1.
  • The reasoner should be fast enough to run about once a week so that it can generate implied links. It currently takes about 4 minutes on Jennifer's laptop.

Q/ What will the curators do with cross products once they are fully released?

A/ When a new term is added the curators will check to see if it might also require intersection links. If it does then these will be added. Periodically the reasoner will be run to see if it detects any further relationships that can be inferred from these cross product intersection tags. If any new relationships are produced then the curators will check them and allow them to be saved into the file if they make sense. If they do not make sense then the information will be fed back to Chris Mungall to update the reasoner logic.

A walk-through of handling one set of Cross products

Chris has made a list of the cross product sets on this page:

http://wiki.geneontology.org/index.php/Category:Cross_Products

Choose a set

If you choose the set you are interested in and click the link in the table, then that takes you to a page all about that set.

In this example we will assume that you are using the XP:biological_process_xp_self set.

Read about the set and load the files

The page gives some information about the links, together with the import file, which shows which files to load to start looking at the intersection tags. For the XP:biological_process_xp_self set the import file looks like this:

import: biological_process_xp_self.obo
import: ../../ontology/editors/gene_ontology_write.obo

To load the files you will first need to check them out from cvs. You can find out which to check out by looking at the paths in the import file.
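As a minimal sketch of reading the paths out of an import file, the following lists every path named on an "import:" line. The function name parse_imports is hypothetical, and the sketch assumes the import file contains only simple "import: <path>" lines like the two shown above.

```python
def parse_imports(text):
    """Return the paths named on 'import:' lines, in file order."""
    paths = []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("import:"):
            # Everything after the tag name is the path to load.
            paths.append(line[len("import:"):].strip())
    return paths


# The import file for the biological_process_xp_self set, as shown above.
IMPORT_FILE = """\
import: biological_process_xp_self.obo
import: ../../ontology/editors/gene_ontology_write.obo
"""

if __name__ == "__main__":
    for path in parse_imports(IMPORT_FILE):
        print(path)
```

The relative paths are resolved against the directory holding the import file, so they indicate which cvs directories need to be checked out.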

Check the links

For the biological_process_xp_self.obo file we can now make all edits in OBO-Edit. For some other files we can currently check the links in OBO-Edit2 but any alterations must be made in the text file of intersection links. Any alterations should be reported to Chris by e-mail.

It can be helpful to show terms with intersection tags highlighted in bold. To do this, use the global render shown in the wiki image IntersectionBold.jpg.

Save the files

To save the intersection links and live file separately it is necessary to use filters in the save dialog and then process one of the files with a perl script.

First set up the filters below.

The first filter produces a file much like the usual live file. However, the file contains only those stanzas that include an intersection tag (wiki image: IntersectionSaveFilter.JPG).

The second filter also saves out a file much like the normal live file. However, the link filter ensures that only those relationship lines that do not include an intersection tag are saved into the file (wiki image: LiveFileSaveFilter.JPG).
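The effect of the two save filters can be illustrated outside OBO-Edit. This is only a sketch of the intended result, not of how OBO-Edit implements its filters: it assumes that an "intersection tag" means a line starting with intersection_of:, and the helper names and the EX: identifiers are hypothetical.

```python
def split_stanzas(text):
    """Split OBO text into (header_lines, stanzas); a stanza is a list of lines."""
    header, stanzas, current = [], [], None
    for line in text.splitlines():
        if line.startswith("["):        # stanza opener, e.g. [Term] or [Typedef]
            current = [line]
            stanzas.append(current)
        elif not line.strip():          # blank separator lines are dropped
            continue
        elif current is None:           # anything before the first stanza is header
            header.append(line)
        else:
            current.append(line)
    return header, stanzas


def is_intersection_line(line):
    return line.startswith("intersection_of:")


def intersection_file(stanzas):
    """First filter: keep only stanzas that contain an intersection tag."""
    return [s for s in stanzas if any(is_intersection_line(l) for l in s)]


def live_file(stanzas):
    """Second filter: keep every stanza, minus its intersection tag lines."""
    return [[l for l in s if not is_intersection_line(l)] for s in stanzas]


# Hypothetical two-term example; the EX: ids are made up for illustration.
SAMPLE = """\
format-version: 1.2

[Term]
id: EX:0000001
name: example process
is_a: EX:0000003

[Term]
id: EX:0000002
name: regulation of example process
intersection_of: EX:0000003
intersection_of: regulates EX:0000001
"""

if __name__ == "__main__":
    header, stanzas = split_stanzas(SAMPLE)
    print(len(intersection_file(stanzas)), "stanza(s) in the intersection file")
    print(len(live_file(stanzas)), "stanza(s) in the live file")
```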

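Next, process the intersection file (in this case biological_process_xp_self-edited-filtered.obo) using obo-filter-tags.pl. This script is provided as part of go-perl. To find the script, check out go-perl from GO SourceForge CVS (http://geneontology.cvs.sourceforge.net/viewvc/geneontology/go-dev/go-perl/), or just check out the obo-filter-tags.pl script (http://geneontology.cvs.sourceforge.net/viewvc/geneontology/go-dev/go-perl/scripts/obo-filter-tags.pl?view=log). You can also check out go-perl from cvs on the command line or in a Windows CVS GUI (see the wiki page checkout_go-perl_from_cvs).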

To process the file in a unix environment use the following command:

perl obo-filter-tags.pl -t id -t intersection_of --no_header \
  biological_process_xp_self-edited-filtered.obo > biological_process_xp_self.obo

Or, on Windows, you can run obo-filter-tags.pl in Eclipse (see the wiki page run_obo-filter-tags.pl_in_Eclipse).
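To make clear what the command is expected to produce, here is a rough Python approximation of keeping only the id and intersection_of tags and dropping the header. It is not the go-perl implementation, and the real script's output may differ in details such as blank lines; the function name filter_tags and the EX: identifiers are hypothetical.

```python
def filter_tags(text, keep=("id", "intersection_of")):
    """Keep stanza openers plus only the tags named in `keep`; drop the header."""
    out = []
    in_stanza = False
    for line in text.splitlines():
        if line.startswith("["):            # [Term], [Typedef], ...
            in_stanza = True
            if out:
                out.append("")              # blank line between stanzas
            out.append(line)
        elif in_stanza and line.split(":", 1)[0] in keep:
            out.append(line)
        # Header lines (before the first stanza) and all other tags are skipped.
    return "\n".join(out) + "\n"


# Hypothetical input; the EX: ids are made up for illustration.
SAMPLE = """\
format-version: 1.2

[Term]
id: EX:0000001
name: example process

[Term]
id: EX:0000002
name: regulation of example process
intersection_of: EX:0000003
intersection_of: regulates EX:0000001
"""

if __name__ == "__main__":
    print(filter_tags(SAMPLE), end="")
```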

After these filtering steps the two files will closely resemble the original formats of the intersection and live files. However, you should examine a diff carefully and make any further adjustments that are needed. There may be relationship Typedefs at the bottom of the live file and there may be extra lines in the headers.
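Any diff tool will do for this check (cvs diff, for example). As one possible sketch, Python's standard difflib module can produce a unified diff between the original and the regenerated text; the function name show_diff and the sample strings are hypothetical.

```python
import difflib


def show_diff(original_text, regenerated_text, name="biological_process_xp_self.obo"):
    """Return unified-diff lines between the original and regenerated file text."""
    a = original_text.splitlines(keepends=True)
    b = regenerated_text.splitlines(keepends=True)
    return list(
        difflib.unified_diff(
            a, b,
            fromfile=name + " (original)",
            tofile=name + " (regenerated)",
        )
    )


if __name__ == "__main__":
    # Hypothetical content: the regenerated file has picked up a Typedef stanza.
    original = "[Term]\nid: EX:0000001\n"
    regenerated = original + "\n[Typedef]\nid: regulates\n"
    for line in show_diff(original, regenerated):
        print(line, end="")
```

Lines prefixed with "+" in the output are the ones to review, such as an unexpected Typedef stanza or extra header lines.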

Additional workflow notes

Plan as of June 2009

  • Short term (i.e. at least next several months)
    • leave both asserted links and intersection_of tags in go_write and go_ext, even if they're redundant
    • strip intersection_of tags from files downstream of go_ext
    • include [Typedef] stanzas for relations used, but don't include the rest of RO (or ro_proposed)
  • Long term
    • remove asserted links that would be redundant with links implied from xps (go_write, go_ext)
    • use "assert implied links" to generate version with asserted links instead of intersection_of tags (downstream files)
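One of the short-term points, including [Typedef] stanzas for the relations used, lends itself to a simple check. The sketch below reports relations that appear in relationship: or intersection_of: lines without a matching [Typedef] stanza in the same file. It is an illustration under simplifying assumptions (one tag per line, "!" starts a trailing comment); the function name and the EX: identifiers are hypothetical.

```python
def relations_missing_typedefs(text):
    """List relations used in the file that have no [Typedef] stanza of their own."""
    used, defined = set(), set()
    stanza_type = None
    for raw in text.splitlines():
        line = raw.split("!", 1)[0].strip()     # drop trailing OBO comments
        if line.startswith("["):
            stanza_type = line
        elif stanza_type == "[Typedef]" and line.startswith("id:"):
            defined.add(line[len("id:"):].strip())
        elif line.startswith(("relationship:", "intersection_of:")):
            parts = line.split()
            # "tag: <relation> <target>" has three parts; a genus line
            # ("intersection_of: <target>") has two and names no relation.
            if len(parts) >= 3:
                used.add(parts[1])
    return sorted(used - defined)


# Hypothetical example; the EX: ids are made up for illustration.
SAMPLE = """\
[Term]
id: EX:0000002
intersection_of: EX:0000003 ! hypothetical genus
intersection_of: regulates EX:0000001 ! hypothetical differentia
relationship: part_of EX:0000004

[Typedef]
id: part_of
name: part_of
"""

if __name__ == "__main__":
    print(relations_missing_typedefs(SAMPLE))
```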

Meetings

XP:Meeting XPs 2008-12-11


