OBO Parser - Getting Started

Java Event Based Parser for OBO - Getting Started:

The Java event-based parser is part of the OBO-Edit source code, in the org.geneontology.oboedit.dataadapter package. The OBO-Edit sources are available from our SourceForge CVS repository. For information on accessing the repository, see:

   http://sourceforge.net/cvs/?group_id=36855

We apologize in advance for the very sparse documentation in these source files.

The classes of interest are:

  • GOBOParser
  • DefaultGOBOParser
  • GOBOParseEngine

The basic idea is that GOBOParseEngine reads and parses a Collection of OBO files and generates events (such as "readID" and "readDefinition"). Each GOBOParseEngine is associated with an implementation of GOBOParser. Each time GOBOParseEngine generates an event, it calls the corresponding GOBOParser method. Thus, if GOBOParseEngine sees the line "name: kinase" in an OBO file, it will call GOBOParser.readName("kinase", null) on its associated parser.
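The event-dispatch idea can be illustrated with a small, self-contained sketch. Note that TagHandler, MiniEngine and RecordingHandler are hypothetical stand-ins written for this page, not the real GOBOParser or GOBOParseEngine classes; the real interface declares more methods, and their signatures differ (for example, readName takes a second argument). The sketch only shows how an engine turns tag-value lines into method calls on a handler.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a parser interface; NOT the real GOBOParser API.
interface TagHandler {
    void readID(String id);
    void readName(String name);
}

// Hypothetical stand-in for the parse engine: scans lines and fires one
// event per recognized tag.
class MiniEngine {
    private final TagHandler handler;

    MiniEngine(TagHandler handler) {
        this.handler = handler;
    }

    void parse(List<String> lines) {
        for (String line : lines) {
            // Split on the first colon only, so values such as "EX:0000001"
            // keep their own colon.
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue; // not a tag-value line (e.g. "[Term]" or a blank line)
            }
            String tag = line.substring(0, colon).trim();
            String value = line.substring(colon + 1).trim();
            if (tag.equals("id")) {
                handler.readID(value);
            } else if (tag.equals("name")) {
                handler.readName(value);
            }
            // All other tags are ignored in this sketch.
        }
    }
}

// A handler that simply records which events it received, in order.
class RecordingHandler implements TagHandler {
    final List<String> events = new ArrayList<>();

    public void readID(String id) {
        events.add("readID(" + id + ")");
    }

    public void readName(String name) {
        events.add("readName(" + name + ")");
    }
}
```

Feeding the lines "[Term]", "id: EX:0000001" and "name: kinase" to a MiniEngine built around a RecordingHandler records one readID event followed by one readName event. The identifier EX:0000001 is made up for the example.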

DefaultGOBOParser is an implementation of GOBOParser that populates the OBO-Edit datamodels from an OBO file. If you want to use OBO-Edit's datamodels, you can use DefaultGOBOParser like so:

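    import java.util.Collection;
    import java.util.LinkedList;
    // The OBO-Edit classes (OBOSession, DefaultGOBOParser, GOBOParseEngine)
    // must also be imported from the OBO-Edit sources.

    public static OBOSession getSession(String path) {
        DefaultGOBOParser parser = new DefaultGOBOParser();
        GOBOParseEngine engine = new GOBOParseEngine(parser);
        // GOBOParseEngine can parse several files at once
        // and merge them into a single ontology,
        // so we need to provide a Collection to the setPaths() method
        Collection<String> paths = new LinkedList<String>();
        paths.add(path);
        engine.setPaths(paths);
        engine.parse();
        // The parser has now populated the OBO-Edit datamodels
        OBOSession session = parser.getSession();
        return session;
    }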

If you're populating a database, or doing something else where it would just be a waste of memory to use the OBO-Edit datamodels, you can create your own implementation of GOBOParser, and skip the datamodel generation step altogether.
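As a rough sketch of that approach, the handler below forwards each finished term to a sink (which could be a database insert) and keeps only the current term's fields in memory. TermEvents and StreamingTermWriter are hypothetical names invented for this example, and the stanza callbacks are an assumption; a real implementation would implement GOBOParser and its actual methods instead.

```java
import java.util.function.BiConsumer;

// Hypothetical callback interface, for illustration only; the real GOBOParser
// declares its own (larger) set of methods.
interface TermEvents {
    void startStanza(String type);
    void readID(String id);
    void readName(String name);
    void endStanza();
}

// Streams each completed [Term] to a sink instead of building a datamodel.
// Only the fields of the stanza currently being read are held in memory.
class StreamingTermWriter implements TermEvents {
    private final BiConsumer<String, String> sink; // receives (id, name)
    private String id;
    private String name;
    private boolean inTerm;

    StreamingTermWriter(BiConsumer<String, String> sink) {
        this.sink = sink;
    }

    public void startStanza(String type) {
        // Reset per-stanza state; only "Term" stanzas are written out.
        inTerm = "Term".equals(type);
        id = null;
        name = null;
    }

    public void readID(String id) {
        this.id = id;
    }

    public void readName(String name) {
        this.name = name;
    }

    public void endStanza() {
        if (inTerm && id != null) {
            sink.accept(id, name); // e.g. execute an INSERT here
        }
        inTerm = false;
    }
}
```

The sink is passed in as a BiConsumer so that the same handler can write to a database, a file, or a test list without changes. An engine would call startStanza, the read methods, and endStanza in file order.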