File Format FAQ

The GO Consortium makes the ontologies and annotations available for download in a range of formats. Please see the GO downloads section if you wish to download any specific file.

Why are the ontologies initially produced in OBO flat file format instead of XML?

The ontologies are initially produced in the specially designed OBO flat file format. They are converted to XML once a month for the convenience of users who require this facility. Both formats and many others are available in the GO downloads section.

We use the OBO flat file format because it is far more human-readable, and because the file is much smaller without the XML tags. This makes it much quicker and easier for the curators to handle the file on a day-to-day basis.
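
To illustrate why the flat file is easy to read and handle, here is a minimal Python sketch that splits OBO text into stanzas of tag-value lines. The sample text and the function name parse_obo_stanzas are illustrative assumptions, not part of any GO tool, and the sketch ignores details of the full format such as trailing comments and escape characters.

```python
from collections import defaultdict

# Illustrative sample in the style of an OBO file (an assumption for this sketch).
SAMPLE_OBO = """\
format-version: 1.2

[Term]
id: GO:0000001
name: mitochondrion inheritance
namespace: biological_process

[Term]
id: GO:0000002
name: mitochondrial genome maintenance
namespace: biological_process
"""


def parse_obo_stanzas(text):
    """Return a list of (stanza_type, {tag: [values]}) pairs.

    Header lines that appear before the first stanza are skipped.
    """
    stanzas = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and whole-line comments.
        if not line or line.startswith("!"):
            continue
        if line.startswith("[") and line.endswith("]"):
            # A new stanza such as [Term] or [Typedef] begins here.
            current = (line[1:-1], defaultdict(list))
            stanzas.append(current)
        elif current is not None and ":" in line:
            # Split on the first colon only, so IDs like GO:0000001 stay intact.
            tag, _, value = line.partition(":")
            current[1][tag.strip()].append(value.strip())
    return stanzas


if __name__ == "__main__":
    for stanza_type, tags in parse_obo_stanzas(SAMPLE_OBO):
        print(stanza_type, tags["id"][0], tags["name"][0])
```

Each stanza is a short block of plain "tag: value" lines, which is what lets curators read and edit the file directly.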

Why won't the RDF-XML file parse using RDF parsers?

The GO RDF-XML format was developed some time ago, before the advent of OWL. It has a few unusual features that make it more of a pseudo-RDF format.

The actual RDF is embedded within a <go:go> XML element; this wrapper element should be stripped out before the content is handed to an RDF parser.
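
One way to strip the wrapper is sketched below in Python using the standard library's xml.etree.ElementTree. The sample document, the function name extract_rdf, and the go namespace URI are assumptions made for illustration; only the RDF namespace URI is the standard W3C one. Check the namespace declarations in the real file before relying on this.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Assumed namespace for the go: prefix; verify against the actual file.
GO_NS = "http://www.geneontology.org/dtds/go.dtd#"

# Keep readable prefixes when the element is serialised again.
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("go", GO_NS)

# Illustrative sample shaped like the wrapped export (an assumption).
SAMPLE = """<go:go xmlns:go="http://www.geneontology.org/dtds/go.dtd#"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:RDF>
    <go:term rdf:about="http://www.geneontology.org/go#GO:0000001">
      <go:accession>GO:0000001</go:accession>
    </go:term>
  </rdf:RDF>
</go:go>"""


def extract_rdf(xml_text):
    """Return the embedded rdf:RDF element as a standalone XML string."""
    root = ET.fromstring(xml_text)
    rdf_tag = "{%s}RDF" % RDF_NS
    # Accept input that has already been stripped of its wrapper.
    rdf = root if root.tag == rdf_tag else root.find(rdf_tag)
    if rdf is None:
        raise ValueError("no rdf:RDF element found inside the wrapper")
    return ET.tostring(rdf, encoding="unicode")


if __name__ == "__main__":
    print(extract_rdf(SAMPLE))
```

The returned string has rdf:RDF as its root element and can then be passed to an RDF parser of your choice.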

Note that the GO RDF-XML conforms to a DTD, something that is not normally a requirement of RDF. This is because most people parse the file using conventional XML parsers rather than RDF tools.

For some background on the RDF-XML export, see this email on gofriends

How can I generate files in the old GO flat file format?

The original GO flat file format is deprecated and will no longer be provided by the GO Consortium after 1st Aug 2009.

For cases where you still need to work with this file format, GO provides a script, obo2flat, which converts from the newer OBO format to the original GO flat file format.

obo2flat is a command-line script, written in Java, that comes as part of the OBO-Edit package. For further instructions on how to use obo2flat, please see the OBO-Edit User Guide.
