Difference between revisions of "GOlr"

From GO Wiki
(Loading the data)
Line 49: Line 49:
 
= Local installation (data and schema development) =  
 
  
These instructions illustrate how to install the GOlr server software for AmiGO 2 on an Ubuntu 12.04 LTS Desktop. With careful reading, they should generalize to other cases.
+
GOlr installation information and Solr installation help can be found [[GOlr:_Installation | here]].
 
 
== General notes ==
 
 
 
The following packages are required:
 
 
 
* openjdk-6-jdk
 
* jetty
 
* subversion
 
 
 
Jetty must be set up to start and, if you want it reachable from other machines, to listen on an external address.
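
On Ubuntu's packaged Jetty, startup and the listen address are usually controlled from /etc/default/jetty, which the init script reads as shell. The fragment below is a minimal sketch, assuming the stock Ubuntu 12.04 jetty package and its NO_START, JETTY_HOST, and JETTY_PORT variables; check the comments in your own copy of the file before relying on these names.

```shell
# /etc/default/jetty (fragment; variable names are assumed from the stock Ubuntu package)
NO_START=0          # allow the init script to start Jetty
JETTY_HOST=0.0.0.0  # listen on all interfaces instead of only the local host
JETTY_PORT=8080     # assumed default port; change it if 8080 is taken
```

After editing the file, restart Jetty (for example with sudo service jetty restart) so the new settings take effect.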
 
 
 
Also, you will need Maven 3 (mvn) to build and use OWLTools if you are trying to load data or produce a schema. The version packaged for Ubuntu 12.04 is Maven 2, which is not compatible. Downloading Maven 3 and making sure that it is in your path is a fairly simple process:
 
 
 
mkdir -p ~/local/src/java
 
cd ~/local/src/java
 
wget http://apache.cs.utah.edu/maven/maven-3/3.0.4/binaries/apache-maven-3.0.4-bin.tar.gz
 
tar -zxvf apache-maven-3.0.4-bin.tar.gz
 
 
 
Now just make sure that either ~/local/src/java/apache-maven-3.0.4/bin is in your path or explicitly call ~/local/src/java/apache-maven-3.0.4/bin/mvn each time. You'll probably want to add something like:
 
 
 
export JAVA_HOME=/usr/lib/jvm/java-6-openjdk
 
PATH="${PATH}":"${HOME}"/local/src/java/apache-maven-3.0.4/bin
 
 
 
Add these lines to your ~/.bashrc, then reload it:
 
 
 
source ~/.bashrc
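
Since the system Maven 2 may still be installed, it can be worth confirming which mvn the shell now finds. The following is a small sketch, not part of the GOlr tooling: maven_major and check_maven3 are hypothetical helper names, and the parsing assumes the first line of the mvn -version output looks like "Apache Maven 3.0.4 (...)".

```shell
# Hypothetical helper: read `mvn -version` output on stdin and print the
# major version number (for example 3 for "Apache Maven 3.0.4 (...)").
maven_major() {
  sed -n 's/^Apache Maven \([0-9][0-9]*\)\..*/\1/p' | head -n 1
}

# Hypothetical check: succeeds only if the mvn found on PATH reports Maven 3.
# If mvn is missing, the output is empty and the check fails.
check_maven3() {
  [ "$(mvn -version 2>/dev/null | maven_major)" = "3" ]
}

# Usage (commented out so that sourcing this file has no side effects):
#   check_maven3 && echo "Maven 3 is on the path" || echo "Maven 3 not found"
```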
 
 
 
==Readying OWLTools==
 
 
 
First, we need to download and build OWLTools--it contains the loader code that GO uses for ontology and GAF files, as well as the tools to generate a new Solr schema from AmiGO 2 YAML config files. The code should be self-sufficient and easy to build on a machine with an internet connection.
 
 
 
mkdir -p ~/local/src/svn
 
cd ~/local/src/svn
 
svn checkout http://owltools.googlecode.com/svn/trunk/ owltools
 
 
 
To build:
 
 
 
cd ~/local/src/svn/owltools/OWLTools-Parent/ && mvn clean package
 
 
 
Or, without unit tests (much quicker):
 
 
 
cd ~/local/src/svn/owltools/OWLTools-Parent/ && mvn clean package -DskipTests
 
 
 
The ''owltools'' binary should now be built, so you'll probably want to add it to your path by adding the following to your ~/.bashrc:
 
 
 
PATH="${PATH}":"${HOME}"/local/src/svn/owltools/OWLTools-Runner/bin
 
 
 
Then refresh:
 
 
 
source ~/.bashrc
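
At this point several tools should be reachable from the shell. The sketch below reports any that are not; missing_cmds is a hypothetical helper name, and the list of commands in the usage line is only a guess at what the remaining steps rely on.

```shell
# Hypothetical helper: print a line for each named command that is not on
# PATH, and return non-zero if any of them are missing.
missing_cmds() {
  rc=0
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "missing: $cmd"
      rc=1
    fi
  done
  return "$rc"
}

# Usage (commented out; adjust the list to your setup):
#   missing_cmds java mvn svn git owltools
```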
 
 
 
==Solr with the GO Solr schema (GOlr)==
 
 
 
While we have Jetty installed from the Ubuntu repos, we keep our own Solr in the GOlr repo (version 3.6; the one that comes with Ubuntu 12.04 is ancient). Let's get that first and look at it; even if we don't use the GOlr repo's supplied schema.xml, some of the information in there can be helpful. All GOlr source is in the AmiGO 2 git repo, under amigo/golr. If you have not already done so, get the AmiGO 2 repo.
 
 
 
mkdir -p ~/local/src/git
 
cd ~/local/src/git
 
git clone git://github.com/kltm/amigo.git
 
cd amigo/golr
 
 
 
While we can install/handle our Jetty/Solr setup with the Makefile found in this directory, you may find it easier to just look at it and develop your own setup.
 
 
 
The solr.war for Jetty is at:
 
 
 
~/local/src/git/amigo/golr/solr/solr.war
 
 
 
Deploy this with the make command:
 
 
 
make golr-install
 
 
 
The heavy rollout lifting is done by the emacs script tools/golr.el.
 
 
 
Of course, the most important end product of this section is a correctly organized schema.xml file in a Solr 3.6 setup. However you accomplish that is fine. For example, the schema.xml file used by GO can be found at:
 
 
 
~/local/src/git/amigo/golr/solr/conf/schema.xml
 
 
 
Alternatively, you can generate a custom one by making sure that OWLTools is in the right place (check your environment and the Makefile) and running:
 
 
 
make golr-schema
 
 
 
The solrconfig.xml file in ~/local/src/git/amigo/golr/solr/conf/ might also be useful in your setup.
 
 
 
==Loading the data==
 
 
 
More memory is always welcome, but less is probably sufficient. The PANTHER section needs some special care. If you leave the PANTHER flags in the Makefile as they are now, you will need to make sure that the PANTHER tree data they refer to is present. To get it without checking out the whole (very large) tree, as a GOC member:
 
 
 
cd ~/local/src/svn
 
svn checkout svn+ssh://sjcarbon@ext.geneontology.org/share/go/svn/trunk/experimental/trees/panther_data panther_data
 
mkdir -p ~/local/src/svn/geneontology.org/trunk/experimental/trees/
 
mv panther_data ~/local/src/svn/geneontology.org/trunk/experimental/trees/
 
 
 
Or something similar--you just need the tree data in the right spot. The tree data is optional and can be left out entirely if you don't need it.
 
 
 
As an example, we will use the Makefile profile to do a standard load:
 
 
 
make load-full
 
 
 
And that should be it.
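
One way to confirm that the load put documents into the index is to ask Solr for a match-all query with zero rows and read the total from the response. This is a sketch under assumptions: the local URL in the usage line (Jetty on port 8080, Solr deployed at /solr) is a guess at a typical setup, and num_found is a hypothetical helper that relies on Solr's compact JSON output containing "numFound":N.

```shell
# Hypothetical helper: read a Solr JSON response on stdin and print the
# value of its numFound field (the total number of matching documents).
num_found() {
  sed -n 's/.*"numFound":\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Usage (commented out; the host, port, and path are assumptions):
#   curl -s 'http://localhost:8080/solr/select?q=*:*&rows=0&wt=json' | num_found
```

A count of 0, or no output at all, suggests that the load did not reach the Solr instance you are querying.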
 
 
 
Remember that if you are using a caching server on the front end, you will need to purge its caches.
 
  
 
=Collaborations=
 

Revision as of 13:24, 3 July 2013

Overview

This page describes the status of the public Solr index for the GO. This index will replace some of the query functionality for GOOSE as well as become the new backend for AmiGO 2 and other services. A preview server is now active at http://golr.berkeleybop.org.

Public beta

The preview server at http://golr.berkeleybop.org is currently in beta.

Some preview documentation is at http://cdn.berkeleybop.org/jsapi/bbop-js/docs/index.html

Use

Feel free to use it as you will and start to write for it. As this is still a little early, we are only loading the GO, CL, ECO, and taxslim, as well as non-IEA annotations from the GAFs. Please see the AmiGO_Labs caveats, as well as the downtime information on this page.

Implementation progress

We are now loading properly from owltools.

We have looked at nginx as a reverse proxy for speed and to prevent unauthorized access to non-select URLs on the Solr server, and found it to be good.

The GOlr server is now active at http://golr.berkeleybop.org (nginx front on stove).

We now also have http://amigo2.berkeleybop.org working on stove and backed by golr.berkeleybop.org.

We are now on Solr 3.6. At some point we'd like to get on 4.x for some improvements including pivot tables.

Schema progress

We are currently working towards a flexible schema as defined in the owltools code. As we roll it out for the AmiGO backend and as a replacement for common GOOSE queries, we expect to find holes in the schema, documented below.

Currently, the index is populated entirely through an owltools command line program, using YAML files for configuration. For more information, see the Installation section below.

Rewriting old GOOSE query examples for the new GOlr Schema

Example queries on the wiki.

  • Some old queries require the use of Solr facets and some modification of code Seth has written to present simpler results queries (i.e. the facet results are not displayed).
  • Note: hierarchical queries that 'combine' facet queries can be handled by the facet.pivot functionality that is available in Solr 4. In the meantime, we'll see if we can devise a work-around.
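
As a rough illustration of the facet-based approach, the sketch below builds a plain Solr facet query URL using the standard facet=true and facet.field parameters. The helper name golr_facet_url is hypothetical, the field name document_category in the usage line is an assumption rather than something taken from this page, and it is also an assumption that the select handler is served directly under the server root.

```shell
# Hypothetical helper: build a Solr select URL that returns no documents,
# only the facet counts for one field.
#   $1 = base URL of the Solr server, $2 = field to facet on
golr_facet_url() {
  base="${1%/}"   # strip one trailing slash, if present
  field="$2"
  printf '%s/select?q=*:*&rows=0&facet=true&facet.field=%s&wt=json\n' "$base" "$field"
}

# Usage (commented out; the field name is an assumption):
#   curl -s "$(golr_facet_url http://golr.berkeleybop.org document_category)"
```

Setting rows=0 keeps the response small, which matches the case described above where only the facet counts, not the matching documents, are of interest.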

Current holes in the GOlr schema

  • None listed at this time...

Downtime

Now that we have entered beta, downtimes are sporadic as we work with the software stack. As we approach production, the automatic update times will be decided and posted.

Local installation (data and schema development)

GOlr installation information and Solr installation help can be found here.

Collaborations

In order to make a truly useful system for the community at large, the Gene Ontology Consortium is working directly with other groups on GOlr tools, schema, and library development.

Virtual Fly Brain (VFB)

We're working with VFB on schema and ontology tools development to help enable rich ontology interaction and search at speed. This will also help push the envelope on the types of use cases that we might see.