Difference between revisions of "GOlr"

From GO Wiki
(Rewriting old GOOSE query examples for the new GOlr Schema)
m
 
(57 intermediate revisions by 3 users not shown)
Line 1: Line 1:
 +
[[Category:Software]]
 +
[[Category:GOlr]]
 
=Overview=
 
  
 
This page describes the status of the public Solr index for the GO. This index will replace some of the query functionality for GOOSE as well as become the new backend for AmiGO 2 and other services. A preview server is now active at http://golr.berkeleybop.org.
 
  
=Public alpha=
+
=Public beta=
  
The preview server at http://golr.berkeleybop.org is currently in alpha.
+
The preview server at http://golr.berkeleybop.org is currently in beta.
  
Some preview documentation is at http://api.berkeleybop.org .
+
Some preview documentation is at http://cdn.berkeleybop.org/jsapi/bbop-js/docs/index.html
  
 
==Use==
 
Line 24: Line 26:
 
We now also have http://amigo2.berkeleybop.org working on stove and backed by golr.berkeleybop.org.
 
  
We are now on Solr 3.6. At some point we'd like to get on 4.x for some improvements including pivot tables.
+
We are now on Solr 3.6.
  
 
==Schema progress==
 
Line 30: Line 32:
 
We are currently working towards a flexible schema as [http://geneontology.svn.sourceforge.net/viewvc/geneontology/java/gold/solr/conf/schema.xml?revision=4429&content-type=text%2Fplain defined] in the owltools code. As we roll it out for the AmiGO backend and as a replacement for common GOOSE queries, we expect to find holes in the schema, documented below.
 
  
Currently, the index is populated entirely through an owltools command-line program, using [https://code.google.com/p/owltools/source/browse/#svn%2Ftrunk%2FOWLTools-Solr%2Fsrc%2Fmain%2Fresources YAML] files for configuration. The Solr schema.xml generation looks like:
+
Currently, the index is populated entirely through an owltools command-line program, using [https://code.google.com/p/owltools/source/browse/#svn%2Ftrunk%2FOWLTools-Solr%2Fsrc%2Fmain%2Fresources YAML] files for configuration. For more information, see the ''Installation'' section below.
 
 
./OWLTools-Runner/bin/owltools --solr-config /home/sjcarbon/local/src/svn/owltools/OWLTools-Solr/src/main/resources/ont-config.yaml /home/sjcarbon/local/src/svn/owltools/OWLTools-Solr/src/main/resources/bio-config.yaml /home/sjcarbon/local/src/svn/owltools/OWLTools-Solr/src/main/resources/ann-config.yaml /home/sjcarbon/local/src/svn/owltools/OWLTools-Solr/src/main/resources/ann_ev_agg-config.yaml --solr-schema-dump
 
 
 
With the actual population along the lines of:
 
 
 
owltools http://purl.obolibrary.org/obo/go.owl http://purl.obolibrary.org/obo/ncbitaxon/subsets/taxslim.owl http://purl.obolibrary.org/obo/cl.owl http://purl.obolibrary.org/obo/eco.owl --merge-support-ontologies --reasoner elk --solr-url http://localhost:8080/solr/ --solr-purge --solr-config /home/bbop/local/src/svn/owltools/OWLTools-Solr/src/main/resources/ont-config.yaml --solr-load-ontology --solr-load-panther /optional/wherever --solr-load-gafs /srv/tmp/gene_association.mgi
 
 
 
For the sake of completeness, a development load that we often run looks like:
 
 
 
./OWLTools-Runner/bin/owltools http://purl.obolibrary.org/obo/go.owl http://purl.obolibrary.org/obo/ncbitaxon/subsets/taxslim.owl http://purl.obolibrary.org/obo/cl.owl http://purl.obolibrary.org/obo/eco.owl --merge-support-ontologies --reasoner elk --solr-url http://localhost:8080/solr/ --solr-purge --solr-config /home/sjcarbon/local/src/svn/owltools/OWLTools-Solr/src/main/resources/ont-config.yaml --solr-load-ontology --solr-load-panther ~/tmp/tree --solr-load-gafs http://www.geneontology.org/gene-associations/gene_association.GeneDB_Spombe.gz http://www.geneontology.org/gene-associations/gene_association.dictyBase.gz
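The load invocations above all share one shape: the ontologies first, then the merge and reasoner switches, the Solr target, and finally the loaders. As a rough sketch only, that argument list could be assembled in Python as shown here. Every flag is copied from the commands on this page; the function name, its parameters, and the assumption that flag order matters exactly as in those examples are illustrative and not part of owltools.

```python
import shlex


def build_load_command(owltools, ontologies, solr_url, config,
                       gafs=(), panther_paths=()):
    """Assemble an owltools load command like the ones shown on this page.

    Hypothetical helper: the flags come from the examples above, but the
    function itself and its parameter names are assumptions.
    """
    argv = [
        owltools, *ontologies,
        "--merge-support-ontologies",
        "--reasoner", "elk",
        "--solr-url", solr_url,
        "--solr-purge",           # as in the examples: purge before loading
        "--solr-config", config,
        "--solr-load-ontology",
    ]
    if panther_paths:             # described on this page as optional
        argv += ["--solr-load-panther", *panther_paths]
    if gafs:
        argv += ["--solr-load-gafs", *gafs]
    return argv


if __name__ == "__main__":
    cmd = build_load_command(
        "./OWLTools-Runner/bin/owltools",
        [
            "http://purl.obolibrary.org/obo/go.owl",
            "http://purl.obolibrary.org/obo/ncbitaxon/subsets/taxslim.owl",
            "http://purl.obolibrary.org/obo/cl.owl",
            "http://purl.obolibrary.org/obo/eco.owl",
        ],
        solr_url="http://localhost:8080/solr/",
        config="WHEREVER/owltools/OWLTools-Solr/src/main/resources/ont-config.yaml",
        gafs=["http://www.geneontology.org/gene-associations/gene_association.dictyBase.gz"],
    )
    # Print the command for inspection; pass the list to subprocess.run to execute it.
    print(shlex.join(cmd))
```

Keeping the command as a list avoids shell-quoting problems when a path contains spaces, and printing it first makes it easy to check before anything is purged.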
 
  
 
+
==Query examples for the GOlr Schema==
==Rewriting old GOOSE query examples for the new GOlr Schema==
 
  
 
Example queries on the [http://wiki.geneontology.org/index.php/Example_Solr_Queries wiki].
 
Line 56: Line 47:
 
== Downtime ==
 
  
Rebuilds start at 10pm PDT (6am BST; 5pm NZST), so there may be some data gaps in there.
+
Now that we have entered beta, downtimes are sporadic as we work with the software stack. As we approach production, the automatic update times will be decided and posted.
 
 
At the time of this writing, this process is taking a few hours.
 
 
 
= Installation for development =
 
 
 
These instructions illustrate how to install the GOlr server software for AmiGO 2 on an Ubuntu 12.04 LTS Desktop. With careful reading, they should generalize to other cases.
 
 
 
== General notes ==
 
 
 
The following packages are required:
 
 
 
* ???
 
 
 
Also, you will need a version of Maven 3 (mvn) to use and build OWLTools if you are trying to load data. It is a fairly simple process to download it and make sure that it is in your path.
 
 
 
==Getting the GO Solr schema (GOlr) working==
 
 
 
There are two options for this.
 
 
 
===Generating the schema from AmiGO 2 (geneontology repo)===
 
 
 
This software contains the GO-specific things that BBOP JS does not support--the client (web server) code for AmiGO 2. For the sake of convenience (and using a similar directory layout to the developers), we'll get it like:
 
 
 
mkdir -p ~/local/src/svn
 
cd ~/local/src/svn
 
svn co https://geneontology.svn.sourceforge.net/svnroot/geneontology/AmiGO/trunk geneontology/AmiGO/trunk
 
 
 
===Using the premade GOlr schema (geneontology repo)===
 
 
 
Server code and utilities for handling the GO Solr schema.
 
 
 
svn co https://geneontology.svn.sourceforge.net/svnroot/geneontology/golr geneontology/golr
 
 
 
This is just utility code and the GO data schema, no building is necessary.
 
 
 
==OWLTools==
 
 
 
This contains the loader code that GO uses for ontology and GAF files.
 
 
 
svn checkout http://owltools.googlecode.com/svn/trunk/ owltools-read-only
 
 
 
===Building===
 
 
 
The code should be self-sufficient on a machine with an internet connection.
 
 
 
cd WHEREVER/owltools/OWLTools-Parent/ && mvn clean package
 
 
 
Or, without unit tests (much quicker):
 
 
 
cd WHEREVER/owltools/OWLTools-Parent/ && mvn clean package -DskipTests
 
 
 
The ''owltools'' binary that we'll be referring to later will be in:
 
 
 
WHEREVER/owltools/OWLTools-Runner/bin/owltools
 
 
 
This should either be in your path or you will have to make this path explicit.
 
 
 
==Readying Solr==
 
 
 
While we have Jetty installed from the repos, we have our own Solr (3.6) in the GOlr repo as the one that comes with Ubuntu is ancient. While we install/handle our Jetty/Solr setup with the golr.el script found in the GOlr repo, you may find it easier to just look at the script and copy its actions unless you're really into Emacs.
 
  
The most important part is using the correct schema.xml (derived from YAML files using OWLTools, but that's [http://wiki.geneontology.org/index.php/GOlr another story]). The solrconfig.xml is also there and might be useful.

==Loading the data==

As an example, we will load four ontologies useful to GO right now and a single GAF file for annotations onto the localhost Solr instance. Please note the ''WHEREVER'' placeholder; the other resources are on the web.

OWLTOOLS_MEMORY=32G owltools http://purl.obolibrary.org/obo/go.owl http://purl.obolibrary.org/obo/ncbitaxon/subsets/taxslim.owl http://purl.obolibrary.org/obo/cl.owl http://purl.obolibrary.org/obo/eco.owl --merge-support-ontologies --reasoner elk --solr-url http://localhost:8080/solr/ --solr-purge --solr-config WHEREVER/owltools/OWLTools-Solr/src/main/resources/ont-config.yaml --solr-load-ontology --solr-load-panther ~/tmp/PANTHER7.2_HMM_classifications ~/tmp/tree --solr-load-gafs http://www.geneontology.org/gene-associations/gene_association.GeneDB_Lmajor.gz

More memory is always welcome, but less is probably sufficient. The panther section

--solr-load-panther ~/tmp/PANTHER7.2_HMM_classifications ~/tmp/tree

is optional and experimental--it can be left out entirely.

+
= Local installation (data and schema development) =
+
GOlr installation information and Solr installation help can be found [[GOlr:_Installation | here]].
+
=Collaborations=
+
In order to make a truly useful system for the community at large, the Gene Ontology Consortium is working directly with other groups on GOlr tools, schema, and library development.
+
==Virtual Fly Brain (VFB)==
+
We're working with [http://www.virtualflybrain.org VFB] on schema and ontology tools development to help enable rich ontology interaction and search at speed. This will also help push the envelope on the types of use cases that we might see.
 
 
 
 
 
[[Category:SWUG Projects]]
 
[[Category:Software]]
 
[[Category:AmiGO]]
 
[[Category:Software Progress]]
 
[[Category:Seth]]
 
[[Category:Mary]]
 

Latest revision as of 07:40, 12 August 2019

Overview

This page describes the status of the public Solr index for the GO. This index will replace some of the query functionality for GOOSE as well as become the new backend for AmiGO 2 and other services. A preview server is now active at http://golr.berkeleybop.org.

Public beta

The preview server at http://golr.berkeleybop.org is currently in beta.

Some preview documentation is at http://cdn.berkeleybop.org/jsapi/bbop-js/docs/index.html

Use

Feel free to use it as you will and start to write for it. As this is still a little early, we are only loading the GO, CL, ECO, and taxslim, as well as non-IEA annotations from the GAFs. Please see the AmiGO_Labs caveats, as well as the downtime information on this page.
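To make "non-IEA annotations from the GAFs" concrete, here is a minimal sketch of such a filter. It relies on the GAF convention that header and comment lines start with `!` and that the seventh tab-separated column holds the evidence code. The function name is hypothetical, the demo rows are made up, and this is not a description of how the owltools loader itself does the filtering.

```python
def non_iea_lines(gaf_lines):
    """Yield GAF annotation rows whose evidence code is not IEA.

    Illustrative helper only; rows with fewer than seven columns are
    treated as malformed and dropped.
    """
    for line in gaf_lines:
        if not line.strip() or line.startswith("!"):
            continue  # skip blank lines and GAF header/comment lines
        columns = line.rstrip("\n").split("\t")
        # Column 7 (index 6) of a GAF row holds the evidence code.
        if len(columns) > 6 and columns[6] != "IEA":
            yield line


if __name__ == "__main__":
    # Made-up rows, for illustration only.
    demo = [
        "!gaf-version: 2.0\n",
        "DB\tID1\tsym1\t\tGO:0000001\tPMID:1\tIEA\t\tP\n",
        "DB\tID2\tsym2\t\tGO:0000002\tPMID:2\tIDA\t\tP\n",
    ]
    for row in non_iea_lines(demo):
        print(row, end="")
```

For a compressed file, the same generator can be fed from `gzip.open(path, "rt")`, which yields text lines one at a time.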

Implementation progress

We are now loading properly from owltools.

We have looked at nginx as a reverse proxy for speed and to prevent unauthorized access to non-select URLs on the Solr server, and found it to be good.

The GOlr server is now active at http://golr.berkeleybop.org (nginx front on stove).

We now also have http://amigo2.berkeleybop.org working on stove and backed by golr.berkeleybop.org.

We are now on Solr 3.6.

Schema progress

We are currently working towards a flexible schema as defined in the owltools code. As we roll it out for the AmiGO backend and as a replacement for common GOOSE queries, we expect to find holes in the schema, documented below.

Currently, the index is populated entirely through an owltools command-line program, using YAML files for configuration. For more information, see the Installation section below.

Query examples for the GOlr Schema

Example queries on the wiki.

  • Some old queries require the use of Solr facets and some modification of the code Seth has written to present simpler query results (i.e. the facet results are not displayed).
  • Note: hierarchical queries that 'combine' facet queries can be handled by the facet.pivot functionality that is available in Solr 4. In the meantime, we'll see if we can devise a work-around.
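As a rough illustration of the two points above, the sketch below builds a plain Solr facet query URL, and then one possible client-side stand-in for `facet.pivot`: one faceted request per value of the outer field. The `/select` path and the `q`, `rows`, `wt`, `fq`, `facet`, `facet.field` and `facet.mincount` parameters are standard Solr. The function names, the field names in the demo, and the work-around itself are assumptions, not the approach the GOlr developers settled on.

```python
from urllib.parse import urlencode

GOLR_BASE = "http://golr.berkeleybop.org"  # preview server named on this page


def facet_count_url(facet_fields, query="*:*", filters=(), base=GOLR_BASE):
    """Build a select URL that returns facet counts only (rows=0)."""
    params = [
        ("q", query),
        ("rows", "0"),            # no documents, just the facet counts
        ("wt", "json"),
        ("facet", "true"),
        ("facet.mincount", "1"),  # hide facet values with zero hits
    ]
    params += [("facet.field", field) for field in facet_fields]
    params += [("fq", fq) for fq in filters]
    return base.rstrip("/") + "/select?" + urlencode(params)


def pivot_workaround_urls(outer_field, outer_values, inner_field, base=GOLR_BASE):
    """Emulate a two-level pivot: one inner-facet query per outer value.

    The outer values would normally come from a first facet query on
    outer_field; here they are passed in directly.
    """
    urls = {}
    for value in outer_values:
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        fq = f'{outer_field}:"{escaped}"'
        urls[value] = facet_count_url([inner_field], filters=[fq], base=base)
    return urls


if __name__ == "__main__":
    # Field names below are hypothetical placeholders, not confirmed schema fields.
    print(facet_count_url(["example_field"]))
    for value, url in pivot_workaround_urls(
            "example_outer", ["value one", "value two"], "example_inner").items():
        print(value, url)
```

This costs one request per outer value, so it only suits outer fields with a modest number of distinct values.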

Current holes in the GOlr schema

  • None listed at this time...

Downtime

Now that we have entered beta, downtimes are sporadic as we work with the software stack. As we approach production, the automatic update times will be decided and posted.

Local installation (data and schema development)

GOlr installation information and Solr installation help can be found here.

Collaborations

In order to make a truly useful system for the community at large, the Gene Ontology Consortium is working directly with other groups on GOlr tools, schema, and library development.

Virtual Fly Brain (VFB)

We're working with VFB on schema and ontology tools development to help enable rich ontology interaction and search at speed. This will also help push the envelope on the types of use cases that we might see.