Full Text Indexing Progress

Revision as of 17:42, 21 September 2010

Overview

Progress on full text indexing (FTI) falls on two separate fronts. The first is the indexing system itself ("system"): the software used (Solr, Jetty, etc.), schema, deployment, hardware, and other low-level issues that are unlikely to matter much to downstream users and programmers. The second is the consumption and use of FTI ("software"): integration into various pieces of software, services built up around FTI, and (possibly) abstraction APIs.

While there are some blurry points in this distinction (e.g. where to place a JSON service built directly into the engine), it should provide a logical way to divide most of the problems that will be faced.
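As an illustration of the blurry JSON-service case, here is a minimal sketch of how a client might ask Solr's standard select handler for a JSON response by passing `wt=json`. The base URL (a local Jetty instance on Solr's usual example port) and the field name `label` are assumptions for illustration only; they are not taken from the schema linked on this page.

```python
from urllib.parse import urlencode

# Assumed location of a local Solr-on-Jetty test instance; not from this page.
SOLR_BASE = "http://localhost:8983/solr"


def build_select_url(query, rows=10, base=SOLR_BASE):
    """Return a URL for Solr's select handler that requests a JSON response."""
    # q    = the Lucene query string
    # rows = maximum number of documents to return
    # wt   = response writer; "json" asks Solr to serialize the result as JSON
    params = {"q": query, "rows": rows, "wt": "json"}
    return base + "/select?" + urlencode(params)


if __name__ == "__main__":
    # "label" is a hypothetical field name used only for this example.
    print(build_select_url("label:kinase"))
```

Because the JSON comes straight from the engine, a client like this sits on the "system" side of the distinction, while anything that wraps or reshapes the result would count as "software".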

System Progress

Goals

We would like to have a

Current

Schema

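Production schema: http://geneontology.svn.sourceforge.net/viewvc/geneontology/java/solr/solr/go-data-config.xml?revision=2881&view=markup

Lucene schema: http://geneontology.svn.sourceforge.net/viewvc/geneontology/java/solr/solr/schema.xml?revision=2934&view=markup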

Hardware

Solr on Jetty is currently installed on a BBOP development workstation. Although it is not available for public use, it is being used to test ways of integrating core software to use FTI (see below).
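For integration tests against such an instance, client code mostly needs to unpack the JSON that Solr returns. The sketch below parses a canned response in the general shape Solr produces for `wt=json` (a `responseHeader` plus a `response` object holding `numFound`, `start`, and `docs`). The field names `id` and `label` and all values are made up for illustration; the real fields depend on the schema in use.

```python
import json

# Canned example in the shape Solr returns for wt=json. The field names
# "id" and "label" and the values are hypothetical, not from the GO schema.
SAMPLE = """
{"responseHeader": {"status": 0, "QTime": 1},
 "response": {"numFound": 2, "start": 0, "docs": [
   {"id": "doc-1", "label": "first example"},
   {"id": "doc-2", "label": "second example"}]}}
"""


def extract_hits(raw_json):
    """Return (total_found, [(id, label), ...]) from a Solr JSON response."""
    body = json.loads(raw_json)["response"]
    # .get() tolerates documents that lack one of the assumed fields.
    hits = [(doc.get("id"), doc.get("label")) for doc in body["docs"]]
    return body["numFound"], hits


if __name__ == "__main__":
    total, hits = extract_hits(SAMPLE)
    print(total, hits)
```

Keeping the parsing in one small function means the core software under test only sees plain tuples, so the test instance can be swapped for another without touching the callers.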

Past Experiments

Software Progress

Design Progress

One

Experimental


Two

???

Target

Software

Current

Past

  • ...
  • ...
  • ...