OBO-Edit: Proposals for Designing a New Filtering Interface

From GO Wiki

I think we can all agree that the existing OBO-Edit filtering interface has its problems. Compound filter editing is too complex for most users, but not powerful enough for the most advanced users. This page goes over several possible approaches. I hope that users group members will comment on this article with their opinions about which approach would work best.

Technical/Theory Changes

Changes to "Aspect"

Right now, the "Aspect" field allows you to search ancestors or descendants of a term, rather than the term itself. Unfortunately, this searches EVERY ancestor or descendant, and there's no way to constrain this by relationship type. I propose we add a modifier to Aspect that allows the user to specify a relationship type (or union of relationship types) to traverse when calculating the descendants and ancestors. If the reasoner is enabled, the filtering system will use the reasoner to do this calculation, which will be MUCH faster than the current method, and will give more intuitive results.
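Here's a rough sketch of what the proposed modifier would compute. It is plain Python, not OBO-Edit code, and it uses a simple breadth-first walk rather than the reasoner. The `PARENT_LINKS` table, the term IDs, and the `ancestors` function are all made up for illustration.

```python
from collections import deque

# Hypothetical toy ontology: child -> list of (relationship type, parent).
PARENT_LINKS = {
    "leaf": [("is_a", "mid"), ("regulates", "other")],
    "mid": [("part_of", "root")],
    "other": [("is_a", "root")],
    "root": [],
}

def ancestors(term, allowed_types, links=PARENT_LINKS):
    """Return every ancestor reachable via links whose type is allowed."""
    found = set()
    queue = deque([term])
    while queue:
        current = queue.popleft()
        for rel_type, parent in links.get(current, []):
            # Only traverse links whose relationship type was requested.
            if rel_type in allowed_types and parent not in found:
                found.add(parent)
                queue.append(parent)
    return found

# A union of relationship types is just a larger allowed set.
print(sorted(ancestors("leaf", {"is_a", "part_of"})))
print(sorted(ancestors("leaf", {"regulates"})))
```

With the unconstrained behavior, "leaf" would pick up every other term as an ancestor; with the modifier, the result depends on which relationship types the user chose to traverse.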

Script Filtering

Occasionally, a user wants to design a filter that just can't be expressed in the limited language of per-term boolean matches (which is basically how the existing filtering specifications work, with the Aspect feature tacked on as a useful extension). I suggest that we allow script-based searches for hard-core power users. The script-based searches would be specified in the OSL scripting language. So, for example, imagine someone wanted to find terms whose names contain the word "development", and that have a part_of relationship to a parent term whose name DOESN'T contain the word "development". They could write a search this way...

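 function filter(term) {
     // Reject terms whose own name does not contain "development"
     if (term.getName().indexOfIgnoreCase("development") == -1)
         return false;
     // Accept if any part_of parent has a name without "development"
     for link in term.getParents() {
         if (link.getType().getID().equals("part_of") &&
             link.getParent().getName().indexOfIgnoreCase("development") == -1) {
             return true;
         }
     }
     return false;
 }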

This kind of scripting is too complex for most users to bother with, but if a user DID write something like that, they could share their script with other users. Further, some of the schemes below allow simple searches to be combined to create more complex searches; in these cases, script-based searches could be dropped in as one of the simple searches, alongside the more conventional searches familiar to current OBO-Edit users.
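To show how a script-based search could be dropped in alongside conventional ones, here's a small sketch that treats every search as a function from a term to true/false. It is Python rather than OSL, and the `Term` class, `name_contains`, and `all_of` are hypothetical names, not part of OBO-Edit.

```python
from dataclasses import dataclass, field

@dataclass
class Term:
    name: str
    namespace: str = "GO"
    # Hypothetical link list: (relationship type ID, parent Term).
    parents: list = field(default_factory=list)

def name_contains(text):
    """A conventional simple search, expressed as a predicate."""
    return lambda term: text.lower() in term.name.lower()

def development_part_of_other(term):
    """Python analogue of the script-based search described above."""
    if "development" not in term.name.lower():
        return False
    return any(
        rel == "part_of" and "development" not in parent.name.lower()
        for rel, parent in term.parents
    )

def all_of(*searches):
    """Combine simple and script-based searches with AND."""
    return lambda term: all(search(term) for search in searches)

heart = Term("heart")
valve_dev = Term("heart valve development", parents=[("part_of", heart)])

# The scripted search is used exactly like any other simple search.
combined = all_of(name_contains("valve"), development_part_of_other)
print(combined(valve_dev), combined(heart))
```

Because both kinds of search share the same shape, a combining interface never needs to know whether a given building block was scripted or built with the basic designer.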

Interface Proposals

Changes to the Basic Query Designer

At its core, any query design system is going to need a component that is used to create simple searches (although these simple searches may later be used as building blocks for complex searches). In OBO-Edit, that simple component looks like this:

Oboedit current basic filter editor.gif

There are a couple of problems with this interface:

  1. Nothing is labeled
  2. Some rarely used advanced controls (like NOT and the Aspect dropdown) are always visible
  3. The search value field is totally unconstrained, even though some search criteria require a very specific sort of value

Suggested improvements:

  1. Hide the NOT and Aspect controls initially. Small buttons in the top right corner of the panel can be used to display them if necessary
  2. Label each control
  3. Constrain the text box values

This last one is the most technically tricky, but probably the most useful. If possible, the text box will attempt to autocomplete the text typed by the user. The autocomplete values will be the list of all values for that criterion in the currently loaded ontology. For numeric criteria (like "Number of parents"), no autocompletion will be attempted, but the text box will be constrained so that only numeric values can be entered.
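A minimal sketch of both behaviors, assuming the loaded ontology can be viewed as a list of per-term field values. The `TERMS` list and both function names are hypothetical, and a real implementation would hook into the text box rather than print.

```python
# Hypothetical loaded ontology: one dict of field values per term.
TERMS = [
    {"name": "kinase activity", "namespace": "molecular_function"},
    {"name": "kinase binding", "namespace": "molecular_function"},
    {"name": "kidney development", "namespace": "biological_process"},
]

def completions(criterion, typed, terms=TERMS):
    """Distinct values of `criterion` that start with the typed text."""
    prefix = typed.lower()
    values = {t[criterion] for t in terms if criterion in t}
    return sorted(v for v in values if v.lower().startswith(prefix))

def is_valid_numeric(typed):
    """Constraint for numeric criteria such as "Number of parents"."""
    return typed.isascii() and typed.isdigit()

print(completions("name", "kin"))
print(completions("namespace", "m"))
print(is_valid_numeric("3"), is_valid_numeric("3a"))
```

Prefix matching is only one possible choice here; substring matching might suit long term names better.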

Changes to the Complex Query Designer

The complex query designer is most users' least favorite part of OBO-Edit. Many users are confused by the tree editor for complex queries, and the prefix notation for complex queries doesn't make sense to anyone who isn't a computer scientist or mathematician.

The following are proposals for a new complex query interface. Note that we don't have to just pick one of these. We could provide any or all of these query designers, and users can choose the designer that best matches their taste.

Proposal 1: Set-Based Searching

In this proposal, users create a complex search by chaining together basic searches with "widen" and "narrow" operations (or "union" and "intersection" operations, if you prefer). The interface might look like this:

Oboedit widenarrow filter editor.gif

The search shown above can be understood as:

  1. Perform the search "Name contains 'kinase'"
  2. Remove all search results that do not match "Namespace equals 'GO'"
  3. Add the results of the search "Is property"

This search is equivalent to: ((Name contains 'kinase') AND (Namespace equals 'GO')) OR Is property

This approach is called set-based searching, because it treats a complex search as a collection of intersection and union operations on various sets of search results.
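The three steps above map directly onto set operations. Here's a sketch in Python, with a made-up `TERMS` table and hypothetical `search` and `run_chain` helpers standing in for the real search machinery:

```python
# Hypothetical terms; each simple search returns a set of matching term IDs.
TERMS = {
    "GO:1": {"name": "kinase activity", "namespace": "GO", "is_property": False},
    "XX:2": {"name": "kinase binding", "namespace": "XX", "is_property": False},
    "part_of": {"name": "part of", "namespace": "RO", "is_property": True},
}

def search(predicate):
    """Run one basic search over all terms."""
    return {term_id for term_id, t in TERMS.items() if predicate(t)}

def run_chain(first, steps):
    """Apply ("narrow" | "widen", result_set) steps from top to bottom."""
    results = set(first)
    for operation, other in steps:
        if operation == "narrow":
            results &= other   # intersection: keep only shared results
        elif operation == "widen":
            results |= other   # union: add the other search's results
        else:
            raise ValueError(f"unknown operation: {operation}")
    return results

name_kinase = search(lambda t: "kinase" in t["name"])
namespace_go = search(lambda t: t["namespace"] == "GO")
is_property = search(lambda t: t["is_property"])

chained = run_chain(name_kinase, [("narrow", namespace_go), ("widen", is_property)])
print(sorted(chained))
```

Each step only ever combines the running result with one new basic search, which is what keeps the interface simple.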

Advantages: Easy for users to understand, and the GUI is easy to build.

Disadvantages: Not as expressive as the current complex filter builder. When you translate this kind of set-based search into an expression tree, it always assumes the following form:

        OP1
        /  \
     exp1  OP2
           / \
       exp2  OP3
             / \
         exp3  OP4
               / \
           exp4  exp5

where branching can only occur along one side of the tree. It's impossible to create a search like:

(Name contains "kinase" AND Namespace equals "GO") OR (Is property AND Namespace equals "RO")

because the expression tree requires complex branching:

           OR
        /      \
     AND       AND
     / \       / \
  exp1 exp2  exp3 exp4
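The difference between the two tree shapes can be made concrete with a small sketch. Here a filter tree is a nested tuple, and the hypothetical `is_chain` function checks the property that every set-based search has: no operator has operators on both sides. None of these names come from OBO-Edit.

```python
# A filter tree is either a predicate (leaf) or a tuple (op, left, right).
def evaluate(node, term):
    if callable(node):
        return node(term)
    op, left, right = node
    if op == "AND":
        return evaluate(left, term) and evaluate(right, term)
    if op == "OR":
        return evaluate(left, term) or evaluate(right, term)
    raise ValueError(f"unknown operator: {op}")

def is_chain(node):
    """True if no operator has operator nodes on both sides."""
    if callable(node):
        return True
    _, left, right = node
    if not callable(left) and not callable(right):
        return False
    return is_chain(left) and is_chain(right)

exp1 = lambda t: "kinase" in t["name"]
exp2 = lambda t: t["namespace"] == "GO"
exp3 = lambda t: t["is_property"]
exp4 = lambda t: t["namespace"] == "RO"

# (exp1 AND exp2) OR (exp3 AND exp4): needs branching on both sides.
branching = ("OR", ("AND", exp1, exp2), ("AND", exp3, exp4))
# exp3 OR (exp1 AND exp2): branches along one side only.
chain = ("OR", exp3, ("AND", exp1, exp2))

term = {"name": "part of", "namespace": "RO", "is_property": True}
print(is_chain(chain), is_chain(branching))
print(evaluate(chain, term), evaluate(branching, term))
```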

Proposal 2: Graph-Edited Tree Filters

This proposal suggests that we use exactly the same tree-based filter builder as before, but we use a graph editor to build it. I've thought of two possible approaches:

Proposal 2A: The Wiring Diagram

Users could build their searches by assembling a kind of "wiring diagram". In this "wiring diagram", there is an "input" (a term) and an "output" (whether or not it matches). Users build their filter by wiring the input to the output via a series of search criteria and boolean operations.

Here's a step-by-step walkthrough of how this might work. In this example, we'll use the wiring diagram editor to create the filter from the previous example:

(Name contains "kinase" AND Namespace equals "GO") OR (Is property AND Namespace equals "RO")

Here's the starting screen:

Oboedit wiring 2.gif

First, we'll add all the basic filters via the basic filter editor on top. Once they're added, the screen will look something like:

Oboedit wiring 1.gif

Now we add the boolean operators, and add wires linking our basic filters and the boolean operators to the output.

Oboedit wiring3.gif

We can even test our filter by dropping a term into the "<input>" box. We get immediate visual feedback that helps us debug our filters:

Oboedit wiring4.gif

Advantages: Once the user grasps the metaphor, this view closely models the way the filtering system actually works, and the editor can act as a powerful debugging tool for complex filters.

Disadvantages: The electrical engineering metaphor may take some getting used to.
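The debugging feedback described above amounts to evaluating every component for the dropped-in term and remembering each component's output. Here's a sketch under that assumption; the dictionary layout and the `basic`, `gate`, and `trace` helpers are hypothetical, and the "on"/"off" printout stands in for lighting up wires.

```python
def basic(label, test):
    """A basic filter component."""
    return {"label": label, "test": test}

def gate(label, op, *inputs):
    """A boolean operator component with wired inputs."""
    return {"label": label, "op": op, "inputs": list(inputs)}

def trace(node, term, results=None):
    """Evaluate a filter and record the output of every component."""
    if results is None:
        results = {}
    if "test" in node:
        value = node["test"](term)
    else:
        # Evaluate all inputs (no short-circuit) so each one gets a result.
        inputs = [trace(child, term, results)[0] for child in node["inputs"]]
        if node["op"] == "AND":
            value = all(inputs)
        elif node["op"] == "OR":
            value = any(inputs)
        else:
            raise ValueError(f"unknown operator: {node['op']}")
    results[node["label"]] = value
    return value, results

wiring = gate(
    "OR", "OR",
    gate("AND 1", "AND",
         basic('Name contains "kinase"', lambda t: "kinase" in t["name"]),
         basic('Namespace equals "GO"', lambda t: t["namespace"] == "GO")),
    gate("AND 2", "AND",
         basic("Is property", lambda t: t["is_property"]),
         basic('Namespace equals "RO"', lambda t: t["namespace"] == "RO")),
)

term = {"name": "protein kinase activity", "namespace": "GO", "is_property": False}
matches, lit = trace(wiring, term)
for label, value in lit.items():
    print(f"{label}: {'on' if value else 'off'}")
```

Skipping short-circuit evaluation is deliberate in this sketch: a debugging view is more useful when every component shows a state, even if it costs a little extra work.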

Proposal 2B: Graph-Based Editing

The wiring diagram can be simplified into a graph-edited view, much like the ontology graph editor, by removing the input and output objects and simplifying the wires:

Oboedit graph filter editor.gif

Advantages: Looks a lot like the OBO-Edit graph editor, so it may be easier to understand than the wiring diagram mode.

Disadvantages: Doesn't provide the debugging abilities of the wiring diagram editor, and doesn't reinforce the input/output concepts of filter matching as well as the wiring diagram does.

Proposal 3: Nested Boxes

In this proposal, boolean operations are containers into which basic searches or other boolean operations can be dropped. A filter is designed by nesting basic searches within boolean operation boxes. The outermost box defines the search.
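One way to picture the underlying data structure is a container object that holds basic searches or other containers. The `Box` class below is a hypothetical Python sketch, not a description of how OBO-Edit would implement it; basic searches are modeled as plain functions on a term.

```python
class Box:
    """A boolean container; children are basic searches or other boxes."""

    def __init__(self, op):
        if op not in ("AND", "OR"):
            raise ValueError(f"unknown operator: {op}")
        self.op = op
        self.children = []

    def drop(self, child):
        """Drop a basic search or another box into this box."""
        self.children.append(child)
        return self

    def __call__(self, term):
        # Note: an empty AND box matches everything, an empty OR box nothing.
        results = (child(term) for child in self.children)
        return all(results) if self.op == "AND" else any(results)

name_kinase = lambda t: "kinase" in t["name"]
in_go = lambda t: t["namespace"] == "GO"
is_property = lambda t: t["is_property"]
in_ro = lambda t: t["namespace"] == "RO"

# Inside out: finish an AND box first, then drop it into something bigger.
first = Box("AND").drop(name_kinase).drop(in_go)

# Outside in: create the outermost box first, then fill it.
outer = Box("OR")
outer.drop(first)
outer.drop(Box("AND").drop(is_property).drop(in_ro))

print(outer({"name": "part of", "namespace": "RO", "is_property": True}))
```

Because a box is itself usable wherever a basic search is, the same drop operation covers both simple and deeply nested filters.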

Example 3.1: A Simple Boolean Search

Let's say we want to create the relatively simple boolean expression:

Name contains "kinase" AND Namespace equals "GO"

We would start by creating the basic filters:

Oboedit nestedboxes 1.gif

Then, we would use the "AND" button to create an empty "AND" container:

Oboedit nestedboxes 2.gif

We drag our basic filters and drop them inside the "AND" container to complete our filter:

Oboedit nestedboxes 3.gif

Example 3.2: Building a More Complex Boolean Search from a Simple One

What if, at this point, we realize we actually want to create a more complicated filter, like the one from the previous examples:

 (Name contains "kinase" AND Namespace equals "GO") OR (Is property AND Namespace equals "RO")

You can just drop the existing filter into the new, more complicated filter! First, we create the new AND filter for Is property AND Namespace equals "RO", and then we add an empty OR container:

Oboedit nestedboxes 4.gif

Then we drag our two AND containers into our OR container, and we're done:

Oboedit nestedboxes 5.gif

Finally, notice that it's possible to add visual feedback for debugging/testing, just like in the wiring diagram approach. We could light up the various matching pieces of a filter in response to the current selection, for example. So if the user selected "obo_rel:part_of", the filter would be colored like so:

Oboedit nestedboxes 6.gif

Advantages: It's easy to use, and it doesn't assume anything about how a user wants to build their filters. You can use this method to build a complex filter from the inside out, or from the outside in. It's possible to provide visual feedback to help debug and test the filter, just like in the wiring diagram case. The interface even kind of resembles a Venn diagram, which is a representation with which most users will have some experience.

Disadvantages: The way the debugging/testing visual feedback works takes some getting used to.

Proposal 4: Expression Filtering

We provide a filtering mode where filters are specified as text expressions. The user would simply type in their query as a parseable text expression, like so:

  (not name contains "kinase") AND (ancestor(is_a, part_of) id equals "GO:xxxxxxx")

The expression would be entered into a plain text field, but we can probably provide some autocompletion and syntax highlighting to make it a little easier to use.
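To give a feel for what parsing such expressions involves, here's a small recursive-descent sketch in Python. Everything about it is an assumption: the grammar, the precedence (NOT binds tightest, then AND, then OR), the two comparisons, and all the names. It also leaves out the ancestor(...) modifier shown above, so it only handles expressions of the form criterion comparison "value" joined by NOT, AND, OR, and parentheses.

```python
import re

TOKEN = re.compile(
    r'\s*(?:(?P<lparen>\()|(?P<rparen>\))|"(?P<string>[^"]*)"|(?P<word>\w+))'
)

# Hypothetical comparison library: (term value, search text) -> bool.
COMPARISONS = {
    "contains": lambda value, text: text.lower() in value.lower(),
    "equals": lambda value, text: value == text,
}

def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        kind = match.lastgroup
        tokens.append((kind, match.group(kind)))
        pos = match.end()
    return tokens

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else (None, None)

    def take(self, kind):
        found, value = self.peek()
        if found != kind:
            raise SyntaxError(f"expected {kind}, found {value!r}")
        self.pos += 1
        return value

    def is_word(self, keyword):
        kind, value = self.peek()
        return kind == "word" and value.lower() == keyword

    def parse_or(self):
        left = self.parse_and()
        while self.is_word("or"):
            self.pos += 1
            right = self.parse_and()
            left = (lambda a, b: lambda t: a(t) or b(t))(left, right)
        return left

    def parse_and(self):
        left = self.parse_not()
        while self.is_word("and"):
            self.pos += 1
            right = self.parse_not()
            left = (lambda a, b: lambda t: a(t) and b(t))(left, right)
        return left

    def parse_not(self):
        if self.is_word("not"):
            self.pos += 1
            inner = self.parse_not()
            return lambda t: not inner(t)
        if self.peek()[0] == "lparen":
            self.pos += 1
            inner = self.parse_or()
            self.take("rparen")
            return inner
        # Atom: criterion comparison "value"
        field = self.take("word")
        name = self.take("word").lower()
        if name not in COMPARISONS:
            raise SyntaxError(f"unknown comparison: {name}")
        compare = COMPARISONS[name]
        text = self.take("string")
        return lambda t: compare(t[field], text)

def compile_filter(text):
    """Turn an expression string into a function from term to bool."""
    parser = Parser(tokenize(text))
    predicate = parser.parse_or()
    if parser.peek()[0] is not None:
        raise SyntaxError(f"unexpected token: {parser.peek()[1]!r}")
    return predicate

matches = compile_filter('(not name contains "kinase") AND (namespace equals "GO")')
print(matches({"name": "cell growth", "namespace": "GO"}))
print(matches({"name": "kinase activity", "namespace": "GO"}))
```

The same tokenizer could feed the autocompletion and syntax highlighting mentioned above, since it already knows which kind of token the user is in the middle of typing.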

Advantages: A finished expression is easy to read, and it represents the filter in a very familiar way. This filter syntax can be used in command line tools, and possibly the OBO format itself.

Disadvantages: It's really hard to learn; users will have to master the syntax of the expressions, and then they'll need to be familiar with the library of available search comparisons and search criteria.