Difference between revisions of "OBO-Edit: Proposals for Designing a New Filtering Interface"

From GO Wiki

Revision as of 12:08, 26 June 2007

I think we all can agree that the existing OBO-Edit filtering interface has its problems. Compound filter editing is too complex for most users, but not powerful enough for the most advanced users. This page will go over some possible different approaches. I hope that users group members will comment on this article with their opinions about which approach would work best...

Technical/Theory Changes

Changes to "Aspect"

Right now, the "Aspect" field allows you to search ancestors or descendants of a term, rather than the term itself. Unfortunately, this searches EVERY ancestor or descendant, and there's no way to constrain this by relationship type. I propose we add a modifier to Aspect that allows the user to specify a relationship type (or union of relationship types) to traverse when calculating the descendants and ancestors. If the reasoner is enabled, the filtering system will use the reasoner to do this calculation, which will be MUCH faster than the current method, and will give more intuitive results.
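The constrained traversal proposed above could work roughly as in the following Python sketch. It is not OBO-Edit code: the `PARENTS` table, the term names, and the `ancestors` function are hypothetical, and the reasoner is not modeled.

```python
from collections import deque

# Hypothetical toy ontology: child -> list of (relationship type, parent).
PARENTS = {
    "kinase activity": [("is_a", "catalytic activity")],
    "catalytic activity": [("is_a", "molecular function")],
    "heart valve": [("part_of", "heart"), ("is_a", "valve")],
    "heart": [("part_of", "circulatory system"), ("is_a", "organ")],
}


def ancestors(term, allowed_types=None):
    """Return every ancestor of term reachable over the allowed relationship types.

    allowed_types=None traverses every relationship type, which mirrors the
    current, unconstrained Aspect behavior.
    """
    seen = set()
    queue = deque([term])
    while queue:
        current = queue.popleft()
        for rel_type, parent in PARENTS.get(current, []):
            if allowed_types is not None and rel_type not in allowed_types:
                continue  # skip links of a type the user did not ask for
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen
```

With this toy data, `ancestors("heart valve")` follows every link, while `ancestors("heart valve", {"part_of"})` only climbs the part_of hierarchy. Passing a set with several types corresponds to the "union of relationship types" case.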

Script Filtering

Occasionally, a user wants to design a filter that just can't be expressed in the limited language of per-term boolean matches (which is basically how the existing filtering specifications work, with the Aspect feature tacked on as a useful extension). I suggest that we allow script-based searches for hard-core power users. The script-based searches would be specified in the OSL scripting language. So, for example, imagine someone wanted to find terms whose names contain the word "development", and that have a part_of relationship to a parent term whose name DOESN'T contain the word "development". They could write a search this way...

 function filter(term) {
    if (term.getName().indexOfIgnoreCase("development") == -1)
        return false;
    for link in term.getParents() {
        if (link.getType().getID().equals("part_of") &&
            link.getParent().getName().indexOfIgnoreCase("development") == -1) {
           return true;
        }
    }
    return false;
 }

This kind of scripting is too complex for most users to bother with, but if a user DID write something like that, they could share their script with other users. Further, some of the schemes below allow simple searches to be combined to create more complex searches; in these cases, script-based searches could be dropped in as one of the simple searches, alongside the more conventional searches familiar to current OBO-Edit users.

Expression Filtering

It might be useful to provide a filtering mode where filters are specified as text expressions. The user would simply type in their query as a parsable text expression, like so:

  (not name contains "kinase") AND (ancestor(is_a, part_of) id equals "GO:xxxxxxx")
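To give a feel for how such text expressions might be parsed, here is a minimal recursive-descent sketch in Python. It is an illustration under several assumptions: the grammar is guessed from the single example above, only the `contains` and `equals` comparisons are handled, the `ancestor(...)` modifier is left out, and terms are represented as plain dictionaries. The names `tokenize`, `parse`, and `OPERATORS` are hypothetical.

```python
import re

# A token is a parenthesis, a double-quoted string, or a bare word.
TOKEN = re.compile(r'\s*(\(|\)|"[^"]*"|[A-Za-z_]+)')

OPERATORS = {
    "contains": lambda actual, wanted: wanted.lower() in actual.lower(),
    "equals": lambda actual, wanted: actual == wanted,
}


def tokenize(text):
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse(text):
    """Turn an expression into a predicate that takes a term dictionary."""
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos].lower() if pos < len(tokens) else None

    def take():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        pos += 1
        return tokens[pos - 1]

    def parse_or():
        left = parse_and()
        while peek() == "or":
            take()
            right = parse_and()
            left = (lambda a, b: lambda term: a(term) or b(term))(left, right)
        return left

    def parse_and():
        left = parse_not()
        while peek() == "and":
            take()
            right = parse_not()
            left = (lambda a, b: lambda term: a(term) and b(term))(left, right)
        return left

    def parse_not():
        if peek() == "not":
            take()
            inner = parse_not()
            return lambda term: not inner(term)
        return parse_atom()

    def parse_atom():
        if peek() == "(":
            take()
            inner = parse_or()
            if take() != ")":
                raise ValueError("expected ')'")
            return inner
        field, op, value = take(), take().lower(), take()
        if op not in OPERATORS or not value.startswith('"'):
            raise ValueError(f"malformed comparison near {field!r}")
        wanted = value[1:-1]  # strip the surrounding quotes
        return lambda term: OPERATORS[op](term[field], wanted)

    predicate = parse_or()
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    return predicate
```

For example, `parse('(not name contains "kinase") AND (namespace equals "GO")')` returns a function that can be applied to each term in turn. In this sketch NOT binds tighter than AND, which binds tighter than OR; a real design would need to settle that precedence explicitly.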

Interface Proposals

Changes to the Basic Query Designer

At its core, any query design system is going to need a component that is used to create simple searches (although these simple searches may later be used as building blocks for complex searches). In OBO-Edit, that simple component looks like this:

[Image: Oboedit current basic filter editor.gif]

There are a couple of problems with this interface:

  1. Nothing is labeled
  2. Some rarely used advanced controls (like NOT and the Aspect dropdown) are always visible
  3. The search value field is totally unconstrained, even though some search criteria require a very specific sort of value

Suggested improvements:

  1. Hide the NOT and Aspect controls initially. Small buttons in the top right corner of the panel can be used to display them if necessary
  2. Label each control
  3. Constrain the text box values

This last one is the most technically tricky, but probably the most useful. If possible, the text box will attempt to autocomplete the text typed by the user. The autocomplete values will be the list of all values for that criterion in the currently loaded ontology. For numeric criteria (like "Number of parents"), no autocompletion will be attempted, but the text box will be constrained so that only numeric values can be entered.
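The two kinds of constraint described above might be sketched as follows in Python. This is only an illustration: the function names are hypothetical, the list of values is assumed to have already been collected from the loaded ontology, and the actual interface would be built with OBO-Edit's own GUI toolkit rather than these helpers.

```python
from bisect import bisect_left


def build_index(values):
    """Sort the distinct values case-insensitively for prefix lookup."""
    return sorted(set(values), key=str.lower)


def autocomplete(index, typed, limit=10):
    """Return up to `limit` values that start with the typed text."""
    keys = [value.lower() for value in index]
    prefix = typed.lower()
    start = bisect_left(keys, prefix)  # first value that could match
    matches = []
    for value in index[start:]:
        if not value.lower().startswith(prefix) or len(matches) == limit:
            break
        matches.append(value)
    return matches


def is_valid_numeric_input(typed):
    """Accept only non-negative whole numbers, e.g. for "Number of parents"."""
    return typed.isascii() and typed.isdigit()
```

Because the index is sorted, all candidates for a prefix sit next to each other, so the lookup stays cheap even when a criterion has many distinct values.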

Changes to the Complex Query Designer

The complex query designer is most users' least favorite part of OBO-Edit. Many users are confused by the tree editor for complex queries, and the prefix notation for complex queries makes little sense to anyone who is not a computer scientist or mathematician.

The following are proposals for a new complex query interface. Note that we don't have to just pick one of these. We could provide any or all of these query designers, and users can choose the designer that best matches their taste.

Proposal 1: Set-Based Searching

In this proposal, users create a complex search by chaining together basic searches with "widen" and "narrow" operations (or "union" and "intersection" operations, if you prefer). The interface might look like this:

[Image: Oboedit widenarrow filter editor.gif]

The search shown above can be understood as:

  1. Perform the search "Name contains 'kinase'"
  2. Remove all search results that do not match "Namespace equals 'GO'"
  3. Add the results of the search "Is property"

This search is equivalent to: ((Name contains 'kinase') AND (Namespace equals 'GO')) OR Is property

This approach is called set-based searching, because it treats a complex search as a collection of intersection and union operations on various sets of search results.
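The chain of steps described above can be sketched with ordinary set operations. The Python below is illustrative only: the `TERMS` list, the three basic searches, and `run_chain` are hypothetical stand-ins, not OBO-Edit code.

```python
# Hypothetical toy data standing in for a loaded ontology.
TERMS = [
    {"name": "protein kinase activity", "namespace": "GO", "is_property": False},
    {"name": "kinase inhibitor", "namespace": "CHEBI", "is_property": False},
    {"name": "part_of", "namespace": "RO", "is_property": True},
    {"name": "nucleus", "namespace": "GO", "is_property": False},
]


def search(predicate):
    """Run one basic search and return the names of the matching terms."""
    return {term["name"] for term in TERMS if predicate(term)}


def run_chain(first, steps):
    """Apply 'narrow' (intersection) and 'widen' (union) steps in order."""
    results = search(first)
    for operation, predicate in steps:
        if operation == "narrow":
            results &= search(predicate)
        elif operation == "widen":
            results |= search(predicate)
        else:
            raise ValueError(f"unknown operation: {operation}")
    return results


def name_contains_kinase(term):
    return "kinase" in term["name"]


def namespace_is_go(term):
    return term["namespace"] == "GO"


def is_property(term):
    return term["is_property"]


results = run_chain(
    name_contains_kinase,
    [("narrow", namespace_is_go), ("widen", is_property)],
)
```

Each step only ever combines the running result with one new basic search, which is what keeps the interface simple.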

Advantages: Easy for users to understand, and the GUI is easy to build.

Disadvantages: Not as expressive as the current complex filter builder. When you translate this kind of set-based search into an expression tree, it always takes the following form:

         OP1
        /  \
     exp1  OP2
           / \
       exp2  OP3
             / \
         exp3  OP4
               / \
           exp4  exp5

where branching can only occur along one side of the tree. It's impossible to create a search like:

(Name contains "kinase" AND Namespace equals "GO") OR (Is property AND Namespace equals "RO")

because the expression tree requires complex branching:

            OR
         /      \
      AND        AND
      / \        / \
  exp1 exp2  exp3 exp4
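The limitation can be made concrete with a small Python sketch that represents expression trees as nested tuples. The representation and the helper names are assumptions made for illustration. The chain is folded in step order here, so the nesting grows on the left, whereas the first diagram above draws it on the right; either way only one side ever branches.

```python
from functools import reduce


def chain_to_tree(first, steps):
    """Fold a widen/narrow chain into a nested (op, left, right) tuple."""
    ops = {"narrow": "AND", "widen": "OR"}
    return reduce(lambda tree, step: (ops[step[0]], tree, step[1]), steps, first)


def evaluate(tree, results):
    """Evaluate a tree; leaves are names of basic searches in `results`."""
    if isinstance(tree, str):
        return results[tree]
    op, left, right = tree
    left_set, right_set = evaluate(left, results), evaluate(right, results)
    return left_set & right_set if op == "AND" else left_set | right_set


def is_one_sided(tree):
    """True if no operator node has operator nodes as both of its children."""
    if isinstance(tree, str):
        return True
    _, left, right = tree
    if not isinstance(left, str) and not isinstance(right, str):
        return False
    return is_one_sided(left) and is_one_sided(right)


chain = chain_to_tree("exp1", [("narrow", "exp2"), ("widen", "exp3")])
branching = ("OR", ("AND", "exp1", "exp2"), ("AND", "exp3", "exp4"))
```

Every tree produced by `chain_to_tree` passes `is_one_sided`, while `branching` does not, so no sequence of widen and narrow steps over these four basic searches can build it directly.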