AmiGO 2 Manual: Search: Difference between revisions

From GO Wiki
No edit summary
No edit summary
 
(One intermediate revision by the same user not shown)
Line 6: Line 6:


The largest initial difference, when compared to the traditional AmiGO search, is that results are returned incrementally as you type or manipulate the filters. It should be noted that the search only increments when typing when there are over three characters entered and when you are typing forward with standard characters (spaces, backspaces, and the like will not increment the search).
The largest initial difference, when compared to the traditional AmiGO search, is that results are returned incrementally as you type or manipulate the filters. It should be noted that the search only increments when typing when there are over three characters entered and when you are typing forward with standard characters (spaces, backspaces, and the like will not increment the search).
If you wish to search for an exact simple keyword, like ''nucleus'', you should either '''add a space after the word''' or use quotes around the word '''"nucleus"'''.


== Text filtering ==
== Text filtering ==


For our first example, let's say that you want to search for "pleckstrin". As you type it in, you will get results incrementally returned for "ple", "plec", and finally "pleckstrin". If you continued and entered:
For our first example, let's say that you want to search for ''pleckstrin''. As you type it in, you will get results incrementally returned for ''ple'', ''plec'', and finally ''pleckstrin''. If you continued and entered:


  pleckstrin domain
  pleckstrin domain


You would get a list of gene products that contain in their record "pleckstrin", "domain", or both (see [[#Boolean Operators (and/or/not)|#Boolean Operators]] below). Finally, if you want to search for the phrase "pleckstrin domain", where the words occur side by side, you would have to put the phrase in quotes:
You would get a list of annotation that contain in their record ''pleckstrin'', ''domain'', or both (see [[#Boolean Operators (and/or/not)|#Boolean Operators]] below). Finally, if you want to search for the phrase ''pleckstrin domain'', where the words occur side by side, you would have to put the phrase in quotes:


  "pleckstrin domain"
  "pleckstrin domain"


== Boolean Operators (and/or/not) ==
== Boolean Operators (and/or/-) ==


You may also use boolean logic and nesting in the search. For example, if you wanted to see all records that contain both "pleckstrin" and "domain" in them, you would enter:
You may also use boolean logic and nesting in the search. For example, if you wanted to see all records that contain both ''pleckstrin'' and ''domain'' in them, you would enter:


  pleckstrin and domain
  pleckstrin and domain


To exclude "domain" from "pleckstrin" results, you would enter:
To exclude ''domain'' from ''pleckstrin'' results, you would enter:


  pleckstrin and not domain
  pleckstrin and -domain


To get everything that has, "pleckstrin", "domain", or both, you would enter:
To get everything that has, ''pleckstrin'', ''domain'', or both, you would enter:


  pleckstrin or domain
  pleckstrin or domain
Line 36: Line 38:
  pleckstrin domain
  pleckstrin domain


Are functionally equivalent--spaces between words are considered to be an implicit "or". This is the reason that you need to quote phrases when you want them to appear together.
Are functionally equivalent--spaces between words are considered to be an implicit ''or''. This is the reason that you need to quote phrases when you want them to appear together.


We can also nest our boolean search with parentheses. If you wanted to search for "top" in conjuction with either "alpha" or "beta", you would enter:
We can also nest our boolean search with parentheses. If you wanted to search for ''top'' in conjunction with either ''alpha'' or ''beta'', you would enter:


  top and (alpha or beta)
  top and (alpha or beta)


For more about the results from "top" (and why there are not as many as you think), please see the next section.
For more about the results from ''top'' (and why there are not as many as you think), please see the next section.


== Wildcards ==
== Wildcards ==


For example, let's say that we search for "top". In our results, we can see that the fly gene that we were looking for, Topoisomerase 1 with the symbol Top1, is not in the results. Unlike the [[AmiGO_Manual:_OpenSearch | OpenSearch]] widgets available for AmiGO, Live Search does not automatically search for words where "top" is only part of it; you must explicitly instruct the search engine to look for these things by adding a wildcard at the end of the word: "*". By entering:
The default mode of the Search component is to apply a wildcard ("*") to the end of any search token that does not have a space after it.


  top*
== Filtering and Facets ==


You will now see the fly gene you were searching for returned as the first result. It is important to know that wildcards '''cannot''' be the first character in a word.
TODO
 
In addition to the '*' wildcard, there are several others that can do things like fuzzy searches or proximity. Please see [[#Advanced Usage]] below.
 
==Fields==
 
It is important to note that when searching, this new engine treats all of the information about a term or gene product as a single document by default. If you are interested in, say, the term "kinase activity" and entered it into the search box:
 
kinase activity
 
You might be disappointed to find that the first returned term is "JUN kinase kinase kinase activity". However, from the point of view of the search engine, with the triple kinase in the name and all of the kinase activity appearing in the synonyms, it seems like a really close match for what you are asking.
 
In order to narrow the search down to things that are just the phrase "kinase activity" in name of the term, you would enter:
 
  name:"kinase activity"
 
Similarly, for a gene_product, you might enter:
 
full_name:chocolate
 
While these two special fields may be the most common, there are a number of distinct special fields for terms and gene products. For a complete list of available fields, please see [[#Additional Search Fields]] below.
 
== Filtering ==


The term and gene product searches have different (and hopefully self-explanatory) filters that you can apply to them to reduce the number of returned results. Let's say that you've search for the phrase:
The annotations, terms, and gene product searches have different (and hopefully self-explanatory) filters that you can apply to them to reduce the number of returned results. Let's say that you've search for the phrase:


  "pleckstrin domain"
  "pleckstrin domain"
Line 80: Line 60:
And wish to just see the results for RGD. You can do this by clicking the "RGD" item in the "Data source" filter box. Multiple filters can be added by holding a control key (varies by operating system and browser) while clicking on filters. Any combination of filters may be added; to remove a filter set, click on "No filter" item, the first item in each filter set.
And wish to just see the results for RGD. You can do this by clicking the "RGD" item in the "Data source" filter box. Multiple filters can be added by holding a control key (varies by operating system and browser) while clicking on filters. Any combination of filters may be added; to remove a filter set, click on "No filter" item, the first item in each filter set.


== Filters and Facets ==
TODO


= Advanced Usage =
= Advanced Usage =
Line 91: Line 68:


TODO
TODO
See the schema details introspection tool.


== Fuzzy Searches and Levenshtein Distance ==
== Fuzzy Searches and Levenshtein Distance ==


This section is not yet complete. In the meantime, please see the [http://lucene.apache.org/java/1_4_3/queryparsersyntax.html Lucene documentation]. All of the documentation there should be valid for AmiGO.
This section is not yet complete. In the meantime, please see the [http://lucene.apache.org/java/2_9_4/queryparsersyntax.html Lucene documentation]. All of the documentation there should be valid for AmiGO.


==Exotic Searches==
==Exotic Searches==


In addition to the types of searches listed above, there are also more exotic types of search that involve ranges, proximity, optional wildcards, boosting, result weights, and others. Unfortunately, these are outside the scope of this document, but are part of the standard syntax. For a more in-depth look at these, please see the [http://lucene.apache.org/java/1_4_3/queryparsersyntax.html Lucene documentation].
In addition to the types of searches listed above, there are also more exotic types of search that involve ranges, proximity, optional wildcards, boosting, result weights, and others. Unfortunately, these are outside the scope of this document, but are part of the standard syntax. For a more in-depth look at these, please see the [http://lucene.apache.org/java/2_9_4/queryparsersyntax.html Lucene documentation].


= Example Usage =
= Example Usage =

Latest revision as of 04:12, 1 October 2015

Introduction

AmiGO 2 now provides a powerful method of rapidly searching using pre-computed indexes. In addition to speed, Search provides boolean operators, wildcards, and fuzzy searches.

Basic Usage

The largest initial difference, when compared to the traditional AmiGO search, is that results are returned incrementally as you type or manipulate the filters. Note that, while typing, the search only updates once over three characters have been entered, and only when you are typing forward with standard characters (spaces, backspaces, and the like will not update the search).

If you wish to search for an exact simple keyword, like nucleus, you should either add a space after the word or use quotes around the word "nucleus".

Text filtering

For our first example, let's say that you want to search for pleckstrin. As you type it in, you will get results incrementally returned for ple, plec, and finally pleckstrin. If you continued and entered:

pleckstrin domain

You would get a list of annotations whose records contain pleckstrin, domain, or both (see #Boolean Operators below). Finally, if you want to search for the phrase pleckstrin domain, where the words occur side by side, you would have to put the phrase in quotes:

"pleckstrin domain"
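The difference between the unquoted and the quoted form can be illustrated with a small sketch. This is a toy model of the matching rules described above, not AmiGO's implementation; the function names and the sample records are hypothetical.

```python
def matches_any(record: str, query: str) -> bool:
    """Unquoted query: the record matches if it contains at least one of the words."""
    words = record.lower().split()
    return any(term in words for term in query.lower().split())


def matches_phrase(record: str, phrase: str) -> bool:
    """Quoted query: the words must occur side by side, in the given order."""
    words = record.lower().split()
    target = phrase.lower().split()
    # Slide a window of len(target) words across the record.
    return any(words[i:i + len(target)] == target
               for i in range(len(words) - len(target) + 1))


# Hypothetical records, made up for illustration only.
records = [
    "pleckstrin homology domain binding",
    "pleckstrin domain containing protein",
    "SH3 domain binding",
]

any_hits = [r for r in records if matches_any(r, "pleckstrin domain")]
phrase_hits = [r for r in records if matches_phrase(r, "pleckstrin domain")]

print(len(any_hits))   # all three records contain "pleckstrin" or "domain"
print(phrase_hits)     # only the record with the two words side by side
```

In this toy data the unquoted query matches every record, while the quoted phrase matches only the second one.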

Boolean Operators (and/or/-)

You may also use boolean logic and nesting in the search. For example, if you wanted to see all records that contain both pleckstrin and domain in them, you would enter:

pleckstrin and domain

To exclude domain from pleckstrin results, you would enter:

pleckstrin and -domain

To get everything that has pleckstrin, domain, or both, you would enter:

pleckstrin or domain

Note that:

pleckstrin or domain
pleckstrin domain

are functionally equivalent: spaces between words are considered to be an implicit or. This is the reason that you need to quote phrases when you want them to appear together.

We can also nest our boolean search with parentheses. If you wanted to search for top in conjunction with either alpha or beta, you would enter:

top and (alpha or beta)

For more about the results from top (and why there may not be as many as you expect), please see the next section.
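One way to think about the boolean operators in this section is as set operations over the matching records. The sketch below is only an illustration of that reading, using hypothetical record ids and texts; it does not reflect how the search engine evaluates queries internally.

```python
# Hypothetical records (id -> text), made up for illustration only.
records = {
    1: "top alpha",
    2: "top beta",
    3: "top gamma",
    4: "alpha beta",
}


def having(word: str) -> set:
    """Return the ids of all records that contain the given word."""
    return {rid for rid, text in records.items() if word in text.split()}


everything = set(records)

# top and (alpha or beta): intersection with a union
top_and_alpha_or_beta = having("top") & (having("alpha") | having("beta"))

# top and -beta: intersection with the complement of "beta"
top_and_not_beta = having("top") & (everything - having("beta"))

# alpha or beta (equivalently "alpha beta", the implicit or): union
alpha_or_beta = having("alpha") | having("beta")

print(sorted(top_and_alpha_or_beta))  # [1, 2]
print(sorted(top_and_not_beta))       # [1, 3]
print(sorted(alpha_or_beta))          # [1, 2, 4]
```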

Wildcards

The default mode of the Search component is to apply a wildcard ("*") to the end of any search token that does not have a space after it.
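The rule above can be sketched as a small function. This is a hypothetical illustration, not the Search component's code: the function name is made up, and leaving input that already ends in a quote or a wildcard unchanged is an assumption based on the advice under Basic Usage to add a space or quotes for an exact keyword.

```python
def apply_default_wildcard(query: str) -> str:
    """Append "*" to the final token unless the user has closed it off."""
    # Empty input, or a trailing space after the last token: no wildcard.
    if not query or query[-1].isspace():
        return query
    # Assumption: a closing quote or an explicit "*" is left as typed.
    if query.endswith(('"', "*")):
        return query
    return query + "*"


print(apply_default_wildcard("nucleus"))     # nucleus*
print(apply_default_wildcard("nucleus "))    # unchanged: trailing space
print(apply_default_wildcard('"nucleus"'))   # unchanged: quoted
```

Under this reading, typing nucleus with no trailing space behaves like the prefix search nucleus*, which is why adding a space or quotes narrows the search to the exact keyword.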

Filtering and Facets

TODO

The annotations, terms, and gene product searches have different (and hopefully self-explanatory) filters that you can apply to them to reduce the number of returned results. Let's say that you've searched for the phrase:

"pleckstrin domain"

And wish to just see the results for RGD. You can do this by clicking the "RGD" item in the "Data source" filter box. Multiple filters can be added by holding a control key (varies by operating system and browser) while clicking on filters. Any combination of filters may be added; to remove a filter set, click on the "No filter" item, the first item in each filter set.


Advanced Usage

For full information about the query syntax and special characters, please see the Lucene documentation.

Additional Search Fields

TODO

See the schema details introspection tool.

Fuzzy Searches and Levenshtein Distance

This section is not yet complete. In the meantime, please see the Lucene documentation. All of the documentation there should be valid for AmiGO.
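As background for the section title, the Levenshtein distance between two strings is the minimum number of single-character insertions, deletions, and substitutions needed to turn one into the other. In the Lucene query syntax, a fuzzy search is requested by appending a tilde to a single term (for example, kinase~). The following is a minimal sketch of the distance itself using the standard dynamic-programming approach; it is not the code used by Lucene or AmiGO, and the function name is only illustrative.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the edit distance between strings a and b."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute ca with cb (or keep it)
            ))
        previous = current
    return previous[-1]


print(levenshtein("kinase", "kinese"))   # 1
print(levenshtein("kitten", "sitting"))  # 3
```

A fuzzy search treats terms within a small distance of the query term as matches, so a misspelling such as kinese can still find kinase.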

Exotic Searches

In addition to the types of searches listed above, there are also more exotic types of search that involve ranges, proximity, optional wildcards, boosting, result weights, and others. Unfortunately, these are outside the scope of this document, but are part of the standard syntax. For a more in-depth look at these, please see the Lucene documentation.

Example Usage

Example search flows using AmiGO 2 can be found here.