AmiGO 2 Manual: Search: Difference between revisions

From GO Wiki
No edit summary
(11 intermediate revisions by the same user not shown)
Line 1: Line 1:
This page gives some examples of advanced search usage in the [http://amigo2.berkeleybop.org/cgi-bin/amigo2/amigo/search AmiGO 2 beta]
= Introduction =


== Example Pages ==
AmiGO 2 now provides a powerful method of rapidly searching using pre-computed indexes. In addition to speed, Search provides boolean operators, wildcards, and fuzzy searches.


* [http://amigo2.berkeleybop.org/cgi-bin/amigo2/amigo/gene_product/PomBase:SPAC24B11.06c sty1 PomBase annotations]
= Basic Usage =
** See also [http://www.pombase.org/spombe/result/SPAC24B11.06c PomBase display]
* [http://amigo2.berkeleybop.org/cgi-bin/amigo2/amigo/term/GO:0022008 neurogenesis]
* [http://amigo2.berkeleybop.org/cgi-bin/amigo2/amigo/gene_product/MGI:MGI:1197518 Aatk mouse annotations]
* [http://amigo2.berkeleybop.org/cgi-bin/amigo2/amigo/term/GO:0030517 negative regulation of axon extension]


The largest initial difference from the traditional AmiGO search is that results are returned incrementally as you type or manipulate the filters. While you type, the search only updates once at least three characters have been entered and only when you are typing forward with standard characters (spaces, backspaces, and the like will not update the search).


== Annotations ==
If you wish to search for an exact simple keyword, like ''nucleus'', you should either '''add a space after the word''' or use quotes around the word '''"nucleus"'''.


For annotation queries, start at
== Text filtering ==
http://amigo2.berkeleybop.org/cgi-bin/amigo2/amigo/search


The default tab is "annotation"
For our first example, let's say that you want to search for ''pleckstrin''. As you type it in, you will get results incrementally returned for ''ple'', ''plec'', and finally ''pleckstrin''. If you continued and entered:


Note that in this tab, you are querying on a per-annotation basis
pleckstrin domain


=== Neurogenesis ===
You would get a list of annotations whose records contain ''pleckstrin'', ''domain'', or both (see [[#Boolean Operators (and/or/not)|#Boolean Operators]] below). Finally, to search for the phrase ''pleckstrin domain'', where the words occur side by side, put the phrase in quotes:


Motivation: find Wnt family proteins involved in neurogenesis
"pleckstrin domain"


# Start at http://amigo2.berkeleybop.org/cgi-bin/amigo2/amigo/search
== Boolean Operators (and/or/-) ==
# Type '''neurogenesis''' into the filter box
# Use arrows to browse through results
# Filter by source - open "source" filter on left and select MGI
# Click on [http://amigo2.berkeleybop.org/cgi-bin/amigo2/amigo/term/GO:0022008 neurogenesis] in one of the rows, taking you to the neurogenesis page
# Note that there are more results here - results include all descendants (isa, partof, regulates) of neurogenesis
# Open 'family' and select 'wnt related'


You may also use boolean logic and nesting in the search. For example, if you wanted to see all records that contain both ''pleckstrin'' and ''domain'' in them, you would enter:


=== Epithelia-related processes outside bilateria ===
pleckstrin and domain


Motivation: epithelial cells are often regarded to be eumetazoan innovations. Here we use AmiGO 2 to explore this hypothesis and check for potential errors in GO.
To exclude ''domain'' from ''pleckstrin'' results, you would enter:


# Start at http://amigo2.berkeleybop.org/cgi-bin/amigo2/amigo/search - the starting point is all annotations
pleckstrin and -domain
# Type '''epithel''' in the filter box to the left of the results. The results are very inclusive: anything that matches this string in any field, or any descendant of a term that matches this string. Approximately 10k annotations
# Open "taxon" facet
# Select "-" next to bilateria


Explore the results using the left and right arrows. At the time of writing there are some obviously erroneous annotations - can you spot them? You can filter out the IEA annotations using the "evidence" filter.
To get everything that has ''pleckstrin'', ''domain'', or both, you would enter:


=== Cell Cycle Phases (annotation extensions) ===
pleckstrin or domain


Motivation: The cell cycle phase terms may soon be separated from the main GO process hierarchy. Rather than annotating to these directly, they will be annotated to indirectly, through either pre-composed TG generated terms that use "during <phase>" as differentia, or as annotation extensions. We can see this now with many PomBase annotations.
Note that:


# Start at http://amigo2.berkeleybop.org/cgi-bin/amigo2/amigo/search
pleckstrin or domain
# Select "PomBase" in "source" facet
pleckstrin domain
# Open "annotation extension" facet
# Select "cell cycle phase" (should be near top)
# Explore different annotations
# Try a more specific phase in the extension facet - e.g. M phase


=== Neuron-related processes (annotation extensions) ===
are functionally equivalent; spaces between words are treated as an implicit ''or''. This is why you need to quote phrases when you want the words to appear together.


# Start at http://amigo2.berkeleybop.org/cgi-bin/amigo2/amigo/search - the starting point is all annotations
We can also nest our boolean search with parentheses. If you wanted to search for ''top'' in conjunction with either ''alpha'' or ''beta'', you would enter:
# Open "annotation extension" facet
# Click "more"
# Search (e.g. Ctrl+F in browser) for neuron and select "+"
# Explore deeper in neuron hierarchy via "extension" facet


== Gene Products ==
top and (alpha or beta)


''in progress''
For more about the results from ''top'' (and why there are not as many as you might expect), please see the next section.


For gene product queries, go to
== Wildcards ==
http://amigo2.berkeleybop.org/cgi-bin/amigo2/amigo/search
And select "Gene/products" or follow this [http://amigo2.berkeleybop.org/cgi-bin/amigo2/amigo/search?golr_class=bbop_bio link].


The default mode of the Search component is to apply a wildcard ("*") to the end of any search token that does not have a space after it.


=== Limb Morphogenesis ===
== Filtering and Facets ==


# Start at http://amigo2.berkeleybop.org/cgi-bin/amigo2/amigo/search - then select "gene products"
TODO
# Type "limb morphogenesis" in the filter field
# Open "involved in" and click "+" on "limb morphogenesis"
# ...


== Ontology ==
The annotation, term, and gene product searches have different (and hopefully self-explanatory) filters that you can apply to reduce the number of returned results. Let's say that you've searched for the phrase:


For ontology queries, go to
"pleckstrin domain"
http://amigo2.berkeleybop.org/cgi-bin/amigo2/amigo/search
And select "Ontology" or follow this [http://amigo2.berkeleybop.org/cgi-bin/amigo2/amigo/search?golr_class=bbop_ont link].


Alternatively you can query from the main search bar: http://amigo2.berkeleybop.org/cgi-bin/amigo2/amigo
and wish to see only the results for RGD. You can do this by clicking the "RGD" item in the "Data source" filter box. Multiple filters can be added by holding a control key (varies by operating system and browser) while clicking on filters. Any combination of filters may be added; to remove a filter set, click on the "No filter" item, the first item in each filter set.


=== neurogenesis ===


Motivation: explore neurogenesis
= Advanced Usage =


# Start at http://amigo2.berkeleybop.org/cgi-bin/amigo2/amigo
For full information about the query syntax and special characters, please see the [http://lucene.apache.org/java/1_4_3/queryparsersyntax.html Lucene documentation].
# Type "neurogenesis"
# As you are typing, autocomplete should provide suggestions, with the desired term at the top
# Select '''neurogenesis'''
# This should take you here: http://amigo2.berkeleybop.org/cgi-bin/amigo2/amigo/term/GO:0022008


From here you can:
== Additional Search Fields ==


# Explore annotations - use the facets on the left side to filter your search
TODO
# Explore neighboring terms - use "inferred tree view" for a compacted tree display


=== Non-nuclear chromosomes ===
See the schema details introspection tool.


# Start at http://amigo2.berkeleybop.org/cgi-bin/amigo2/amigo/search
== Fuzzy Searches and Levenshtein Distance ==
# Select "Ontology" tab
# Open "source" facet
# Select 'cell component'
# Open part-of/involved in facet
# Select + chromosome
# In part-of/involved in facet, select "-" next to nucleus (negative filter)
#


This section is not yet complete. In the meantime, please see the [http://lucene.apache.org/java/2_9_4/queryparsersyntax.html Lucene documentation]. All of the documentation there should be valid for AmiGO.


=== All obsolete terms in subsets ===
==Exotic Searches==


Motivation: ontology subset maintainers may wish to check if there are stale terms in subsets
In addition to the types of searches listed above, there are also more exotic types of search that involve ranges, proximity, optional wildcards, boosting, result weights, and others. Unfortunately, these are outside the scope of this document, but are part of the standard syntax. For a more in-depth look at these, please see the [http://lucene.apache.org/java/2_9_4/queryparsersyntax.html Lucene documentation].


# Start at http://amigo2.berkeleybop.org/cgi-bin/amigo2/amigo/search
= Example Usage =
# Select "Ontology" tab
# Open "subset" facet
# Select a GO slim - e.g. goslim_prok (note: some will be from other ontologies; use the "source" facet to eliminate these)
# Open "obsoletion" facet - you can see the distribution between obsolete / non-obsolete. Select "+" next to "true" (or "-" next to false)


Note you can also explore boolean combinations of slims - e.g. terms in both yeast and prok slim
Example search flows using AmiGO 2 can be found [[AmiGO_2_Manual:_Search_Examples | here]].


 
[[Category:Software]]
[[Category:AmiGO]]
[[Category:AmiGO_2_Manual]]
[[Category:AmiGO_2]]

Latest revision as of 04:12, 1 October 2015

Introduction

AmiGO 2 now provides a powerful method of rapidly searching using pre-computed indexes. In addition to speed, Search provides boolean operators, wildcards, and fuzzy searches.

Basic Usage

The largest initial difference from the traditional AmiGO search is that results are returned incrementally as you type or manipulate the filters. While you type, the search only updates once at least three characters have been entered and only when you are typing forward with standard characters (spaces, backspaces, and the like will not update the search).
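
The update rule described above can be sketched as a small predicate. This is only an illustration, not AmiGO's actual client code: the names should_refresh and MIN_CHARS are hypothetical, the threshold of three is an assumption taken from the ple example in the Text filtering section, and "standard character" is assumed to mean any printable, non-whitespace character.

```python
# Hypothetical threshold, inferred from the "ple" example in this manual.
MIN_CHARS = 3


def should_refresh(previous: str, current: str) -> bool:
    """Return True if the change from `previous` to `current` would update the results."""
    # Only forward typing counts: exactly one character appended to the old text.
    # This rules out backspaces, deletions, and edits in the middle of the text.
    if len(current) != len(previous) + 1 or not current.startswith(previous):
        return False
    typed = current[-1]
    # Spaces and other non-standard characters do not update the search.
    if typed.isspace() or not typed.isprintable():
        return False
    return len(current) >= MIN_CHARS


print(should_refresh("pl", "ple"))                  # True: third character typed
print(should_refresh("ple", "pl"))                  # False: backspace
print(should_refresh("pleckstrin", "pleckstrin "))  # False: trailing space
```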

If you wish to search for an exact simple keyword, like nucleus, you should either add a space after the word or use quotes around the word "nucleus".

Text filtering

For our first example, let's say that you want to search for pleckstrin. As you type it in, you will get results incrementally returned for ple, plec, and finally pleckstrin. If you continued and entered:

pleckstrin domain

You would get a list of annotations whose records contain pleckstrin, domain, or both (see #Boolean Operators below). Finally, to search for the phrase pleckstrin domain, where the words occur side by side, put the phrase in quotes:

"pleckstrin domain"
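
The difference between the unquoted and the quoted query can be illustrated with a short sketch. It is not how AmiGO or Lucene match text internally: the function matches and the sample record are hypothetical, matching is done on whole lowercase words, and the default trailing wildcard is ignored.

```python
import shlex


def matches(query: str, record: str) -> bool:
    """Return True if any query token (word or quoted phrase) occurs in the record."""
    words = record.lower().split()
    # shlex.split keeps a double-quoted phrase together as a single token.
    for token in shlex.split(query.lower()):
        phrase = token.split()
        n = len(phrase)
        # A phrase matches only where its words appear side by side, in order.
        if n and any(words[i:i + n] == phrase for i in range(len(words) - n + 1)):
            return True
    return False


record = "pleckstrin homology domain containing protein"  # made-up sample text
print(matches('pleckstrin domain', record))    # True: either word is enough
print(matches('"pleckstrin domain"', record))  # False: the words are not adjacent
```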

Boolean Operators (and/or/-)

You may also use boolean logic and nesting in the search. For example, if you wanted to see all records that contain both pleckstrin and domain in them, you would enter:

pleckstrin and domain

To exclude domain from pleckstrin results, you would enter:

pleckstrin and -domain

To get everything that has pleckstrin, domain, or both, you would enter:

pleckstrin or domain

Note that:

pleckstrin or domain
pleckstrin domain

are functionally equivalent; spaces between words are treated as an implicit or. This is why you need to quote phrases when you want the words to appear together.

We can also nest our boolean search with parentheses. If you wanted to search for top in conjunction with either alpha or beta, you would enter:

top and (alpha or beta)

For more about the results from top (and why there are not as many as you might expect), please see the next section.
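
To make the semantics of the operators in this section concrete, here is a minimal sketch of an evaluator for and, or, -, parentheses, and the implicit or between adjacent words. It is an assumption-laden toy, not the Lucene query parser: the function evaluate is hypothetical, and binds tighter than or (a choice made for this sketch), words are matched exactly, and the default wildcard is ignored.

```python
import re

# Tokens: parentheses, optionally negated quoted phrases, or bare words.
TOKEN_RE = re.compile(r'\(|\)|-?"[^"]*"|[^\s()"]+')


def evaluate(query: str, record: str) -> bool:
    """Evaluate a small boolean query against the words of one record."""
    words = record.lower().split()
    tokens = TOKEN_RE.findall(query.lower())
    pos = 0

    def has(phrase: str) -> bool:
        target = phrase.split()
        n = len(target)
        return n > 0 and any(words[i:i + n] == target for i in range(len(words) - n + 1))

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def parse_or() -> bool:
        nonlocal pos
        result = parse_and()
        while peek() is not None and peek() != ")":
            if peek() == "or":
                pos += 1  # explicit "or"; otherwise the space is an implicit "or"
            rhs = parse_and()  # always parse, so the tokens are consumed
            result = result or rhs
        return result

    def parse_and() -> bool:
        nonlocal pos
        result = parse_atom()
        while peek() == "and":
            pos += 1
            rhs = parse_atom()
            result = result and rhs
        return result

    def parse_atom() -> bool:
        nonlocal pos
        token = peek()
        if token is None or token == ")":
            raise ValueError("malformed query")
        pos += 1
        if token == "(":
            result = parse_or()
            if peek() != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
            return result
        if token.startswith("-") and len(token) > 1:
            return not has(token[1:].strip('"'))  # leading "-" excludes the term
        return has(token.strip('"'))

    result = parse_or()
    if pos != len(tokens):
        raise ValueError("unbalanced parenthesis")
    return result


print(evaluate("pleckstrin and -domain", "pleckstrin homology domain"))  # False
print(evaluate("top and (alpha or beta)", "top beta"))                   # True
```

With this sketch, evaluate("pleckstrin or domain", r) and evaluate("pleckstrin domain", r) return the same value for any record r, which mirrors the implicit or described above.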

Wildcards

The default mode of the Search component is to apply a wildcard ("*") to the end of any search token that does not have a space after it.
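
As a rough illustration of this default, the sketch below rewrites the raw input box text the way the rule reads. The function apply_default_wildcard is hypothetical and only mirrors the behaviour described in this manual (a trailing space or quotes suppress the wildcard); the real component may handle more cases.

```python
def apply_default_wildcard(raw_input: str) -> str:
    """Append "*" to the last token unless a trailing space or quote suppresses it."""
    # A trailing space marks the last word as complete: no wildcard is added.
    if not raw_input or raw_input[-1].isspace():
        return raw_input.strip()
    # A quoted term or an explicit wildcard is left as typed.
    if raw_input.endswith('"') or raw_input.endswith("*"):
        return raw_input
    return raw_input + "*"


print(apply_default_wildcard("nucl"))       # nucl*
print(apply_default_wildcard("nucleus "))   # nucleus
print(apply_default_wildcard('"nucleus"'))  # "nucleus"
```

Under this reading, top without a trailing space is sent as top*, while top followed by a space matches only the word top itself.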

Filtering and Facets

TODO

The annotation, term, and gene product searches have different (and hopefully self-explanatory) filters that you can apply to reduce the number of returned results. Let's say that you've searched for the phrase:

"pleckstrin domain"

and wish to see only the results for RGD. You can do this by clicking the "RGD" item in the "Data source" filter box. Multiple filters can be added by holding a control key (varies by operating system and browser) while clicking on filters. Any combination of filters may be added; to remove a filter set, click on the "No filter" item, the first item in each filter set.
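
The effect of combining filter sets can be sketched on in-memory records. Everything here is illustrative: the function apply_filters, the field names, and the sample records are made up, and the sketch assumes that values selected within one filter set are alternatives while different filter sets must all be satisfied, which this manual does not state explicitly.

```python
def apply_filters(records, active_filters):
    """Keep records that satisfy every active filter set.

    `active_filters` maps a field name to the set of selected values;
    an empty set plays the role of the "No filter" item for that field.
    """
    results = []
    for record in records:
        keep = True
        for field, accepted in active_filters.items():
            if accepted and record.get(field) not in accepted:
                keep = False
                break
        if keep:
            results.append(record)
    return results


# Hypothetical search results for "pleckstrin domain".
records = [
    {"id": "a1", "source": "RGD", "evidence": "IEA"},
    {"id": "a2", "source": "MGI", "evidence": "IDA"},
    {"id": "a3", "source": "RGD", "evidence": "IDA"},
]

rgd_only = apply_filters(records, {"source": {"RGD"}})
print([r["id"] for r in rgd_only])  # ['a1', 'a3']

rgd_ida = apply_filters(records, {"source": {"RGD"}, "evidence": {"IDA"}})
print([r["id"] for r in rgd_ida])   # ['a3']
```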


Advanced Usage

For full information about the query syntax and special characters, please see the Lucene documentation.

Additional Search Fields

TODO

See the schema details introspection tool.

Fuzzy Searches and Levenshtein Distance

This section is not yet complete. In the meantime, please see the Lucene documentation. All of the documentation there should be valid for AmiGO.

Exotic Searches

In addition to the types of searches listed above, there are also more exotic types of search that involve ranges, proximity, optional wildcards, boosting, result weights, and others. Unfortunately, these are outside the scope of this document, but are part of the standard syntax. For a more in-depth look at these, please see the Lucene documentation.

Example Usage

Example search flows using AmiGO 2 can be found here.