Galaxy

From GO Wiki

Latest revision as of 17:03, 29 April 2015

Current URL: http://galaxy.geneontology.org

NOTE: At this time the GO Galaxy instance is for demo purposes only. Use at your own risk. The data here may not be kept fully up to date.

The GO website lists more than 50 tools for doing GO-based analyses, but these are not well integrated. AmiGO/GOOSE offers some functionality, such as slimming, enrichment, and data extraction, but these operations are difficult to chain together. A simple integrated environment in which tools could be linked in workflows would be beneficial to the community.

The GO Galaxy instance is an environment for doing all your analyses in one place, including:

  • term enrichment
  • format conversion (e.g. obo to owl)
  • annotation file munging (e.g. mapping to slim)

Using GO Galaxy

Sample Term Enrichment genesets

For testing you can use some existing sample sets:

  1. Open "Term Enrichment -> Get test geneset"
  2. Select sample set - e.g. mouse obesity

You may also want to fetch the corresponding GAF for this set

  1. Open "Annotation -> Fetch GAFs"
  2. Select the appropriate taxon ID for your set. E.g. for the mouse set, enter 10090

Basic Term Enrichment

  1. First upload your gene set. Select "Get data > upload file".
    • See above for sample testset
  2. Bring in the latest version of GO into your instance. Select "OBO/OWL Tools" and "fetch ontology". Type "GO" in the box.
    1. TIP: when the ontology comes in click on "attributes" (the pen symbol) and rename the ontology to "GO"
  3. Bring the annotations (GAF) for your species into your instance. Select "Annotation > Fetch associations"
    • The only field that needs to be filled in is the taxon ID (some common ones are listed at the bottom of the page)
    • This is going to be improved so that the species name (Latin) or common name can be typed in
  4. Select "Term enrichment / GO termfinder"
    1. Enter the number of background genes in the genome, or an estimate
      • This is either the estimated number of genes in the genome or, for an array, the number of genes on the chip
    2. Make sure your uploaded genes go into the "genes" selection
    3. Make sure GO and your GAF go into the respective fields
    4. Execute

The result is a human-readable report file.
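
To make the role of the background gene count concrete, here is a minimal sketch of a hypergeometric tail probability, a statistic commonly used for term enrichment. Whether the Galaxy tool computes exactly this value is an assumption, and the function and parameter names are hypothetical.

```python
from math import comb


def enrichment_p_value(background_size, annotated_in_background,
                       sample_size, annotated_in_sample):
    """Probability of seeing at least `annotated_in_sample` annotated genes
    when `sample_size` genes are drawn without replacement from a background
    of `background_size` genes, of which `annotated_in_background` carry
    the term (hypergeometric upper tail)."""
    upper = min(sample_size, annotated_in_background)
    total = comb(background_size, sample_size)
    tail = sum(
        comb(annotated_in_background, k)
        * comb(background_size - annotated_in_background, sample_size - k)
        for k in range(annotated_in_sample, upper + 1)
    )
    return tail / total


# Toy numbers only: 10 background genes, 5 annotated to the term,
# 5 genes in the uploaded set, all 5 annotated to the term.
p = enrichment_p_value(10, 5, 5, 5)
```

A larger background count makes the same overlap look more surprising, which is why the estimate entered in the tool matters.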

Term Enrichment using Ontologizer

Follow the steps above, with the following additions and modifications:

  1. Upload a population gene set.
    1. If you wish to consider all annotated genes, use "Text manipulation / cut" to select col3 from your GAF
  2. Select "Term enrichment / ontologizer"
    1. There are a variety of methods; consult the Ontologizer site for details. Try Parent-Child and Bonferroni as defaults
    2. Execute

Note that this produces two outputs - a tabular file that can be processed further and a visual display with enriched terms highlighted
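
The "cut col3" step above can be sketched outside Galaxy as follows. This is an illustration only: it assumes a tab-separated GAF in which comment lines start with "!" and column 3 holds the gene symbol, and the function name and sample rows are made up.

```python
def population_from_gaf(lines):
    """Collect the distinct values of column 3 from GAF lines,
    skipping comment lines (starting with '!') and blank lines."""
    symbols = set()
    for line in lines:
        if line.startswith("!") or not line.strip():
            continue
        columns = line.rstrip("\n").split("\t")
        if len(columns) >= 3:
            symbols.add(columns[2])
    return sorted(symbols)


# Made-up rows, not real annotations.
example_gaf = [
    "!gaf-version: 2.0\n",
    "DB\tID:1\tgeneA\t\tGO:0000001\n",
    "DB\tID:1\tgeneA\t\tGO:0000002\n",
    "DB\tID:2\tgeneB\t\tGO:0000001\n",
]
population = population_from_gaf(example_gaf)
```

Each gene appears once in the result even if it has several annotations, which is what a population set needs.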

Map2Slim

Mapping an annotation file to a slim (ontology subset) is a common operation

  1. Load an ontology (e.g. GO) and a GAF file - follow the steps above
  2. Select "Annotation > map2slim"
  3. Enter the name of your slim. This must be declared in the ontology file.
  4. Execute

The header of the processed GAF states that this is a mapped GAF, and tells you how many terms were mapped.

Note that the result is a valid GAF file and can be used anywhere a GAF can be used, including Term Enrichment, above
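
As a rough illustration of what mapping to a slim means, the sketch below replaces a term with its most specific ancestors that belong to the slim. It is a simplified reading of the idea, not the map2slim tool's actual implementation. The toy term IDs, the `parents` dictionary layout, and the function names are all assumptions.

```python
def ancestors(term, parents):
    """Return every term reachable from `term` by following parent links."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def map_to_slim(term, parents, slim):
    """Return the most specific slim terms that are `term` or its ancestors."""
    candidates = ({term} | ancestors(term, parents)) & slim
    redundant = set()
    for candidate in candidates:
        # A slim term is redundant if a more specific candidate already implies it.
        redundant |= ancestors(candidate, parents) & candidates
    return candidates - redundant


# Toy ontology: T:4 -> T:3 -> T:2 -> T:1 (child -> parent).
toy_parents = {"T:4": ["T:3"], "T:3": ["T:2"], "T:2": ["T:1"]}
toy_slim = {"T:1", "T:2"}
mapped = map_to_slim("T:4", toy_parents, toy_slim)
```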

Map2Slim, custom, using AmiGO

This is similar to the above step, except instead of a pre-defined slim in the GO, we will generate our own set of IDs and use that

  1. Open "GOtools > AmiGO"
  2. Select the "Ontology" tab
  3. Open "source" and select "biological_process"
  4. Type "metabolic process" in the filter box to start your search
  5. Open isa/partof/regulates and click "+" next to "metabolic process" (should be ranked around 2nd)
  6. In "found entities" click "download to galaxy" (if not present, click to download to your hard drive, save, and upload in Galaxy)
  7. In Galaxy, select "Annotation > map2slim, idlist"
    1. Select the file you uploaded
    2. Follow the steps for map2slim above

The result will be a GAF that is mapped to your terms

Map2Slim, custom, using a query

This is similar to the above step, but uses the GAF filter tool instead

  1. If you do not have a GAF in your workspace, get one using "Annotation -> fetch associations"
  2. Open "Annotation -> GAF filter"
  3. Enter a GO ID or term label (label must be exact). E.g. "metabolic process"
  4. Ensure your GAF is added
  5. Execute

The result will be a GAF that only includes annotations to the specified term or descendants.

Note that you can use this to filter by ontology - use a root term - e.g. "biological_process"

Variations:

To do a union query, you can run this multiple times and use "Annotation -> Merge GAFs"
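
The filter and union steps can be pictured with a short sketch. It assumes column 5 of each GAF row holds the GO ID and uses a toy child-to-parents dictionary. The function names are hypothetical and are not the Galaxy tools' internals.

```python
def term_and_descendants(root, parents):
    """Return `root` plus every term that has `root` as an ancestor."""
    children = {}
    for child, term_parents in parents.items():
        for parent in term_parents:
            children.setdefault(parent, set()).add(child)
    keep = {root}
    stack = [root]
    while stack:
        for child in children.get(stack.pop(), ()):
            if child not in keep:
                keep.add(child)
                stack.append(child)
    return keep


def filter_gaf(lines, root, parents):
    """Keep rows annotated to `root` or one of its descendants (column 5)."""
    keep = term_and_descendants(root, parents)
    kept = []
    for line in lines:
        columns = line.rstrip("\n").split("\t")
        if not line.startswith("!") and len(columns) >= 5 and columns[4] in keep:
            kept.append(line)
    return kept


def merge_gafs(*gafs):
    """Union of several filtered GAFs, dropping exact duplicate rows."""
    merged, seen = [], set()
    for gaf in gafs:
        for line in gaf:
            if line not in seen:
                seen.add(line)
                merged.append(line)
    return merged


# Toy data: T:2 and T:3 are children of T:1; T:9 is unrelated.
toy_parents = {"T:2": ["T:1"], "T:3": ["T:1"]}
rows = ["DB\tID:1\tgeneA\t\tT:2\n", "DB\tID:2\tgeneB\t\tT:9\n"]
under_t1 = filter_gaf(rows, "T:1", toy_parents)
union = merge_gafs(under_t1, filter_gaf(rows, "T:9", toy_parents))
```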

Converting between OBO and OWL

  1. Bring in the latest version of the ontology into your instance. Select "OBO/OWL Tools" and "fetch ontology". Type the ontology ID space (e.g. MP) in the box.
  2. Select "OBO/OWL Tools > obo2owl"
  3. Execute

Filtering Relationship Types

Relationship types may need to be filtered to create a DAG that annotations can be propagated over

  1. Bring in the latest version of the ontology into your instance.
  2. Select "OBO/OWL Tools > filter relationship types"
  3. Click add "new relation"
  4. Write relation shorthand ID - e.g. part_of
  5. Repeat for other relations (note: is_a is always included)
  6. Execute
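
A minimal sketch of the filtering step: edges are modelled as (child, relation, parent) triples, and is_a is always kept, as noted above. The triple representation and the toy IDs are assumptions made for illustration.

```python
def filter_relations(edges, keep_relations):
    """Keep only edges whose relation is requested; is_a is always kept."""
    allowed = set(keep_relations) | {"is_a"}
    return [edge for edge in edges if edge[1] in allowed]


# Toy edges only.
toy_edges = [
    ("T:2", "is_a", "T:1"),
    ("T:3", "part_of", "T:1"),
    ("T:4", "regulates", "T:1"),
]
dag_edges = filter_relations(toy_edges, ["part_of"])
```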

Mapping identifiers

  1. Load in a set of identifiers, e.g. UniProt accessions
  2. Select "Annotation > ID map"

Advanced Ontology Operations

Unfolding pre-composed terms

With the introduction of GAF v2, annotators can refine annotation terms by using annotation extensions. For example, a given biological process can have an "occurs in" extension to a cell type or a "during" extension to a cell cycle phase.

Many GO terms are pre-coordinated with equivalence axioms, meaning they can be expanded or unfolded to use a more basic term and a set of extensions.

To unfold annotations:

  1. Upload a GAF or use "Annotations > fetch annotations" to obtain annotations from a particular source.
  2. Select "Annotations advanced -> Unfold"
  3. Add your GAF
  4. Execute

The ontology is selected automatically. By default the minimal graph of all OBO ontologies reachable from any GO term is used.
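
Conceptually, unfolding swaps a pre-composed term for its base term plus extensions. The sketch below assumes a hand-written lookup table standing in for the equivalence axioms and joins extensions with commas in relation(ID) form. The IDs, table layout, and function name are hypothetical, and the real tool derives this information from the ontology.

```python
def unfold(term_id, extension, equivalences):
    """Replace a pre-composed term by its base term plus extensions.

    `equivalences` maps a term ID to (base_term_id, [(relation, filler_id), ...]).
    Terms without an entry are returned unchanged.
    """
    if term_id not in equivalences:
        return term_id, extension
    base, relations = equivalences[term_id]
    added = [f"{relation}({filler})" for relation, filler in relations]
    parts = ([extension] if extension else []) + added
    return base, ",".join(parts)


# Hypothetical axiom: X:0000010 is equivalent to X:0000001 occurring in Y:0000002.
toy_equivalences = {"X:0000010": ("X:0000001", [("occurs_in", "Y:0000002")])}
unfolded = unfold("X:0000010", "", toy_equivalences)
```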

Source / Local installations

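We maintain the GO/OBO Galaxy tool configurations and helper scripts as a fork of the main galaxy-dist repository on Bitbucket.

  • galaxy-obo: https://bitbucket.org/cmungall/galaxy-obo/overview
  • galaxy-dist: https://bitbucket.org/galaxy/galaxy-dist

We are currently exploring the best way to migrate this to toolsheds.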

Installing locally

1. Install Mercurial
2. Type:
  hg clone https://bitbucket.org/cmungall/galaxy-obo
3. Follow the instructions in README-galaxy-obo.txt
  more README-galaxy-obo.txt

Term Enrichment Format

See Is there a standard format for TE results?

  • TERF project on googlecode

Publications

Erick Antezana, Aravind Venkatesan, C J Mungall, Vladimir Mironov, Martin Kuiper (2010) ONTO-ToolKit: enabling bio-ontology engineering via Galaxy, BMC Bioinformatics 11 (Suppl 12), http://www.biomedcentral.com/1471-2105/11/S12/S8