AmiGO Manual: Installation 1.7

From GO Wiki
Revision as of 19:10, 24 July 2009 by Sjcarbon (talk | contribs)

Warning: the config.pl files and variables in this document still reflect version 1.5.

Overview

This document is intended for AmiGO version 1.6 (the current public release).

AmiGO, at its heart, is a simple perl CGI script. However, behind the simple external interface lies a somewhat baroque system of file caches, database connections, temporary files, and delegation. To set all of these things up properly, AmiGO provides four scripts:

  • version.pl : a script to check the environment's perl libraries and make suggestions about what should still be installed. This script should probably be run first if you have never tried to install this version of AmiGO before.
  • simulate.pl : similar to (and a likely replacement of) version.pl. Instead of recursively parsing files and seeing what the requirements look like, this script actually runs the various perl scripts in the perl interpreter and tries to catch what is necessary. The developers are switching over to this.
  • install.pl : an installation script that asks questions about the user's environment and tries to catch the biggest errors that one can make during installation.
  • refresh.pl : this final script creates all of the caches for the various subsystems. Required (as opposed to recommended, but technically optional) caches are also created during the installation process (install.pl). This script is useful for refreshing caches and taking some of the load off of the processing needed for Reference Genome subsystems (which wouldn't really be necessary for most installations of AmiGO).

All of the scripts will give options and help with the "perldoc <scriptname>" command.

The rest of this document is intended to help fill the gaps between the cryptic install scripts and what the developers know because they wrote them; no installation system is perfect.

As a concrete example, at the end is a HOWTO for the installation of AmiGO from a completely fresh Ubuntu 9.04 desktop system.

Download

The most recent stable version of AmiGO should be available as part of the go-dev repository on the GO CVS site at SourceForge.net: http://sourceforge.net/projects/geneontology

The SourceForge CVS repository can be found at: geneontology.cvs.sourceforge.net:/cvsroot/geneontology go-dev (this is a change from a couple of years ago). I do not think that you need any subdirectories of go-dev other than go-perl, go-db-perl, and amigo.

You also need a GO database you can connect to via DBD::mysql (which you might already have if you have an old amigo installation).

Requirements

GO database

There are numerous ways of loading GO into a MySQL database for AmiGO to use. Below, two of the most common will be covered. For more detailed information, please see the main GO database pages.

Database setup beyond these two methods is outside the scope of this document; for more information about installing a GO database, see the online documentation. For permission issues with connecting to your GO database, check with your friendly system admin.

Loading by script

Probably the easiest way of getting a GO database to work with is using a perl script that is provided in the go-dev distribution (see above): go-dev/go-db-perl/scripts/go_db_install.pl. Usage and examples are given by:

go-dev/go-db-perl/scripts/go_db_install.pl -h

For example, the following incantation will load the latest lite database dump into a database called go_latest_lite on localhost:

go-dev/go-db-perl/scripts/go_db_install.pl -i -e go_latest_lite -v -d localhost

The following example loads the latest database dump into a database called go_latest on localhost:

go-dev/go-db-perl/scripts/go_db_install.pl -v -d localhost

This method is also very easy to put into a crontab.
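
As a rough illustration of the crontab route, the sketch below prints a crontab entry that would reload the lite dump every Sunday at 03:30. The schedule, the helper name go_db_cron_line, and the /path_to_go-dev prefix are assumptions made for the example; the go_db_install.pl flags are copied from the lite example above.

```shell
# Hypothetical helper: print a crontab entry that reloads the lite dump.
# $1 is the path to the local go-dev checkout (an assumption; adjust to taste).
go_db_cron_line() {
    # Schedule fields: minute hour day-of-month month day-of-week.
    printf '%s %s/go-db-perl/scripts/go_db_install.pl -i -e go_latest_lite -v -d localhost\n' \
        '30 3 * * 0' "$1"
}

# Print the entry; paste it into "crontab -e" once it looks right.
go_db_cron_line /path_to_go-dev/go-dev
```

Printing the line rather than installing it leaves the existing crontab untouched until you have checked the path and schedule.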

Direct manual loading

The following instructions can be used to create a GO database for AmiGO to use.

  • Download a database dump from http://archive.geneontology.org/ ; make sure that the file name ends with "-data.gz". In this example, we'll call this file go_200XXXXX-seqdblite-data.gz.
  • Unzip the database dump file.
  • Using your favorite MySQL client, create a database. In this example we'll call it go_200XXXXX. Using the default MySQL client, the command would be:
CREATE DATABASE go_200XXXXX;
  • From the command line, load the database dump file into the database:
mysql go_200XXXXX < go_200XXXXX-seqdblite-data
  • Done!
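
The manual steps above can be sketched as a small shell function. This is only a sketch: the function name load_go_dump and the RUN switch are made up for this example, and it assumes that the mysql client can already connect without extra user, password, or host options. By default it only prints the commands it would run.

```shell
# Hypothetical helper wrapping the manual loading steps.
# $1: the downloaded dump (for example go_200XXXXX-seqdblite-data.gz)
# $2: the name of the database to create (for example go_200XXXXX)
# Dry run by default; set RUN=1 in the environment to execute the commands.
load_go_dump() {
    dump_gz="$1"
    db_name="$2"
    dump="${dump_gz%.gz}"    # name of the file once it has been unzipped
    if [ "${RUN:-0}" = "1" ]; then
        gunzip "$dump_gz" &&
        mysql -e "CREATE DATABASE $db_name;" &&
        mysql "$db_name" < "$dump"
    else
        echo "gunzip $dump_gz"
        echo "mysql -e \"CREATE DATABASE $db_name;\""
        echo "mysql $db_name < $dump"
    fi
}

# Show the commands for the placeholder names used in the steps above.
load_go_dump go_200XXXXX-seqdblite-data.gz go_200XXXXX
```

Chaining the real commands with && stops the run at the first failure, so a failed CREATE DATABASE is not followed by a load attempt.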

Loading Annotations

While the usual GO database dumps found at http://archive.geneontology.org include many useful annotations, users may also load their own annotations into their local GO databases. To accomplish this, the go-dev distribution comes with many different scripts to manage association files. The easiest to use is probably go-dev/go-db-perl/scripts/load-go-into-db.pl .

The following incantation would load a gene association file (ga_file.gz) into the my_go_db database on localhost:

GO_ROOT=/path_to_go-dev/go-dev perl ./load-go-into-db.pl -d my_go_db -h localhost -datatype go_assoc -fill_count ga_file.gz

If the user does not have the perl DBIx::DBStag, go-perl, or go-db-perl modules installed and/or in their path, the same incantation would look like:

GO_ROOT=/path_to_go-dev/go-dev perl -I /path_to_go-dev/go-dev/go-db-perl -I /path_to_go-dev/go-dev/go-perl -I /path_to_dbixstag/DBIx-DBStag-0.09 ./load-go-into-db.pl -d my_go_db -h localhost -datatype go_assoc -fill_count ga_file.gz

Web server

AmiGO does not provide its own web server; it is currently developed and run on Apache. There are some experimental components that run their own web server, but it is unlikely that these will concern most people trying to install AmiGO.

For what it's worth, AmiGO is currently developed on various sub-versions of Apache 2.2 and has its main public deployment on Apache 2.0.

BLAST

If you are interested in having the AmiGO wrapper for BLAST, you will need to download and install WU BLAST. You will also need a FASTA file from the Stanford GO archive.

GraphViz

Currently, AmiGO also depends on having GraphViz in its execution path--specifically the "dot" program.

Perl

As far as the perl environment goes, the major packages that are necessary are: CGI::Application, GO::TermFinder, Template, CGI, DBI, DBD::mysql, GraphViz, bioperl, go-perl, and go-db-perl (the last two are included with the go-dev repository and are sometimes treated differently because of this--AmiGO is usually run as part of the complete go-dev repository). The vast majority of these should be available as packages in your distribution. Otherwise, you will have to install them manually through CPAN.
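
For a quick sanity check before running the scripts described below, something like the following sketch can report which of the requirements named so far are visible from the current shell. It is not a substitute for version.pl or simulate.pl. The helper names have_cmd and have_perl_module are invented for this example, and only the packages that are listed above by module name are tested.

```shell
# Succeed if a program can be found on the PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Succeed if perl can load the named module (fails if perl itself is missing).
have_perl_module() {
    have_cmd perl && perl -M"$1" -e 1 >/dev/null 2>&1
}

# Programs: perl itself and the GraphViz "dot" binary.
for cmd in perl dot; do
    if have_cmd "$cmd"; then echo "ok       $cmd"; else echo "MISSING  $cmd"; fi
done

# Modules as named above; bioperl, go-perl, and go-db-perl are left out
# because this sketch does not assume a particular module name for them.
for mod in CGI::Application GO::TermFinder Template CGI DBI DBD::mysql GraphViz; do
    if have_perl_module "$mod"; then echo "ok       $mod"; else echo "MISSING  $mod"; fi
done
```

Keep in mind that the web server may run with a different PATH and perl library path than your login shell, so a clean report here is only a first indication.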

version.pl

There is a script in go-dev/amigo called version.pl that can be used to get a more detailed opinion of what the developers feel is necessary to run AmiGO (there are many old, unnecessary, and experimental libraries that might confuse things). The script must be run from the go-dev/amigo directory.

simulate.pl

As mentioned above, simulate.pl may be the new version.pl. We're still trying it out, but so far so good.

install.pl

Once you have the software and the requirements are met, go to the amigo directory and run install.pl (the "-h" flag will give you usage details). Depending on your environment, you may want to first set the following ENV variables:

setenv GO_ROOT <path_to_go-dev_source_dir>
setenv PATH /tools/perl/5.8.8/bin:${PATH}:${GO_ROOT}/go-perl/scripts
setenv PERLLIB $GO_ROOT/go-perl:$GO_ROOT/go-db-perl:$GO_ROOT/amigo/perl
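
The setenv lines above are csh/tcsh syntax. For users of sh or bash, a rough equivalent might look like the sketch below. The fallback value /path_to_go-dev/go-dev is only a placeholder for your own checkout, and /tools/perl/5.8.8/bin is the site-specific perl location from the lines above, which you will probably want to change or drop.

```shell
# Use GO_ROOT from the environment if set; otherwise fall back to a placeholder.
GO_ROOT="${GO_ROOT:-/path_to_go-dev/go-dev}"
export GO_ROOT

# Put the site-specific perl first and the go-perl scripts last on the PATH.
PATH="/tools/perl/5.8.8/bin:${PATH}:${GO_ROOT}/go-perl/scripts"
export PATH

# Let perl find the go-perl, go-db-perl, and AmiGO libraries.
PERLLIB="$GO_ROOT/go-perl:$GO_ROOT/go-db-perl:$GO_ROOT/amigo/perl"
export PERLLIB
```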

When you run install.pl, it asks about two dozen questions about the installation configuration. The questions may be a little cryptic (the "-v" flag can be helpful here). During a successful installation run, a "config.pl" file is created in the same directory as install.pl; it is the basis for all installation attempts after the first.

The "-r" option will overwrite config.pl if it exists and use its contents as defaults for a new round of interactive questioning.

The "-i" option will ignore config.pl if it exists and use the internal variables as the defaults for a new round of interactive questioning.

The "-f <filename>" option will read in <filename>, write a new config.pl, and continue installation as normal. This is useful if you have multiple AmiGO configurations that you're trying to juggle.

You can also edit config.pl manually and rerun install.pl with no arguments at all; by default, it will use whatever is in config.pl. This is how I generally do things. (I also typically copy known good configurations elsewhere for use with the "-f" option.)

Using the "-v" option in conjunction with any of the other options may give you a better idea of what variables are being targeted and what files are read.
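
One possible way to juggle several configurations with the "-f" option is sketched below. The storage directory, the file naming scheme, and the helper names save_config and install_cmd_for are all assumptions made for this example; only the "-f" and "-v" flags come from the text above.

```shell
# Hypothetical place to keep known good configurations.
CONFIG_STORE="${CONFIG_STORE:-$HOME/amigo-configs}"

# Save the config.pl in the current directory under a short name, e.g. "dev".
save_config() {
    mkdir -p "$CONFIG_STORE" && cp config.pl "$CONFIG_STORE/$1.config.pl"
}

# Print the install.pl invocation that would reinstall from a saved configuration.
install_cmd_for() {
    echo "./install.pl -v -f $CONFIG_STORE/$1.config.pl"
}
```

For example, after a successful run in go-dev/amigo, "save_config dev" would store the current config.pl, and "install_cmd_for dev" would print the command to install from it again later.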

Google Analytics

To enable Google Analytics in AmiGO, create a file called .analytics.json in the AmiGO cgi-bin directory and enter your id as below:

{"google_analytics_id":"UA-XXXXXXX-X"}
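
If you would rather create the file from the command line, a sketch along the following lines should do. The function name write_analytics_file is invented for this example, and you have to supply the cgi-bin directory of your own installation.

```shell
# Hypothetical helper: write .analytics.json into the given cgi-bin directory.
# $1: the AmiGO cgi-bin directory of your installation
# $2: your Google Analytics id (for example UA-XXXXXXX-X)
write_analytics_file() {
    printf '{"google_analytics_id":"%s"}\n' "$2" > "$1/.analytics.json"
}
```

With the values from the example configuration below, the directory would presumably be /srv/www/cgi-bin/amigo, but that is an assumption about how the paths are combined.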

Example config.pl files

While you can construct a config.pl file from scratch, it is highly recommended that you let the script create the file the first time, and then you may modify the values manually and rerun the script.

Be prepared to accept the defaults or have an answer for each of the questions that install.pl asks. Obviously, the path names are the most important.

Example file #1

These are the values used by one of the developers at Berkeley.

config.pl

$ENV{GO_ROOT}='/home/sjcarbon/local/src/cvs/go-dev';
$ENV{GO_DBNAME}='go_new';
$ENV{GO_DBHOST}='localhost';
$ENV{GO_DBUSER}='';
$ENV{GO_DBAUTH}='';
$ENV{GO_DBPORT}='';
$ENV{GO_DBSOCKET}='';
$ENV{GO_HAS_COUNT_BY_SPECIES}='1';
$ENV{AMIGO_PROJECT_NAME}='amigo';
$ENV{AMIGO_HTDOCS_PARTIAL_PATH}='/srv/www/htdocs';
$ENV{AMIGO_HTDOCS_PARTIAL_URL}='http://localhost';
$ENV{AMIGO_CGI_PARTIAL_PATH}='/srv/www/cgi-bin';
$ENV{AMIGO_CGI_PARTIAL_URL}='http://localhost/cgi-bin';
$ENV{AMIGO_SHOW_GP_OPTIONS}='1';
$ENV{AMIGO_SHOW_GRAPHVIZ}='1';
$ENV{AMIGO_DOT_PATH}='/usr/bin/dot';
$ENV{AMIGO_SHOW_BLAST}='1';
$ENV{AMIGO_FASTA_DB}='/www/toy_9002/cgi-bin/data/go_20071106-seqdblite.fasta';
$ENV{AMIGO_BLASTP}='/share/bdgp64/wublast/blastp';
$ENV{AMIGO_BLASTX}='/share/bdgp64/wublast/blastx';
$ENV{AMIGO_BLAST_METHOD}='cgi';
$ENV{AMIGO_QSUB}='/usr/local/command';
$ENV{AMIGO_QUEUE}='/usr/local/queue';
$ENV{AMIGO_PBS_USER}='nobody';
$ENV{AMIGO_MAX_SEQ_NUM}='100';
$ENV{AMIGO_MAX_SEQ_LENGTH}='3000000';
$ENV{AMIGO_SHOW_GOOSE_LINKS}='1';
$ENV{AMIGO_USE_DEFAULT_AMIGO_FILTERS}='1';
$ENV{AMIGO_SHOW_ONT_FILTER}='1';
$ENV{AMIGO_SHOW_TAXID_FILTER}='1';
$ENV{AMIGO_SHOW_SPECIESDB_FILTER}='1';
$ENV{AMIGO_SHOW_EVCODE_FILTER}='1';
$ENV{AMIGO_SHOW_GPTYPE_FILTER}='1';
$ENV{AMIGO_SHOW_ASSBY_FILTER}='0';
$ENV{AMIGO_SHOW_QUAL_FILTER}='0';
$ENV{AMIGO_TEMPLATE_PATHS}='templates/pages:templates/includes';
$ENV{AMIGO_SESSION_DIR}='sessions';
$ENV{AMIGO_MAX_SESSIONS}='200';
$ENV{AMIGO_SESSION_TIMEOUT}='7200';
$ENV{AMIGO_PAGE_SIZE}='50';
$ENV{AMIGO_MAX_RESULTS_HTML}='200';
$ENV{AMIGO_MAX_RESULTS_DOWNLOAD}='1000';
$ENV{AMIGO_CALCULATE_GP_COUNTS}='0';
$ENV{AMIGO_CALCULATE_TERM_COUNTS}='0';
$ENV{AMIGO_GET_RELEVANCE}='1';
$ENV{AMIGO_CLEVER_MODE}='1';
$ENV{AMIGO_OBSOLETE_BEHAVIOUR}='include_commented';
$ENV{AMIGO_TERM2TERM_METADATA_LOADED}='0';
$ENV{AMIGO_TERM_REGEXP}='GO\:[0-9]{7}';
$ENV{AMIGO_VERBOSE}='1';
$ENV{AMIGO_BETA}='1';


Variable meanings

Below is a list of meanings for some of the more important AmiGO variables that might not be self-evident.

  • GO_ROOT : The location of the local go-dev repository.
  • GO_DBNAME
  • GO_DBHOST
  • GO_DBUSER
  • GO_DBAUTH
  • GO_DBSOCKET
  • GO_HAS_COUNT_BY_SPECIES : Whether or not this was done during the GO db installation process.
  • AMIGO_PROJECT_NAME : This will be added to the end of the next four variables to copy files to the proper location and generate URLs. Useful to change if you want multiple AmiGO installations on the same web server.
  • AMIGO_HTDOCS_PARTIAL_PATH : The path to the root htdocs directory (in Apache terminology).
  • AMIGO_HTDOCS_PARTIAL_URL : The URL the above resolves to.
  • AMIGO_CGI_PARTIAL_PATH : The path to the root cgi-bin directory
  • AMIGO_CGI_PARTIAL_URL : The URL the above resolves to.
  • AMIGO_SHOW_GP_OPTIONS
  • AMIGO_SHOW_GRAPHVIZ
  • AMIGO_DOT_PATH : Location of the dot binary
  • AMIGO_SHOW_BLAST
  • AMIGO_FASTA_DB : The location of the downloaded FASTA file.
  • AMIGO_BLASTP
  • AMIGO_BLASTX
  • AMIGO_BLAST_METHOD
  • AMIGO_QSUB
  • AMIGO_QUEUE
  • AMIGO_PBS_USER
  • AMIGO_MAX_SEQ_NUM
  • AMIGO_MAX_SEQ_LENGTH
  • AMIGO_SHOW_GOOSE_LINKS
  • AMIGO_USE_DEFAULT_AMIGO_FILTERS
  • AMIGO_SHOW_ONT_FILTER
  • AMIGO_SHOW_TAXID_FILTER
  • AMIGO_SHOW_SPECIESDB_FILTER
  • AMIGO_SHOW_EVCODE_FILTER
  • AMIGO_SHOW_GPTYPE_FILTER
  • AMIGO_SHOW_ASSBY_FILTER
  • AMIGO_SHOW_QUAL_FILTER
  • AMIGO_TEMPLATE_PATHS
  • AMIGO_SESSION_DIR
  • AMIGO_MAX_SESSIONS
  • AMIGO_SESSION_TIMEOUT
  • AMIGO_PAGE_SIZE
  • AMIGO_MAX_RESULTS_HTML
  • AMIGO_MAX_RESULTS_DOWNLOAD
  • AMIGO_CALCULATE_GP_COUNTS
  • AMIGO_CALCULATE_TERM_COUNTS
  • AMIGO_GET_RELEVANCE
  • AMIGO_CLEVER_MODE
  • AMIGO_OBSOLETE_BEHAVIOUR
  • AMIGO_TERM2TERM_METADATA_LOADED
  • AMIGO_TERM_REGEXP
  • AMIGO_VERBOSE : Whether or not to output tons of debugging information. Useful for debugging.
  • AMIGO_BETA : Whether or not to make the experimental version of AmiGO visible (used with the install.pl -e option).
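
To make the role of AMIGO_PROJECT_NAME a little more concrete, the sketch below combines it with the four PARTIAL variables from Example file #1. It assumes that the project name is simply appended as one more path component; the installer may combine them differently, and the helper name with_project is invented for this example.

```shell
# Values taken from Example file #1.
AMIGO_PROJECT_NAME='amigo'
AMIGO_HTDOCS_PARTIAL_PATH='/srv/www/htdocs'
AMIGO_HTDOCS_PARTIAL_URL='http://localhost'
AMIGO_CGI_PARTIAL_PATH='/srv/www/cgi-bin'
AMIGO_CGI_PARTIAL_URL='http://localhost/cgi-bin'

# Assumption: the project name is appended as one more path component.
with_project() {
    printf '%s/%s\n' "$1" "$AMIGO_PROJECT_NAME"
}

with_project "$AMIGO_HTDOCS_PARTIAL_PATH"   # prints /srv/www/htdocs/amigo
with_project "$AMIGO_HTDOCS_PARTIAL_URL"    # prints http://localhost/amigo
with_project "$AMIGO_CGI_PARTIAL_PATH"      # prints /srv/www/cgi-bin/amigo
with_project "$AMIGO_CGI_PARTIAL_URL"       # prints http://localhost/cgi-bin/amigo
```

Under this assumption, a second installation with a different AMIGO_PROJECT_NAME would end up in its own directories and URLs next to the first one.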

Example Installation

This assumes that you have already installed the GO database (see above).

Ubuntu 9.04 server

Running simulate.pl

...

Getting the proper libraries

sudo apt-get -u install libcgi-application-perl libcgi-application-plugins-perl libgraphviz-perl graphviz libdbix-class-perl
sudo apt-get -u install bioperl libfreezethaw-perl libgraph-perl
sudo apt-get -u install libxml-xpath-perl libxml-writer-perl libxml-libxml-perl libxml-libxslt-perl
sudo apt-get -u install libtest-www-mechanize-cgiapp-perl g++
sudo apt-get -u install dh-make-perl
dh-make-perl --build --cpan GO::TermFinder
sudo dpkg -i libgo-termfinder-perl_0.7-1_amd64.deb 
cd go-dev/amigo
./install.pl -v

Contacts

If you are still having problems installing the AmiGO software, you can contact the developers directly:

  • Seth at LBNL (sjcarbon) (berkeleybop dot org)
  • Amelia at EBI (aji) (ebi dot ac dot uk)

Good Luck

Good luck!