AmiGO help GOst

From GO Wiki

GOst: The Gene Ontology BLAST server

BLAST is a search algorithm designed to find regions of similarity between sequences (Altschul et al., 1990). An online guide to BLAST searching is available from the NCBI: BLAST Help Manual [1]. The Gene Ontology BLAST server runs WU-BLAST; technical information may be found at http://blast.wustl.edu/.


GOst, the Gene Ontology BLAST server, searches your query against the sequences in the GO protein sequence database that are annotated with GO terms. Protein queries are searched using BLASTP, while nucleotide queries are searched using BLASTX, which translates the query in all six reading frames before comparing it with the protein database. There is no need to specify which program to use, but all query sequences in a submission must be of the same type.
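The program selection described above can be sketched in Python. This is only an illustration: how GOst actually detects the query type is not documented here, so the letter-frequency heuristic, the 0.9 cutoff, and the function names `guess_sequence_type` and `choose_program` are all assumptions.

```python
# Hypothetical sketch: GOst's real detection logic is not documented on this page.
NUCLEOTIDE_LETTERS = set("ACGTUN")


def guess_sequence_type(sequence: str, threshold: float = 0.9) -> str:
    """Guess "nucleotide" or "protein" from the fraction of A/C/G/T/U/N letters.

    The 0.9 cutoff is an assumed value chosen for illustration.
    """
    letters = [ch for ch in sequence.upper() if ch.isalpha()]
    if not letters:
        raise ValueError("sequence contains no letters")
    nucleotide_count = sum(ch in NUCLEOTIDE_LETTERS for ch in letters)
    if nucleotide_count / len(letters) >= threshold:
        return "nucleotide"
    return "protein"


def choose_program(sequences: list[str]) -> str:
    """Pick BLASTP or BLASTX, rejecting submissions that mix sequence types."""
    if not sequences:
        raise ValueError("no query sequences supplied")
    types = {guess_sequence_type(seq) for seq in sequences}
    if len(types) != 1:
        raise ValueError("all query sequences must be of the same type")
    return "BLASTX" if types == {"nucleotide"} else "BLASTP"
```

A short protein fragment such as `"MKWVTFISLLLLFSSAYS"` would be routed to BLASTP, while a submission that mixes it with `"ATGCGTACGTTAGC"` would be rejected.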


Using GOst

- Click on the link "Query by sequence". The Gene Ontology BLAST server page opens.

Here, there are three options to submit a query:

1. Paste a UniProt accession ID, for example P55269.

2. You can also paste up to 100 sequences in the "Paste in a FASTA sequence(s)" window. The total number of residues cannot exceed 3 million. If you paste more than one sequence, each sequence will return a separate results page. You can browse through the results by clicking on the page numbers at the top of the page, or view all the results on a single page by clicking on 'All'.

3. It is possible to upload a file containing sequences in FASTA format. The same limits apply: up to 100 sequences and 3 million residues.
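Before pasting or uploading a large batch, it can help to check it against the limits above. The following is a minimal sketch, not GOst's own validation code: the function names `parse_fasta` and `check_submission` are hypothetical, and the parser assumes every sequence is preceded by a `>` header line.

```python
# Limits taken from the text above; everything else is an illustrative assumption.
MAX_SEQUENCES = 100
MAX_TOTAL_RESIDUES = 3_000_000


def parse_fasta(text: str) -> list[tuple[str, str]]:
    """Split FASTA text into (header, sequence) pairs."""
    records: list[tuple[str, str]] = []
    header = None
    chunks: list[str] = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:].strip()
            chunks = []
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def check_submission(text: str) -> list[tuple[str, str]]:
    """Return the parsed records, or raise ValueError if a limit is exceeded."""
    records = parse_fasta(text)
    if not records:
        raise ValueError("no FASTA records found")
    if len(records) > MAX_SEQUENCES:
        raise ValueError(f"{len(records)} sequences exceeds the limit of {MAX_SEQUENCES}")
    total = sum(len(seq) for _, seq in records)
    if total > MAX_TOTAL_RESIDUES:
        raise ValueError(f"{total} residues exceeds the limit of {MAX_TOTAL_RESIDUES}")
    return records
```

Calling `check_submission` on the contents of a FASTA file returns the parsed records when the batch is within both limits and raises `ValueError` otherwise.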


BLAST parameters

- Expect threshold: This is the maximum expect value (E-value) a hit may have and still be returned; it sets the statistical significance threshold for reporting matches against database sequences. If the E-value ascribed to a match is greater than the EXPECT threshold, the match will not be reported. Lower EXPECT thresholds are more stringent, leading to fewer chance matches being reported (Source: http://www.ncbi.nlm.nih.gov/blast/blast_help.shtml).


- Number of best hits: Select the number of target sequences to display in the results. Choosing fewer sequences produces results faster.

- BLAST filter: Filtering is ON by default and filters the query sequence for low-complexity regions. In a protein search, low-complexity regions appear as X's in the alignment, while in a nucleotide search they appear as N's. The score and E-value of a match may be affected by filtering, since it effectively shortens the query length.
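The combined effect of the expect threshold and the number of best hits can be illustrated with a small Python sketch. It is not how the server is implemented: the `Hit` record, the `select_hits` function, and the example target names, scores, and E-values are all made up for illustration.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """Hypothetical record for one database match."""
    target_id: str
    score: float
    e_value: float


def select_hits(hits: list[Hit], expect_threshold: float, max_hits: int) -> list[Hit]:
    """Drop hits whose E-value exceeds the threshold, then keep the best max_hits."""
    # A match is reported only if its E-value is not greater than the threshold.
    kept = [hit for hit in hits if hit.e_value <= expect_threshold]
    # Lower E-values are more significant, so they come first.
    kept.sort(key=lambda hit: hit.e_value)
    return kept[:max_hits]


# Made-up example data, not real search results.
example_hits = [
    Hit("target_A", 250.0, 1e-30),
    Hit("target_B", 40.0, 0.5),
    Hit("target_C", 120.0, 1e-10),
    Hit("target_D", 30.0, 5.0),
]
best = select_hits(example_hits, expect_threshold=1.0, max_hits=2)
```

With a threshold of 1.0, `target_D` is dropped for exceeding it, and limiting the output to two hits then keeps only the two matches with the lowest E-values.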