InterProScan is a software tool that scans your sequence against a range
of protein signatures. These signatures mainly
represent protein families, functional domains, or
active sites, and so can be used to infer the potential function
of your protein. Many of the protein signatures have been
integrated into the <a href="http://www.ebi.ac.uk/InterPro/">InterPro database</a>
and therefore have GO terms associated with them. In this way,
you can use InterProScan to quickly associate functional information
and GO terms with your protein of interest.
Tutorial on <a href="http://www.ebi.ac.uk/2can/tutorials/function/InterProScan.html">InterProScan</a>
<a href="http://www.ebi.ac.uk/2can/tutorials/function/InterProScan2.html">Schematic diagram</a> showing what happens to sequences entered into InterProScan.
Where to find InterProScan
InterProScan can be accessed in three different ways:
1) Via <a href="http://www.ebi.ac.uk/InterProScan/">EBI website</a>
2) Via <a href="http://www.ebi.ac.uk/Tools/webservices/WSInterProScan.html">EBI web services</a>
3) By <a href="ftp://ftp.ebi.ac.uk/pub/software/unix/iprscan/index.html">downloading</a> the program to your local servers and running it.
If you only have a small number of sequences to characterize, it's probably easiest to run them via the EBI website. You will get your results in graphical, tabular and XML formats, all of which contain links to InterPro and GO. Because the service is so popular, we have unfortunately had to restrict submissions to one sequence at a time.
Alternatively, if you know how to program with Perl, you can use EBI's InterProScan web service to submit multiple searches in parallel (up to 20 sequences at once). All you need to do is install the appropriate client from the page listed in point 2) above and you can submit your sequences. Again, you can get your results back in a variety of formats, all with links to GO terms and InterPro.
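The 20-sequence limit mentioned above means a larger FASTA file has to be split into groups before submission. The following is only a minimal sketch in Python (rather than Perl) of that splitting step; the names `read_fasta`, `batches` and `MAX_BATCH` are made up for illustration and are not part of any EBI client. It does not contact the web service: each batch would still need to be handed to whichever client you installed from the page in point 2).

```python
from typing import Iterator, List, Tuple

# Assumption: the limit of 20 sequences per submission quoted in the text above.
MAX_BATCH = 20

Record = Tuple[str, str]  # (header without ">", sequence)


def read_fasta(text: str) -> List[Record]:
    """Parse FASTA-formatted text into (header, sequence) pairs."""
    records: List[Record] = []
    header = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)  # sequence may span several lines
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def batches(records: List[Record], size: int = MAX_BATCH) -> Iterator[List[Record]]:
    """Yield consecutive groups of at most `size` records."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


if __name__ == "__main__":
    fasta = "\n".join(f">seq{i}\nMKTAYIAKQR" for i in range(45))
    for number, batch in enumerate(batches(read_fasta(fasta)), start=1):
        # Hand `batch` to your installed client here.
        print(f"batch {number}: {len(batch)} sequences")
```

With 45 input sequences this produces three batches, so no single submission exceeds the limit.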
Most people download the stand-alone version of the program if they have large numbers of sequences to characterize, as there is no real limit to the number of sequences you can search at any one time. However, the program is very computationally expensive and should only be installed if you are sure that your hardware setup is sufficient to cope with the demands of the software. The stand-alone version can be run either from the Unix command line or through a web interface that you install alongside it.
Both the EBI-based services and the stand-alone version come with extensive documentation to help you use them.
Protein or Nucleotide Sequence
In all three cases, you can start with either protein or nucleotide
sequences. If you start with a nucleotide sequence, it can be translated
in multiple reading frames and the resulting ORFs characterized instead. You can
choose from a variety of output formats to display the results.
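To illustrate what translating a nucleotide sequence in multiple frames involves, here is a rough Python sketch of a six-frame translation using the standard genetic code. It is not InterProScan's own translation step, whose details are not described here. The function names, the frame labels (`+1` to `-3`), the `min_len` parameter, and the choice to treat any stretch between stop codons as an ORF are all assumptions made for this example.

```python
from itertools import product
from typing import Dict, List

# Standard genetic code, codons enumerated in TCAG order; "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons; codons with unknown bases become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frames(dna: str) -> Dict[str, str]:
    """Translate both strands at offsets 0, 1 and 2."""
    dna = dna.upper()
    frames: Dict[str, str] = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


def orfs(protein: str, min_len: int = 1) -> List[str]:
    """Split a translation at stop codons and keep segments of at least min_len."""
    return [segment for segment in protein.split("*") if len(segment) >= min_len]


if __name__ == "__main__":
    for name, protein in six_frames("ATGGCCTAAATGAAA").items():
        print(name, protein, orfs(protein))
```

In practice a minimum ORF length well above 1 would be used, so that short fragments between stop codons are not treated as candidate proteins.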