Interfacing with Swiss-Prot

 

1)      Download sprot40.dat (or the newest release) from the ExPASy server.

2)      Run the buildindex.pl script to build a protein.dat file.

3)      Run the extract.pl script to extract the sequences of all proteins matching a regular expression into a directory.
usage: ./extract.pl regexp (directory)

4)      Calculate distances from the sequence files matching a file glob: ./distance.pl fileglob

 

Warning: typing something like ./extract.pl * can use up a disk quota rather quickly.
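
One way to guard against this is to count how many entries a pattern would hit before extracting anything. The sketch below is a hypothetical helper, not part of the scripts above. It relies only on the Swiss-Prot flat-file layout, where each entry has ID and DE lines and ends with a line starting with "//". It assumes extract.pl matches on comparable fields, which should be checked against the script itself.

```shell
# Hypothetical helper (not one of the scripts above): count the Swiss-Prot
# entries whose ID or DE lines match an extended regular expression.
# Assumption: extract.pl matches on similar fields; check the script itself.
count_matches() {
    # $1 = regular expression, $2 = Swiss-Prot flat file (e.g. sprot40.dat)
    # The pattern is passed through the environment so awk does not apply
    # -v style escape processing to it.
    RE="$1" awk '
        /^(ID|DE) / && $0 ~ ENVIRON["RE"] { hit = 1 }  # match inside this entry
        /^\/\//  { if (hit) n++; hit = 0 }             # "//" ends each entry
        END      { print n + 0 }                       # prints 0 if nothing matched
    ' "$2"
}
```

For example, count_matches 'KINASE' sprot40.dat prints the number of matching entries, which gives a rough idea of how many sequences an extraction with the same pattern might produce.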