HP UX Archive Centre

September 15, 1991

Source code for the Basic Local Alignment Search Tool (BLAST) family
of sequence database comparison programs, along with some support utilities
and (new) awk scripts, is posted here.  A previous major distribution is
archived in its entirety beneath the "pub/blast.old" directory.

Additional source code is necessary to compile and link the BLAST programs:
the pre-release "ncbi", "gish", and "dfa" libraries.  Source code for these
libraries is located at the same level in the directory hierarchy as the
"blast" distribution (currently in /pub/ncbi, /pub/gish, and /pub/dfa).

The file blast.tar.Z is an L-Z compressed UNIX(R) tar archive containing
all of the files splayed beneath the "explode" subdirectory.  FTP this
file to your local machine in binary mode, uncompress it, then untar it.
The blast.tar file is the same archive in uncompressed form.  VMS compress
and tar utilities are posted on this machine in the toolbox/vms_util directory.
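
As a rough illustration of the untar step, the following Python sketch
unpacks the uncompressed blast.tar into a directory.  It is not part of the
distribution; the function name unpack_tar and the destination directory are
hypothetical.  Python's standard tarfile module does not read L-Z (.Z)
archives, so blast.tar.Z would first have to be run through uncompress.

```python
import tarfile
from pathlib import Path


def unpack_tar(archive, dest):
    """Extract an uncompressed tar archive into dest; return member names."""
    dest_path = Path(dest)
    dest_path.mkdir(parents=True, exist_ok=True)
    # Mode "r:" accepts only uncompressed tar files, which matches blast.tar.
    with tarfile.open(archive, "r:") as tar:
        names = tar.getnames()
        # Only extract archives obtained from a source you trust.
        tar.extractall(dest_path)
    return names


# Hypothetical usage, assuming blast.tar was fetched in binary mode:
#   for name in unpack_tar("blast.tar", "blast_src"):
#       print(name)
```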


For installation instructions, see the INSTALL file.

Send bug reports or requests for electronic mail distribution to:

  Dr. Warren Gish, gish@ncbi.nlm.nih.gov
		or
  Dr. Stephen Altschul, altschul@ncbi.nlm.nih.gov

  National Center for Biotechnology Information
  National Library of Medicine
  Bldg. 38A Rm 8N-806
  8600 Rockville Pike
  Bethesda, MD 20894-0001
  (301) 496-2475


The people who played a role in bringing this fine software to you:

  Samuel Karlin, Dept. of Mathematics, Stanford Univ., Stanford, CA 94305
  Stephen Altschul, NCBI, NLM, Bethesda, MD 20894
  Webb Miller, Dept. of CS, Penn. State Univ., University Park, PA 16802
  Gene Myers, Dept. of CS, Univ. of Arizona, Tucson, AZ 85721
  Warren Gish, NCBI, NLM, Bethesda, MD 20894
  David Lipman, NCBI, NLM, Bethesda, MD 20894


Brief descriptions of the programs:

blastp:  compare an amino acid query sequence against a protein sequence
database.

blastn:  compare a nucleotide query sequence against a nucleotide sequence
database.

blastx:  compare a nucleotide query sequence translated in all 6 reading
frames (3 on each strand) against a protein sequence database.

tblastn:  compare an amino acid query sequence against a nucleotide sequence
database translated in all 6 reading frames.

blast3:  compare an amino acid query sequence against a protein sequence
database to identify statistically significant 3-way sequence alignments
(the query sequence plus two database sequences) in which the component
pairwise alignments are statistically insignificant.

setdb:  produce a protein sequence database for use by blastp, blastx,
and blast3 from a multi-sequence file in FASTA format.

pressdb:  produce a nucleotide sequence database for use by blastn and
tblastn from a multi-sequence file in FASTA format.

pam:  generate a PAM matrix of any desired distance (from 2 to 511) and scale.

pir2fasta:  produce a file in FASTA format from one in NBRF PIR(R) format.

gb2fasta:  produce a file in FASTA format from one in GenBank(R) format.

sp2fasta:  produce a file in FASTA format from one in SWISS-PROT(R) format.

memfile:  manage the loading, updating, and dropping of files mapped into
shared memory segments.
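
The "6 reading frames (3 on each strand)" mentioned for blastx and tblastn
can be pictured with the short Python sketch below.  It is an illustration
only, not code from the BLAST programs: it assumes the standard genetic code,
writes any codon containing a non-ACGT character as X, and uses hypothetical
names (six_frame_translations and its helpers).  How the BLAST programs
themselves treat ambiguous bases or alternative genetic codes is not stated
in this file.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons from position 0; unknown codons become X."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X")
        for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq):
    """Map frame labels (+1..+3, -1..-3) to translated protein strings."""
    seq = seq.upper()
    rev = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames["+%d" % (offset + 1)] = translate(seq[offset:])
        frames["-%d" % (offset + 1)] = translate(rev[offset:])
    return frames
```

Frames +1 to +3 read the query as given, starting at its first, second, and
third base; frames -1 to -3 do the same on the reverse complement.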


Other files:

    blast.1:  UNIX style manual page (using nroff's -man macros) describing
    blastp, blastn, blastx, and tblastn.

    blast.1ps:  PostScript(R) version of blast.1

    blast3.1:  UNIX style manual page describing blast3.

    blast3.1ps:  PostScript version of blast3.1

    pir2fasta.nawk:  a nawk (new awk) script for converting NBRF PIR files
    into FASTA format.

    gb2fasta.nawk:  a nawk (new awk) script for converting GenBank files
    into FASTA format.

    dxch.aa:  a sample protein sequence from the PIR database, in FASTA
    format.
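
For readers unfamiliar with FASTA format, the Python sketch below reads a
multi-sequence FASTA file of the kind that setdb and pressdb take as input.
It is a minimal illustration under common conventions (each record starts
with a ">" header line, followed by one or more sequence lines); the function
name parse_fasta is hypothetical, and the parsing done by the BLAST utilities
themselves may differ in detail.

```python
def parse_fasta(lines):
    """Return a list of (header, sequence) pairs from FASTA-formatted lines."""
    records = []
    header = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            # Sequence data may be wrapped over several lines.
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Hypothetical usage with the sample file from this distribution:
#   with open("dxch.aa") as handle:
#       for header, sequence in parse_fasta(handle):
#           print(header, len(sequence))
```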

Modification history:
    This information has been moved into the file named HISTORY.