overS
NAME
overS - find overlaps between two sequences.
SYNOPSIS
overS db file2 match mismatch alpha beta [ flags ]
DESCRIPTION
overS finds the best similarity alignment of the overlap
between the sequences in db and the sequence in file2, given
a particular set of scoring parameters.
The scoring parameters are all positive integers. match is
added to the score; mismatch, alpha and beta are subtracted
from it.
match the score for aligning identical letters.
mismatch the amount to subtract for a mismatch.
alpha the amount to subtract for the first letter of an
insertion or deletion sequence (indel).
beta the amount to subtract for each subsequent letter
in an indel. For example, a five-letter indel
(k = 5) subtracts alpha + beta * (k - 1)
= alpha + beta * 4 from the score.
flags See manual page seqaln-intro(1) for a full
description of optional flags.
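
The indel charge described above can be written out as a
short Python sketch. It is only an illustration of the
arithmetic, not code taken from overS; the function names
indel_penalty and alignment_score are hypothetical, and the
numeric values in the usage note are made up.

```python
def indel_penalty(k, alpha, beta):
    """Amount subtracted for one indel of k letters (k >= 1).

    alpha is charged for the first letter, beta for each of the
    remaining k - 1 letters, as in the parameter description.
    """
    if k < 1:
        raise ValueError("an indel has at least one letter")
    return alpha + beta * (k - 1)


def alignment_score(matches, mismatches, indel_lengths,
                    match, mismatch, alpha, beta):
    """Score of an alignment summarised by its counts.

    matches and mismatches are column counts; indel_lengths is a
    list with the length of every indel in the alignment.
    (Hypothetical helper, for illustration only.)
    """
    score = match * matches - mismatch * mismatches
    for k in indel_lengths:
        score -= indel_penalty(k, alpha, beta)
    return score
```

With alpha = 10 and beta = 2, indel_penalty(5, 10, 2) is
10 + 2 * 4 = 18.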
Some flags of particular use with the overlap software are:
-1 Don't report best alignment ending at the end of
the first sequence.
-2 Don't report best alignment ending at the end of
the second sequence.
+1 Report best alignment ending at the end of the
first sequence.
+2 Report best alignment ending at the end of the
second sequence.
+3 Report the highest score of an alignment ending at
the end of both the first and second sequences.
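
One way to read "ending at the end of the first sequence"
and "ending at the end of the second sequence" is as the
last row and last column of a dynamic-programming table for
overlap alignment. The Python sketch below follows that
reading under stated assumptions: leading overhang is free,
indels cost alpha + beta * (k - 1), and the function name
overlap_scores is hypothetical. It is not the overS
implementation and computes scores only, not alignments.

```python
NEG = float("-inf")


def overlap_scores(a, b, match, mismatch, alpha, beta):
    """Return (end_first, end_second, both_ends) overlap scores.

    Assumption: an alignment may start at the beginning of either
    sequence without charge. end_first is the best score of an
    alignment ending at the end of a, end_second at the end of b,
    both_ends at the end of both.
    """
    n, m = len(a), len(b)
    # H[i][j]: best score of an alignment ending at a[:i], b[:j].
    # E[i][j]: same, but ending with a letter of b opposite a gap.
    # F[i][j]: same, but ending with a letter of a opposite a gap.
    H = [[0] * (m + 1) for _ in range(n + 1)]
    E = [[NEG] * (m + 1) for _ in range(n + 1)]
    F = [[NEG] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Open a new indel (alpha) or extend one (beta).
            E[i][j] = max(H[i][j - 1] - alpha, E[i][j - 1] - beta)
            F[i][j] = max(H[i - 1][j] - alpha, F[i - 1][j] - beta)
            pair = match if a[i - 1] == b[j - 1] else -mismatch
            H[i][j] = max(H[i - 1][j - 1] + pair, E[i][j], F[i][j])
    end_first = max(H[n])                             # last row
    end_second = max(H[i][m] for i in range(n + 1))   # last column
    return end_first, end_second, H[n][m]
```

For example, overlap_scores("ACGTACGT", "ACGTTTTT", 1, 1,
2, 1) scores the overlap in which the last four letters of
the first sequence match the first four of the second.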
The sequence files db and file2 use our standard format, the
Pearson/FASTA format. The first line is the sequence name,
and should be used as a description. Subsequent lines
contain the sequence itself, which may include blanks,
returns, and other whitespace for readability. A sequence
ends at end-of-file or at the next `>', which begins a new
sequence in the FASTA format. Multiple sequences are
processed only in the first file, db.
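
The file layout described above can be illustrated with a
minimal Python reader. This is a sketch, not the parser used
by overS; the name read_fasta is hypothetical, and it
assumes that each name line starts with `>' and that any
text before the first `>' is ignored.

```python
def read_fasta(lines):
    """Parse an iterable of text lines into (name, sequence) pairs.

    A line starting with '>' begins a new sequence and supplies
    its name. Whitespace inside sequence lines is dropped, and the
    last sequence ends at end of input.
    """
    records = []
    name, chunks = None, []
    for line in lines:
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif name is not None:
            # split() with no argument removes blanks, tabs, newlines.
            chunks.append("".join(line.split()))
    if name is not None:
        records.append((name, "".join(chunks)))
    return records
```

An open file can be passed directly, for example
read_fasta(open(path)), where path names a FASTA file.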
REFERENCES
M.S. Waterman. Introduction to Computational Biology: Maps,
Sequences and Genomes. Chapman & Hall, London, 1995. ISBN
0-412-99391-0.
SEE ALSO
seqaln-intro(1), moverS(1), sequence-file(5).