profile
NAME
profile - format of the profile description file.
DESCRIPTION
Profiles permit the comparison of a profile matrix against a
sequence or series of sequences. The profile matrix specifies
a different score for each letter (column) in each
position in the alignment (row). This differs from scoring
matrices, which assign a single score for a two-letter match
independently of the match's position in the alignment.
The first line can contain a descriptive name of the profile.
The second line contains the length of the profile (that
is, the number of rows in the matrix, which is also the
number of remaining lines in the file), followed by the
alpha and beta values for indel scores; for a further
explanation of these two numbers, see seqaln-intro(1).
Each remaining line is a set of 26 scores, one for each
letter of the alphabet in order from A to Z. This general
format can be used to search for protein, DNA and RNA
sequences by entering scores in the appropriate columns.
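The layout described above can be read with a few lines of
code. The following is a minimal sketch only, not the parser
used by the seqaln programs: the names Profile and
parse_profile are hypothetical, and it assumes that the name
line is present and non-empty, that fields are separated by
whitespace, that blank lines can be skipped, and that scores
may be read as floating-point numbers.

```python
from typing import List, NamedTuple


class Profile(NamedTuple):
    """Hypothetical in-memory form of a profile description file."""
    name: str
    alpha: float
    beta: float
    rows: List[List[float]]  # one row of 26 scores (A..Z) per position


def parse_profile(text: str) -> Profile:
    """Parse the text of a profile file (assumed layout, see lead-in)."""
    # Assumption: blank lines carry no meaning and can be dropped.
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if len(lines) < 2:
        raise ValueError("expected a name line and a header line")

    name = lines[0].strip()

    # Second line: profile length, then alpha and beta.
    header = lines[1].split()
    if len(header) != 3:
        raise ValueError("header must hold length, alpha and beta")
    length = int(header[0])
    alpha, beta = float(header[1]), float(header[2])

    # Remaining lines: exactly `length` rows of 26 scores each.
    body = lines[2:]
    if len(body) != length:
        raise ValueError(f"expected {length} rows, found {len(body)}")

    rows = []
    for number, line in enumerate(body, start=1):
        scores = [float(token) for token in line.split()]
        if len(scores) != 26:
            raise ValueError(f"row {number} has {len(scores)} scores, not 26")
        rows.append(scores)

    return Profile(name, alpha, beta, rows)
```

Checking the row count and the 26-column width up front
catches the most likely editing mistakes in a hand-written
profile before it is passed to an alignment program.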
EXAMPLES
The following is a sample profile for DNA sequence matching
to locate either ACGT or ACGA, with equal preference: the
fourth row gives the same score to both A and T. Typically
the indel values alpha and beta are much larger than the
largest score, to coerce matching for a profile; in the
example below, alpha is 200 and beta is 100, while the
largest score is 10.
DNA profile
4 200 100
10 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 10 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 10 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
10 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 10 0 0 0 0 0 0
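To see why this matrix prefers ACGT and ACGA equally, the
scores can be summed position by position. The sketch below
is an illustration under stated assumptions: it scores an
ungapped window only and ignores alpha and beta, so it is
not the alignment procedure of the programs listed under
SEE ALSO. The helper names make_row and ungapped_score are
hypothetical.

```python
import string

LETTERS = string.ascii_uppercase  # column order A..Z


def make_row(**scores):
    """Build one 26-column row, e.g. make_row(A=10, T=10)."""
    row = [0] * 26
    for letter, value in scores.items():
        row[LETTERS.index(letter)] = value
    return row


# The four rows of the sample profile shown above.
EXAMPLE_ROWS = [
    make_row(A=10),
    make_row(C=10),
    make_row(G=10),
    make_row(A=10, T=10),
]


def ungapped_score(rows, window):
    """Sum the score of each letter at its position (no indels)."""
    if len(window) != len(rows):
        raise ValueError("window length must equal the profile length")
    # LETTERS.index raises ValueError for anything other than A..Z.
    return sum(row[LETTERS.index(ch)] for row, ch in zip(rows, window.upper()))


print(ungapped_score(EXAMPLE_ROWS, "ACGT"))  # 40
print(ungapped_score(EXAMPLE_ROWS, "ACGA"))  # 40
print(ungapped_score(EXAMPLE_ROWS, "ACGC"))  # 30
```

Both target words reach the maximum of 40, while a mismatch
in the last position loses that row's 10 points.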
SEE ALSO
seqaln-intro(1), fitS(1), mfitS(1), fitD(1), mfitD(1),
profile(5), sequence-file(5).