profile



NAME

     profile - format of the profile description file.


DESCRIPTION

     Profiles permit the comparison of a profile matrix against a
     sequence  or series of sequences.  The profile matrix speci-
     fies a different score for  each  letter  (column)  in  each
     position  in the alignment (row).  This differs from scoring
     matrices, which assign a single score for a two-letter match
     independently of the match's position in the alignment.

     The first line can contain a descriptive name  of  the  pro-
     file.

     The second line contains the length of the profile (that is,
     the number of rows in the matrix, as well as the number of
     remaining lines in the file), followed by the alpha and beta
     values for indel scores; for a further explanation of these
     two numbers, see seqaln-intro(1).

     Each of the remaining lines is a set of 26 scores, one for
     each letter of the alphabet in order from A to Z.  This
     general format can be used to search for protein, DNA and
     RNA sequences by entering scores in the appropriate columns.
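
     The layout described above can be read with a few lines of
     code.  The following is a minimal sketch, not part of the
     package: the function name parse_profile is hypothetical, and
     it assumes that fields are separated by whitespace, that the
     name line is always present, and that scores may be read as
     floating-point numbers.

```python
import string

ALPHABET = string.ascii_uppercase  # assumed column order: A through Z


def parse_profile(text):
    """Parse profile text into (name, alpha, beta, rows).

    rows is a list with one dict per profile position, mapping each
    letter A..Z to its score at that position.
    """
    lines = text.splitlines()
    if len(lines) < 2:
        raise ValueError("profile needs a name line and a header line")

    # Line 1: descriptive name (assumed to be present).
    name = lines[0].strip()

    # Line 2: length, alpha, beta.
    header = lines[1].split()
    if len(header) != 3:
        raise ValueError("header line must hold length, alpha and beta")
    length = int(header[0])
    alpha = float(header[1])
    beta = float(header[2])

    # Remaining lines: one row of 26 scores per profile position.
    body = lines[2:2 + length]
    if len(body) != length:
        raise ValueError("expected %d score lines, found %d" % (length, len(body)))

    rows = []
    for number, line in enumerate(body, start=1):
        scores = [float(token) for token in line.split()]
        if len(scores) != len(ALPHABET):
            raise ValueError("row %d has %d scores, expected 26" % (number, len(scores)))
        rows.append(dict(zip(ALPHABET, scores)))

    return name, alpha, beta, rows
```

     Each row is returned as a mapping from letter to score, so
     that a lookup such as rows[3]["T"] reads the score for T at
     the fourth position.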



EXAMPLES

     The following is a sample profile for DNA sequence matching
     that locates either ACGT or ACGA, with equal preference: the
     fourth row gives the same score to A and to T.  Typically
     the indel values alpha and beta are much larger than the
     largest score, which discourages gaps and coerces a match to
     the whole profile; in the example below, alpha is 200 and
     beta is 100.

          DNA profile
          4 200 100
          10 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
          0 0 10 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
          0 0 0 0 0 0 10 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
          10 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 10 0 0 0 0 0 0
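
     To illustrate how the rows of this profile are read, the
     sketch below scores gapless windows of a sequence against
     it.  This is a deliberate simplification: the programs listed
     under SEE ALSO may also use the alpha and beta indel scores,
     which are ignored here.  The names make_row, window_score and
     best_windows are hypothetical, and columns are assumed to run
     from A to Z.

```python
import string

ALPHABET = string.ascii_uppercase  # assumed column order: A through Z


def make_row(scores):
    """Expand {letter: score} into a 26-column row, with 0 elsewhere."""
    return [scores.get(letter, 0) for letter in ALPHABET]


# The sample profile: A, then C, then G, then either A or T.
PROFILE = [
    make_row({"A": 10}),
    make_row({"C": 10}),
    make_row({"G": 10}),
    make_row({"A": 10, "T": 10}),
]


def window_score(profile, window):
    """Sum the per-position scores of a window as long as the profile."""
    return sum(row[ALPHABET.index(letter)]
               for row, letter in zip(profile, window.upper()))


def best_windows(profile, sequence):
    """Return (best score, start offsets of all windows reaching it)."""
    width = len(profile)
    if len(sequence) < width:
        raise ValueError("sequence is shorter than the profile")
    scored = [(window_score(profile, sequence[start:start + width]), start)
              for start in range(len(sequence) - width + 1)]
    best = max(score for score, _ in scored)
    return best, [start for score, start in scored if score == best]


if __name__ == "__main__":
    for word in ("ACGT", "ACGA", "ACGC"):
        print(word, window_score(PROFILE, word))
    print(best_windows(PROFILE, "TTACGTGGACGATT"))
```

     With these rows, ACGT and ACGA both score 40 while ACGC
     scores 30, which is the "equal preference" described above.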


SEE ALSO

     seqaln-intro(1), fitS(1), mfitS(1), fitD(1), mfitD(1),
     profile(5), sequence-file(5).