Sequence alignment plays a significant role in many bioinformatics
problems, primarily as a means of detecting and quantifying
evolutionary similarity. In the special case where an amino acid
sequence is aligned to a protein structure, we often use the term
threading, which suggests pulling a flexible string through the
backbone of a 3-dimensional template.
Most programs currently treat the query sequence, the template, or
both, as a position-specific scoring matrix (PSSM). A PSSM can be
computed from a multiple sequence alignment (MSA). However, because
converting an MSA to a PSSM discards information (many different MSAs
yield the same PSSM), MSThread works with the query MSA directly
rather than reducing it to a PSSM.
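The information loss can be illustrated with a small sketch. It is not
taken from MSThread: the function name column_profile and the two toy
alignments are hypothetical, and plain per-column residue frequencies
stand in for a real PSSM, which would normally also apply pseudocounts
and log-odds scaling. Since each column is summarized independently,
the profile no longer records which residues occur together in the
same sequence.

```python
from collections import Counter


def column_profile(msa):
    """Per-column residue frequencies for an MSA given as equal-length strings.

    Simplified stand-in for a PSSM (assumption: no pseudocounts, no log-odds).
    """
    if len({len(row) for row in msa}) != 1:
        raise ValueError("all rows must have the same length")
    n = len(msa)
    profile = []
    for column in zip(*msa):  # iterate over alignment columns
        counts = Counter(column)
        profile.append({res: c / n for res, c in counts.items()})
    return profile


# Two toy alignments (hypothetical data) that pair the residues differently.
msa_a = ["AL", "GV"]
msa_b = ["AV", "GL"]

same_profile = column_profile(msa_a) == column_profile(msa_b)
same_msa = sorted(msa_a) == sorted(msa_b)
print(same_profile, same_msa)  # prints: True False
```

Both alignments have A and G in the first column and L and V in the
second, so their profiles are identical even though the sequences
differ.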
The query MSA is prepared using the program MAFFT.
Each template MSA is pre-computed using the program MAFFTash.
A manuscript describing the MSThread method is currently in
preparation (A. Han et al.), but the source code for the core
algorithm can be downloaded here.