Alignment of Multiple Sequences and Structures
MAFFTash is a server that calculates multiple sequence alignments from sequences and structures. It combines two existing programs, MAFFT and ASH. ASH is a structural alignment program that uses an extension of the double dynamic programming algorithm to maximize the number of structurally equivalent residues between two proteins [1-3]. The pairwise structural alignments are then passed to MAFFT, a widely used multiple sequence alignment program [4-9]. MAFFT uses them to construct an overall multiple alignment that is as consistent as possible with the pairwise structural alignments. Sequence homologs with no structural information can also be included in the alignment.
To run MAFFTash you must provide a list of sequences and/or PDB and chain identifiers. The list may be pasted into the text area or uploaded from an external file. In either case, the sequences must be input in FASTA format, and each PDB identifier and chain identifier must be joined into a single string of length 5 (e.g. 1nagA). Each PDB and chain identifier line must be preceded by a line containing the string >PDBID and nothing else. For example:
>PDBID
3ygsC
>Q6Q899|DDX58_MOUSE| 1-91
MTAAQRQNLQAFRDYIKKILDPTYILSYMSSWLEDEEVQYIQAEKNNKGPMEAASLFLQY
LLKLQSEGWFQAFLDALYHAGYCGLCEAIES
>Q6Q899|DDX58_MOUSE| 101-176
EEHRLLLRRLEPEFKATVDPNDILSELSECLINQECEEIRQIRDTKGRMAGAEKMAECLI
RSDKENWPKVLQLALE
>PDBID
2p1hA
is valid input. Note that chain identifiers are now mandatory for all PDB entries. Whitespace (' '), dashes ('-'), and underscores ('_') are not acceptable chain identifiers. If you are uncertain about which chain IDs to use, please use the PDBj search engine. Type in your PDB ID, then click on 'sequence information (FASTA format)'. You will see the PDB sequence for each chain in FASTA format.
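The input rules above can be checked before submission with a short script. The following Python sketch is illustrative only and is not part of MAFFTash: the function names is_valid_pdb_chain and build_input are hypothetical, and it assumes that a PDB ID is four alphanumeric characters and that sequence lines are wrapped at 60 characters (the server may accept other layouts).

```python
import re

# Assumption: 4 alphanumeric characters for the PDB ID, followed by one
# chain character that is not whitespace, '-' or '_' (5 characters total).
_PDB_CHAIN = re.compile(r"[0-9A-Za-z]{4}[^\s\-_]")


def is_valid_pdb_chain(identifier):
    """Return True if identifier looks like a joined PDB + chain ID, e.g. 1nagA."""
    return _PDB_CHAIN.fullmatch(identifier) is not None


def build_input(entries, width=60):
    """Build input text from a list of entries.

    Each entry is either ("pdb", "1nagA") or ("seq", header, sequence).
    Both the tuple layout and the wrap width are choices made for this sketch.
    """
    lines = []
    for entry in entries:
        kind = entry[0]
        if kind == "pdb":
            identifier = entry[1]
            if not is_valid_pdb_chain(identifier):
                raise ValueError("bad PDB/chain identifier: %r" % identifier)
            # The identifier line is preceded by a line holding only >PDBID.
            lines.append(">PDBID")
            lines.append(identifier)
        elif kind == "seq":
            _, header, sequence = entry
            lines.append(">" + header)
            # Wrap the sequence into fixed-width FASTA lines.
            for start in range(0, len(sequence), width):
                lines.append(sequence[start:start + width])
        else:
            raise ValueError("unknown entry type: %r" % kind)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_input([
        ("pdb", "3ygsC"),
        ("seq", "Q6Q899|DDX58_MOUSE| 101-176",
         "EEHRLLLRRLEPEFKATVDPNDILSELSECLINQECEEIRQIRDTKGRMAGAEKMAECLI"
         "RSDKENWPKVLQLALE"),
        ("pdb", "2p1hA"),
    ]), end="")
```

The resulting text can be pasted into the text area or saved to a file for upload. The check is purely syntactic: it does not confirm that the PDB entry or the chain actually exists.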
You are not limited to PDB entries and may provide your own PDB-formatted structures. To upload your own structures, click on the 'Add your own structures' checkbox and upload a PDB-formatted file. The Structure weight (default value 0.5) controls how much influence ASH has on the MAFFT alignment. You may need to experiment with different values, depending on the ratio of structures to sequences.