Sequence-Alignment

Multiple Sequence Alignment

Align sequences given in a multi-FASTA file.

Before constructing phylogenetic (evolutionary) trees, sequences need to be re-arranged so that they match each other as closely as possible, for example by inserting gaps. Several tools exist for sequence alignment, including MAFFT and MUSCLE.

If you have more than 200 sequences, try → PASTA or → UPP. If your sequences are fragmented, try → UPP.

For short-read alignment, see → bowtie2 or → BWA.
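The 200-sequence rule of thumb above can be checked quickly from the command line. The following is a minimal sketch, not a definitive recipe: the function names count_seqs and suggest_aligner and the demo sequences are made up for illustration, and the threshold is simply the one stated above.

```shell
#!/bin/sh
# Count FASTA records: each record starts with a '>' header line.
count_seqs() {
    grep -c '^>' "$1"
}

# Suggest a tool using the 200-sequence rule of thumb from the text above.
suggest_aligner() {
    n=$(count_seqs "$1")
    if [ "$n" -gt 200 ]; then
        echo "$n sequences: try PASTA or UPP"
    else
        echo "$n sequences: try MAFFT or MUSCLE"
    fi
}

# Hypothetical three-record demo file so the sketch runs on its own.
demo=$(mktemp)
printf '>seq1\nACGTACGT\n>seq2\nACGTTCGT\n>seq3\nACGAACGT\n' > "$demo"
suggest_aligner "$demo"
```

To use it on real data, call suggest_aligner sequences.fasta instead of the demo file.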

MAFFT

Download and Install

sudo apt-get update

sudo apt-get install mafft

http://mafft.cbrc.jp/alignment/software/

Run mafft

mafft --localpair --maxiterate 1000 sequences.fasta > aligned_sequences.aln

http://mafft.cbrc.jp/alignment/software/algorithms/algorithms.html

--maxiterate 1000 for fewer than 200 sequences

--globalpair (instead of --localpair) for sequences of similar lengths
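After a run such as the mafft command above, a quick sanity check is that every row of the alignment has the same length once gaps are counted. The sketch below assumes the output is in FASTA format (the MAFFT default); the function name aligned_lengths and the demo alignment are hypothetical.

```shell
#!/bin/sh
# Print the distinct row lengths found in an aligned FASTA file.
# Sequences wrapped over several lines are summed per record.
aligned_lengths() {
    awk '/^>/ { if (name != "") print len; name = $0; len = 0; next }
         { len += length($0) }
         END { if (name != "") print len }' "$1" | sort -u
}

# Hypothetical demo alignment; seq1 is wrapped over two lines on purpose.
aln=$(mktemp)
printf '>seq1\nACGT-\nACGT\n>seq2\nACGTTACGT\n>seq3\nACG--ACGT\n' > "$aln"

# Word splitting is intended: one word per distinct length.
set -- $(aligned_lengths "$aln")
if [ "$#" -eq 1 ]; then
    echo "OK: all rows have length $1"
else
    echo "WARNING: rows have different lengths: $*"
fi
```

Point aligned_lengths at aligned_sequences.aln to check a real result. More than one distinct length usually means the file is not a finished alignment.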

MAFFT online web-tool

http://mafft.cbrc.jp/alignment/server/

MUSCLE

MUltiple Sequence Comparison by Log-Expectation

http://www.drive5.com/muscle/

Download and install on Ubuntu

wget http://www.drive5.com/muscle/downloads3.8.31/muscle3.8.31_i86linux64.tar.gz

tar -zxvf muscle3.8.31_i86linux64.tar.gz

mv muscle3.8.31_i86linux64 muscle

mv muscle tools/ # move to your tools/ directory

http://www.drive5.com/muscle/downloads.htm
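Because the binary was moved into tools/ rather than a system directory, the shell will only find muscle if that directory is on the PATH. A minimal sketch, assuming tools/ sits in the current working directory (adjust TOOLS_DIR if yours lives elsewhere):

```shell
#!/bin/sh
# Assumption: the muscle binary was moved to ./tools as in the steps above.
TOOLS_DIR="$PWD/tools"
export PATH="$TOOLS_DIR:$PATH"

# Report whether the shell can now resolve the muscle command.
if command -v muscle >/dev/null 2>&1; then
    echo "muscle found at: $(command -v muscle)"
else
    echo "muscle not found; check that $TOOLS_DIR/muscle exists and is executable (chmod +x)"
fi
```

The export only lasts for the current shell session. Adding the same line to a shell start-up file such as ~/.bashrc is one way to make it permanent.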

Run muscle

muscle -in sequences.fasta -out aligned_sequences.aln

MUSCLE online web-tool

http://www.ebi.ac.uk/Tools/msa/muscle/

Clustal Omega

Multiple sequence alignment tool, designed primarily for protein sequences (recent versions also accept DNA/RNA)

http://www.clustal.org/omega/

Install on Ubuntu

sudo apt-get update

sudo apt-get install clustalo

Example (command line)

clustalo -i sequences.fasta -o aligned_sequences.fasta --auto -v

--auto set alignment options automatically

-v print "verbose" progress info

# see more options

clustalo --help
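Clustal Omega writes FASTA by default. The wrapper below is a rough sketch of how the example could be extended to write Clustal format, use more than one thread and overwrite an existing output file. The function name run_clustalo and the thread count are illustrative choices; the flags --outfmt, --threads and --force are listed by clustalo --help.

```shell
#!/bin/sh
# Run Clustal Omega only if the program and the input file are available.
run_clustalo() {
    in=$1
    out=$2
    if ! command -v clustalo >/dev/null 2>&1; then
        echo "clustalo not found on PATH" >&2
        return 1
    fi
    if [ ! -f "$in" ]; then
        echo "input file not found: $in" >&2
        return 1
    fi
    # --outfmt=clu  write Clustal format instead of FASTA
    # --threads=2   number of threads (illustrative value)
    # --force       overwrite the output file if it already exists
    clustalo -i "$in" -o "$out" --outfmt=clu --threads=2 --force -v
}

run_clustalo sequences.fasta aligned_sequences.aln || echo "alignment skipped"
```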

Clustal Omega online web-tool

http://www.ebi.ac.uk/Tools/msa/clustalo/ (max 4000 sequences)

http://www.genome.jp/tools/clustalw/ (ClustalW, the predecessor of Clustal Omega)

Documentation

http://www.clustal.org/omega/README

See also:

→ EMBL-EBI list of Multiple Sequence Alignment tools

→ Alignment viewer