Phylogenetic tree

Multiple Sequence Alignment

  • Aligning sequences means rearranging the original sequences, by inserting gaps, so that similar bases line up in the same positions (columns).

  • Aligned sequences are required, for example, for constructing phylogenetic (evolutionary) trees.

  • Several tools exist for sequence alignment, including MAFFT, MUSCLE and Clustal Omega. The original sequences have to be provided as a multi-FASTA file with all sequences in the same direction (corrected for forward and reverse strand).

  • For short-read alignment (shotgun sequencing), see bowtie2 or BWA.

  • Check the final alignment with an → Alignment viewer
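Because all three aligners expect a single multi-FASTA file, it can help to inspect the input before aligning. The following is a minimal sketch using only standard shell tools; the function names are hypothetical helpers, not part of any aligner, and sequences.fasta is the example file name used on this page.

```shell
#!/bin/sh
# Count the sequences in a multi-FASTA file (header lines start with '>').
count_fasta_records() {
    grep -c '^>' "$1"
}

# Print sequence IDs (first word of each header) that occur more than once.
# Duplicate IDs can make the aligned output hard to interpret.
duplicate_fasta_ids() {
    grep '^>' "$1" | sed 's/^>//' | awk '{print $1}' | sort | uniq -d
}

# Usage:
#   count_fasta_records sequences.fasta
#   duplicate_fasta_ids sequences.fasta
```

This only checks the file structure. Whether all sequences point in the same direction cannot be seen from the headers; that is left to the aligner option described for MAFFT.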


MAFFT

Download and install on Ubuntu

sudo apt-get update

sudo apt-get install mafft

Run MAFFT to align different variants of a DNA (gene) sequence

mafft --localpair --adjustdirectionaccurately --maxiterate 1000 sequences.fasta > aligned_sequences.aln

--adjustdirectionaccurately corrects the sequence direction (forward or reverse complement)

--maxiterate 1000 enables iterative refinement; combined with --localpair it is suitable for fewer than 200 sequences

--globalpair (instead of --localpair) for sequences of similar length
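The choice of options can be tied to the number of input sequences. The sketch below prints a suggested option string; mafft_options is a hypothetical helper, the 200-sequence threshold comes from the note above, and the fallback to --auto with --adjustdirection for larger inputs is an assumption rather than a recommendation from this page.

```shell
#!/bin/sh
# Print suggested MAFFT options for a multi-FASTA file (hypothetical helper).
mafft_options() {
    n=$(grep -c '^>' "$1")
    if [ "$n" -lt 200 ]; then
        # Small data set: local pairwise alignment plus iterative refinement
        echo "--localpair --maxiterate 1000 --adjustdirectionaccurately"
    else
        # Assumption: let MAFFT pick a faster strategy for larger inputs
        echo "--auto --adjustdirection"
    fi
}

# Usage:
#   mafft $(mafft_options sequences.fasta) sequences.fasta > aligned_sequences.aln
```

Printing the options instead of running MAFFT directly makes it easy to review the command before starting a long alignment.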

MAFFT online web-tool


MUSCLE (MUltiple Sequence Comparison by Log-Expectation)

Download and install on Ubuntu

sudo apt install muscle

Run MUSCLE

muscle -in sequences.fasta -out aligned_sequences.aln

-in / -out are the MUSCLE v3 options; MUSCLE v5 uses -align and -output instead

MUSCLE online web-tool

Clustal Omega

Multiple sequence alignment tool, mainly used for protein sequences (DNA/RNA input is also accepted)

Install on Ubuntu

sudo apt-get update

sudo apt-get install clustalo

Example (command line)

clustalo -i sequences.fasta -o aligned_sequences.aln --auto -v

--auto set alignment options automatically

-v print "verbose" progress info

# see more options

clustalo --help

Clustal Omega online web-tool (max 4000 sequences)
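Whichever aligner was used, a quick check before opening a viewer is that every row of the alignment has the same length, gaps included. The sketch below assumes the alignment was written in FASTA format; alignment_lengths is a hypothetical helper name.

```shell
#!/bin/sh
# Print the distinct row lengths found in a FASTA-format alignment.
# A consistent alignment yields exactly one line of output.
alignment_lengths() {
    awk '
        /^>/ { if (seen) print len; seen = 1; len = 0; next }
             { gsub(/[ \t\r]/, ""); len += length($0) }
        END  { if (seen) print len }
    ' "$1" | sort -n | uniq
}

# Usage:
#   alignment_lengths aligned_sequences.aln
```

More than one line of output points to a truncated or mixed file rather than a finished alignment.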


See also:

→ EMBL-EBI list of Multiple Sequence Alignment tools

→ Alignment viewer