Sequence-Alignment

Phylogenetic tree

Multiple Sequence Alignment

  • Aligning sequences means re-arranging the original sequences, by inserting gaps, so that similar bases line up in the same columns as well as possible.

  • Aligned sequences are required, for example, for constructing phylogenetic (evolutionary) trees.

  • Several tools exist for sequence alignment, including MAFFT and MUSCLE. The original sequences have to be provided as a multi-FASTA file with all sequences in the same direction (corrected for forward and reverse strand).

  • For short-read alignment (shotgun sequencing), see bowtie2 or BWA.

  • Check the final alignment results using an → Alignment viewer
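Before running any aligner, it can help to sanity-check the multi-FASTA input described above. The following is a minimal sketch using only standard Unix tools; the function names `count_fasta_records` and `duplicate_fasta_ids` and the demo sequences are made up for illustration and are not part of MAFFT or MUSCLE.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count records in a multi-FASTA file
# (every record starts with a header line beginning with ">").
count_fasta_records() {
    grep -c '^>' "$1"
}

# Hypothetical helper: print sequence IDs (first word of the header)
# that occur more than once. Prints nothing if all IDs are unique.
duplicate_fasta_ids() {
    grep '^>' "$1" | awk '{print $1}' | sort | uniq -d | sed 's/^>//'
}

# Small made-up example file, for illustration only
demo=$(mktemp)
printf '>seq1\nACGTACGT\n>seq2 variant\nACGTTCGT\n>seq3\nACGACGT\n' > "$demo"

echo "records: $(count_fasta_records "$demo")"
echo "duplicate ids: $(duplicate_fasta_ids "$demo")"
```

Duplicate IDs are worth catching early because they make the aligned rows hard to tell apart later in an alignment viewer.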

MAFFT

Download and Install

sudo apt-get update

sudo apt-get install mafft

http://mafft.cbrc.jp/alignment/software/

Run MAFFT to align different variants of a DNA (gene) sequence

mafft --localpair --adjustdirectionaccurately --maxiterate 1000 sequences.fasta > aligned_sequences.aln

--adjustdirectionaccurately to correct sequence direction (forward or reverse)

--maxiterate 1000 for fewer than 200 sequences

--globalpair (instead of --localpair) for sequences of similar lengths

https://mafft.cbrc.jp/alignment/software/algorithms/algorithms.html

https://mafft.cbrc.jp/alignment/software/adjustdirection.html
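The option notes above can be turned into a small selection rule. This is only a sketch: the helper `mafft_command` is hypothetical, it prints the command rather than running it, and falling back to `--auto` with the faster `--adjustdirection` for larger inputs is an assumption made here, not a recommendation taken from this page.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a MAFFT command line chosen from the number
# of sequences in the input file. It only prints the command (dry run).
mafft_command() {
    local input=$1 output=$2
    local n
    n=$(grep -c '^>' "$input")   # number of FASTA records
    if [ "$n" -lt 200 ]; then
        # Small data set: iterative refinement as described above
        echo "mafft --localpair --maxiterate 1000 --adjustdirectionaccurately $input > $output"
    else
        # Assumption: for larger sets, let MAFFT pick a strategy itself
        echo "mafft --auto --adjustdirection $input > $output"
    fi
}

# Made-up input with three short sequences, for illustration only
small=$(mktemp)
printf '>s1\nACGT\n>s2\nACGA\n>s3\nACGG\n' > "$small"
mafft_command "$small" aligned_sequences.aln
```

Printing the command first makes it easy to review the chosen options before starting a potentially long alignment run.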

MAFFT online web-tool

http://mafft.cbrc.jp/alignment/server/

MUSCLE

MUltiple Sequence Comparison by Log-Expectation

http://www.drive5.com/muscle/

Download and install on Ubuntu

sudo apt install muscle

http://www.drive5.com/muscle/downloads.htm

Run MUSCLE (the -in / -out options are MUSCLE v3 syntax; MUSCLE v5 uses -align and -output instead)

muscle -in sequences.fasta -out aligned_sequences.aln

MUSCLE online web-tool

http://www.ebi.ac.uk/Tools/msa/muscle/

Clustal Omega

Alignment tool used mainly for protein sequences (DNA and RNA input is also accepted)

http://www.clustal.org/omega/

Install on Ubuntu

sudo apt-get update

sudo apt-get install clustalo

Example (command line)

clustalo -i sequences.fasta -o aligned_sequences.aln --auto -v

--auto set alignment options automatically

-v print "verbose" progress info

# see more options

clustalo --help
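A quick plausibility check on the result, before opening it in an alignment viewer, is that every aligned row has the same length. The sketch below assumes the output is in FASTA format (which, to my knowledge, is the default for the commands shown on this page for clustalo, MAFFT and MUSCLE); the helper names and the demo alignment are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the distinct row lengths found in an
# aligned FASTA file. A consistent alignment yields exactly one value.
alignment_lengths() {
    awk '
        /^>/ { if (seen) print len; seen = 1; len = 0; next }
        { gsub(/[ \t\r]/, ""); len += length($0) }   # ignore whitespace
        END { if (seen) print len }
    ' "$1" | sort -n | uniq
}

# Hypothetical helper: succeed only if all rows have the same length.
alignment_is_consistent() {
    alignment_lengths "$1" | awk 'END { exit (NR == 1 ? 0 : 1) }'
}

# Made-up aligned example, for illustration only
demo_aln=$(mktemp)
printf '>seq1\nACGTACGT\n>seq2\nACGT-CGT\n>seq3\nACG-ACGT\n' > "$demo_aln"

if alignment_is_consistent "$demo_aln"; then
    echo "OK: all rows have length $(alignment_lengths "$demo_aln")"
else
    echo "Problem: rows differ in length" >&2
fi
```

Rows of unequal length usually point to a truncated output file or to an unaligned input file passed by mistake.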

Clustal Omega online web-tool

http://www.ebi.ac.uk/Tools/msa/clustalo/ (max 4000 sequences)

http://www.genome.jp/tools/clustalw/

Documentation

http://www.clustal.org/omega/README

See also:

→ EMBL-EBI list of Multiple Sequence Alignment tools

→ Alignment viewer