Generate database

BLAST software tool

Create a blastn species database

A local BLAST database speeds up mapping many sequences against the same reference.

Example: create E. coli genome DB

# merge all E. coli reference genomes into one ecoli.fasta file

cat genomes/ecoli_strains/*.fna > ecoli.fasta
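Merging several genome files can bring in the same sequence ID more than once, and makeblastdb with -parse_seqids generally rejects input containing duplicate IDs. A minimal sketch of a check to run before building the database is shown here; dup_ids is a hypothetical helper name, not part of BLAST, and it assumes the ID is the first word after '>' in each header.

```shell
# dup_ids (hypothetical helper): print each FASTA sequence ID that occurs
# more than once in the given file. The ID is taken to be the first
# whitespace-delimited word of a '>' header line, without the '>'.
dup_ids() {
    awk '/^>/ { id = substr($1, 2); if (++seen[id] == 2) print id }' "$1"
}
```

Calling dup_ids ecoli.fasta should print nothing when all IDs are unique; any ID it prints would need to be renamed or removed first.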

# create species genome database for Blast

makeblastdb -in ecoli.fasta -parse_seqids -dbtype nucl

-parse_seqids keeps the sequence IDs from the FASTA headers, so gene/contig IDs can be identified in mapping hits

-dbtype sets the type of sequences: nucleotide 'nucl' (used here) or protein 'prot'

# run Blast using the created species DB (database option: -db; the DB name defaults to the makeblastdb input file name)

blastn -query genes.fasta -db ecoli.fasta -outfmt 6 -evalue 1e-30
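With -outfmt 6 the hits are written as tab-separated lines whose default columns are qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore. As a rough sketch of post-processing, the helper below keeps only the highest-scoring hit per query; best_hits is a hypothetical name, and it assumes the blastn output was saved to a file (for example with -out hits.tsv, a file name chosen here for illustration).

```shell
# best_hits (hypothetical helper): keep one line per query (column 1),
# choosing the hit with the highest bit score (column 12).
# Input: a tab-separated file in the default -outfmt 6 column layout.
best_hits() {
    # sort by query ID, then by bit score in descending order;
    # awk then keeps only the first line seen for each query
    sort -t "$(printf '\t')" -k1,1 -k12,12gr "$1" | awk -F'\t' '!seen[$1]++'
}
```

For example, best_hits hits.tsv > best_hits.tsv would reduce the table to one row per gene in genes.fasta that produced a hit.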
