Length of an exact sequence match, as start region for the final alignment

blastn -query genes.ffn -subject genome.fna -word_size 11

A BLAST search starts by finding a perfect sequence match of the length given by -word_size. This initial exact match is then extended in both directions, allowing gaps and substitutions according to the scoring thresholds. Lowering the word size can find more hits, but less accurate ones; raising it limits the results to almost perfect hits.
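To make the seeding step concrete, here is a minimal Python sketch of finding exact word matches between a query and a subject. It is only an illustration of the idea, not how BLAST is implemented: the function name find_seeds and the toy sequences are made up for this example, and the extension step with gaps and substitutions is left out.

```python
def find_seeds(query: str, subject: str, word_size: int) -> list[tuple[int, int]]:
    """Return (query_pos, subject_pos) pairs where an exact word match starts.

    Hypothetical helper for illustration only; real BLAST uses optimized
    lookup tables and then extends each seed into an alignment.
    """
    # Index every word of length word_size that occurs in the subject.
    index: dict[str, list[int]] = {}
    for j in range(len(subject) - word_size + 1):
        index.setdefault(subject[j:j + word_size], []).append(j)

    # Look up every word of the query in that index.
    seeds = []
    for i in range(len(query) - word_size + 1):
        for j in index.get(query[i:i + word_size], []):
            seeds.append((i, j))
    return seeds


# Toy sequences (assumed, not from any real genome).
query = "ACGTACGTTG"
subject = "TTACGTACGATG"

for w in (4, 7, 8):
    print(w, find_seeds(query, subject, w))
```

With these toy sequences, the smallest word size yields the most seeds and the largest yields none, which mirrors the trade-off described above: a small word size gives more starting points (and more work), a large one keeps only near-perfect regions.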
For short sequences, the word size must be less than half the query length; otherwise reliable hits can be missed.

nucleotide sequence search, blastn only (blastn -task blastn): -word_size 11
amino acid search (blastp): -word_size 3

→ BLAST command-line options

Setting the word size to a very low value (-word_size 5) makes a blastn search very slow.
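The rule for short sequences can be sketched as a small helper that picks a word size below half the query length and builds the matching blastn command. The names suggest_word_size and build_blastn_command are hypothetical, and the lower bound of 4 is an assumption about the smallest word size blastn accepts; the command is only assembled here, not executed.

```python
def suggest_word_size(query_length: int, default: int = 11, minimum: int = 4) -> int:
    """Pick a word size that is less than half the query length.

    default=11 follows the blastn value given above; minimum=4 is an
    assumed lower bound. For very short queries the minimum wins, so the
    half-length rule can no longer be satisfied.
    """
    # Largest integer strictly smaller than query_length / 2.
    limit = (query_length - 1) // 2
    return max(minimum, min(default, limit))


def build_blastn_command(query_file: str, subject_file: str, query_length: int) -> list[str]:
    """Assemble (but do not run) a blastn call with the suggested word size."""
    word_size = suggest_word_size(query_length)
    return [
        "blastn", "-task", "blastn",
        "-query", query_file,
        "-subject", subject_file,
        "-word_size", str(word_size),
    ]


# Example: a 20 nt primer searched against a genome.
print(" ".join(build_blastn_command("primer.fna", "genome.fna", 20)))
```

Returning the command as a list of arguments keeps it ready for subprocess.run without shell quoting issues; the file names in the example are placeholders.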