E-value & Bit-score

BLAST software tool


The BLAST E-value is the number of expected hits of similar quality (score) that could be found just by chance.

An E-value of 10 means that up to 10 hits of this quality can be expected to occur just by chance in a random database of the same size.

The E-value can be used as a first quality filter for BLAST search results: the -evalue option keeps only hits whose E-value is equal to or better (smaller) than the given threshold. BLAST results are sorted by E-value by default (best hit in the first line).

blastn -query genes.ffn -subject genome.fna -evalue 1e-10

The smaller the E-value, the better the match.

-evalue 1e-50

small E-value: low number of hits, but of high quality

BLAST hits with an E-value smaller than 1e-50 are database matches of very high quality.

-evalue 0.01

BLAST hits with an E-value smaller than 0.01 can still be considered good hits for homology matches.

-evalue 10 (default)

large E-value: many hits, partly of low quality

An E-value threshold of 10 will include hits that cannot be considered significant, but they may give an idea of potential relationships.
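The thresholds above can also be applied after the search. The following is a minimal sketch, assuming the results were written in BLAST's tabular format (-outfmt 6, an option not used in the command above), where the E-value is the 11th and the bit-score the 12th tab-separated column. The function name filter_hits and the sample hits are made up for illustration.

```python
def filter_hits(lines, max_evalue=1e-10):
    """Keep tabular BLAST hits whose E-value is at most max_evalue."""
    kept = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        evalue = float(fields[10])    # column 11: E-value
        bitscore = float(fields[11])  # column 12: bit-score
        if evalue <= max_evalue:
            kept.append((fields[0], fields[1], evalue, bitscore))
    return kept


# Hypothetical example hits, not real BLAST output
example = [
    "geneA\tcontig1\t99.1\t900\t8\t0\t1\t900\t5000\t5899\t0.0\t1620",
    "geneB\tcontig2\t85.0\t120\t18\t0\t1\t120\t300\t419\t2e-25\t115",
    "geneC\tcontig3\t77.5\t40\t9\t0\t5\t44\t77\t116\t0.5\t30.2",
]

# Strict threshold: only the two high-quality hits remain
for hit in filter_hits(example, max_evalue=1e-10):
    print(hit)
```

With max_evalue=10 all three example hits would be kept, including the weak geneC match.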

The E-value (expectation value) is a corrected bit-score adjusted to the sequence database size. The E-value therefore depends on the size of the sequence database that is searched. Since large databases increase the chance of false positive hits, the E-value corrects for this higher chance. It is a correction for multiple comparisons. This means that the same sequence hit gets a better (smaller) E-value when it is found in a smaller database.

E = (m × n) / 2^(bit-score)

m - query sequence length

n - total database length (sum of all sequences)
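As a worked example of the formula, the sketch below computes the E-value of one bit-score for two database sizes. The function name evalue_from_bitscore and all numbers are hypothetical. BLAST itself applies further corrections (for example to the sequence lengths), so the values it reports may differ slightly.

```python
def evalue_from_bitscore(bit_score, m, n):
    """E = m * n / 2**bit_score (m: query length, n: total database length)."""
    return m * n / 2 ** bit_score


bit_score = 50
m = 1_000                 # query length (hypothetical)
small_db = 1_000_000      # total database length (hypothetical)
large_db = 1_000_000_000  # a database 1000 times larger (hypothetical)

e_small = evalue_from_bitscore(bit_score, m, small_db)
e_large = evalue_from_bitscore(bit_score, m, large_db)

print(f"small database: E = {e_small:.2e}")  # 8.88e-07
print(f"large database: E = {e_large:.2e}")  # 8.88e-04
```

The same hit with the same bit-score gets an E-value that is 1000 times larger (worse) in the database that is 1000 times larger.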


The higher the bit-score, the better the sequence similarity.

The bit-score describes the required size of a sequence database in which the current match could be found just by chance: that size is 2^(bit-score). The bit-score is a log2-scaled and normalized raw score. Each increase by one doubles the required database size.
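The doubling can be checked with a few lines of arithmetic. This is only an illustrative sketch: the function name chance_search_space is made up, and the "size" is taken to be the search space m × n from the E-value formula above, for which E becomes 1.

```python
def chance_search_space(bit_score):
    """Search space size (m * n) at which one hit with this bit-score is expected by chance."""
    return 2 ** bit_score


for bit_score in (20, 21, 30, 40):
    print(bit_score, chance_search_space(bit_score))
# 20 1048576
# 21 2097152
# 30 1073741824
# 40 1099511627776
```

Going from a bit-score of 20 to 21 doubles the required size, and ten more bits increase it about a thousandfold.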

The bit-score does not depend on database size. It gives the same value for hits in databases of different sizes and hence can be used for searching in a constantly growing database.
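To see how the two measures relate when a database grows, the E-value formula can be solved for the bit-score. The sketch below is an illustration under that formula only; the function name bitscore_for_evalue and the lengths are hypothetical.

```python
import math


def bitscore_for_evalue(evalue, m, n):
    """Bit-score at which E = m * n / 2**bit_score equals the given E-value."""
    return math.log2(m * n / evalue)


m = 1_000  # query length (hypothetical)
for n in (1_000_000, 2_000_000, 1_000_000_000):
    needed = bitscore_for_evalue(1e-10, m, n)
    print(f"database length {n}: bit-score {needed:.1f} needed for E = 1e-10")
```

Every doubling of the database raises the bit-score needed for the same E-value cutoff by one, whereas a fixed bit-score cutoff keeps selecting the same hits regardless of how large the database has become.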
