Consensus sequence
get consensus sequence (of most frequent bases) based on short reads, mapped against a reference sequence (gene or complete genome)
1) Map short reads against reference gene sequence
# Create bowtie2 database
bowtie2-build REFERENCE.fasta REF_DB
# bowtie2 mapping
bowtie2 -x REF_DB -U SAMPLE.fastq --no-unal -S SAMPLE.sam
# samtools: sort .sam file and convert to .bam file
samtools view -bS SAMPLE.sam | samtools sort - -o SAMPLE.bam
2) Get consensus sequence from .bam file
# Get consensus fastq file
samtools mpileup -uf REFERENCE.fasta SAMPLE.bam | bcftools call -c | vcfutils.pl vcf2fq > SAMPLE_cns.fastq
# vcfutils.pl is part of bcftools
# Convert .fastq to .fasta and set bases of quality lower than 20 to N
seqtk seq -aQ64 -q20 -n N SAMPLE_cns.fastq > SAMPLE_cns.fasta
see also
https://samtools.github.io/bcftools/howtos/consensus-sequence.html