Download short-read sequencing data from NCBI Sequence Read Archive (SRA)

Install SRA tools

→ Install SRA-toolkit (fasterq-dump, prefetch, ...)

Pre-download using prefetch (optional)

To separate the download and conversion steps, SRA files can be downloaded in advance with prefetch.

→ SRA pre-download (using prefetch)
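
The two-step workflow could be sketched roughly as follows. The function name prefetch_and_convert and the SRA_files/ directory are hypothetical, chosen only for illustration. The sketch also assumes a recent SRA-toolkit in which prefetch -O places the run in a subdirectory named after the accession, and that fasterq-dump accepts that path in place of the accession.

```shell
# Hypothetical helper: download first, convert afterwards.
# Usage: prefetch_and_convert ACCESSION [SRA_DIR] [FASTQ_DIR]
prefetch_and_convert() {
    acc="$1"
    sra_dir="${2:-SRA_files}"      # assumed location for the prefetched run
    fastq_dir="${3:-FASTQ_files}"  # output directory for the FASTQ files
    mkdir -p "$sra_dir" "$fastq_dir" || return 1

    # Step 1: download only (no conversion yet)
    prefetch -O "$sra_dir" "$acc" || return 1

    # Step 2: convert the local copy instead of downloading again
    fasterq-dump --split-3 "$sra_dir/$acc" -O "$fastq_dir"
}
```

Usage: prefetch_and_convert SRR649944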

Download SRA samples as FASTQ using fasterq-dump

Example: download NCBI-SRA sample SRR649944 and save the sequence data in FASTQ_files/.

The resulting .fastq files are then compressed with gzip.

fasterq-dump --split-3 SRR649944 -O FASTQ_files/

gzip FASTQ_files/*.fastq

ls FASTQ_files/
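
As a quick sanity check after compression, the number of reads per file can be counted. This is only a sketch: count_reads is a hypothetical helper name, and it assumes standard FASTQ records of four lines each.

```shell
# Hypothetical helper: count reads in a gzip-compressed FASTQ file,
# assuming four lines per record.
count_reads() {
    lines=$(gzip -dc "$1" | wc -l)
    echo $((lines / 4))
}

# Print the read count for every compressed file, if any exist
for f in FASTQ_files/*.fastq.gz; do
    [ -e "$f" ] || continue   # skip when the glob matches nothing
    printf '%s\t%s\n' "$f" "$(count_reads "$f")"
done
```

For a paired-end sample split with --split-3, the _1 and _2 files would be expected to report the same count.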




# filter SRA samples by read length

fasterq-dump --split-3 SRR649944 -O FASTQ_files/ --min-read-len 80

Option: --min-read-len 80 extracts only reads of at least 80 bp (>= 80 bp) from the SRA file.
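
To check that the filter had the intended effect, the shortest read in an output file can be reported. This is a minimal sketch assuming four-line FASTQ records. min_read_len is a hypothetical helper, not part of the SRA-toolkit.

```shell
# Hypothetical helper: report the shortest read in a FASTQ file,
# assuming four-line records (the sequence is line 2 of each record).
# Reads from the given file, or from standard input if no file is given.
min_read_len() {
    awk 'NR % 4 == 2 { n = length($0); if (min == "" || n < min) min = n }
         END { print (min == "" ? 0 : min) }' "$@"
}
```

Usage: min_read_len on an uncompressed .fastq file, or gzip -dc on a .fastq.gz file piped into min_read_len. With --min-read-len 80 the reported value should be 80 or more.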

# use a large, fast location (RAM disk or SSD) for temporary files during .sra to .fastq conversion

fasterq-dump --split-3 SRR649944 -O FASTQ_files/ -t /tmp/scratch/
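
Before pointing -t at a scratch location, it may be worth checking that enough space is free there. The sketch below uses only standard df and awk. The helper name free_kb and the 50 GiB threshold are assumptions for illustration; the space actually needed depends on the sample.

```shell
# Hypothetical helper: free space (in KiB) on the filesystem holding a path
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

scratch=/tmp/scratch/           # temporary location passed to -t
need_kb=$((50 * 1024 * 1024))   # assumed requirement: 50 GiB, adjust per sample

if [ -d "$scratch" ] && [ "$(free_kb "$scratch")" -ge "$need_kb" ]; then
    echo "enough scratch space in $scratch"
else
    echo "scratch directory missing or too small: $scratch" >&2
fi
```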

# change temporary SRA download location

fasterq-dump downloads a temporary SRA file into the default directory $TMPDIR (usually TMPDIR=/tmp/). To keep SRA files from exceeding the space limit of the local temporary directory, a different download location can be defined as CACHE in the → SRA-toolkit configuration.

Read more

fasterq-dump --help