Operational taxonomic units (OTUs) are used to categorize bacteria based on sequence similarity.
In 16S metagenomics approaches, OTUs are clusters of similar sequence variants of the 16S rDNA marker gene sequence. Each of these clusters is intended to represent a taxonomic unit of a bacterial species or genus, depending on the sequence similarity threshold. Typically, OTU clusters are defined by a 97% identity threshold of the 16S gene sequences to distinguish bacteria at the genus level.
Species separation requires a higher threshold of 98% or 99% sequence identity or, better still, the use of exact amplicon sequence variants (ASVs) instead of OTU sequence clusters [see → DADA2].
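To illustrate how the identity threshold shapes the clusters, here is a minimal sketch of greedy centroid-based clustering in Python. It is a simplification, not how any particular tool works: it assumes the sequences are already aligned to equal length and measures identity as the fraction of matching positions, whereas real clustering tools compute pairwise alignments. The names identity, greedy_otu_clustering and mutate, and the toy sequences, are hypothetical and chosen only for this example.

```python
# Toy sequences only; real 16S reads are longer and need pairwise alignment.
SWAP = {"A": "C", "C": "G", "G": "T", "T": "A"}


def mutate(seq: str, positions: list[int]) -> str:
    """Return a copy of seq with a substitution at each given position."""
    chars = list(seq)
    for p in positions:
        chars[p] = SWAP[chars[p]]
    return "".join(chars)


def identity(a: str, b: str) -> float:
    """Fraction of matching positions (assumes pre-aligned, equal-length input)."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_otu_clustering(seqs: dict[str, str], threshold: float = 0.97) -> dict[str, list[str]]:
    """Map each centroid name to the names of the sequences in its OTU.

    Each sequence joins the first existing centroid it matches at or above
    the threshold; otherwise it becomes the centroid of a new OTU.
    """
    clusters: dict[str, list[str]] = {}
    for name, seq in seqs.items():
        for centroid in clusters:
            if identity(seq, seqs[centroid]) >= threshold:
                clusters[centroid].append(name)
                break
        else:
            clusters[name] = [name]
    return clusters


base = "ACGT" * 25  # 100 bases
seqs = {
    "a": base,
    "b": mutate(base, [0, 1]),                # 98% identical to a
    "c": mutate(base, [10, 20, 30, 40, 50]),  # 95% identical to a
}

otus_97 = greedy_otu_clustering(seqs, threshold=0.97)
otus_99 = greedy_otu_clustering(seqs, threshold=0.99)
print(otus_97)  # {'a': ['a', 'b'], 'c': ['c']}
print(otus_99)  # {'a': ['a'], 'b': ['b'], 'c': ['c']}
```

At 97% the two closest variants collapse into one OTU, while at 99% each variant forms its own cluster, which is why a stricter threshold gives finer taxonomic separation.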
OTU table (sequence count table)
An OTU table contains the number of sequences observed for each taxonomic unit (OTU) in each sample. Columns usually represent samples, and rows represent genus- or species-specific taxonomic units (OTUs). OTU tables are often saved as → BIOM formatted files.
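The layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input is taken to be a list of (sample, OTU) pairs, one per assigned read, and the function name build_otu_table as well as the sample and OTU labels are hypothetical. It does not read or write BIOM files.

```python
from collections import Counter


def build_otu_table(assignments: list[tuple[str, str]]) -> tuple[list[str], dict[str, list[int]]]:
    """Count reads per (sample, OTU) pair and arrange them as a table.

    Returns the sorted sample names (columns) and a dict mapping each
    OTU (row) to its read counts, in the same order as the samples.
    """
    counts = Counter(assignments)
    samples = sorted({sample for sample, _ in assignments})
    otus = sorted({otu for _, otu in assignments})
    table = {otu: [counts[(sample, otu)] for sample in samples] for otu in otus}
    return samples, table


# One entry per read: (sample it came from, OTU it was assigned to).
reads = [
    ("S1", "OTU_1"), ("S1", "OTU_1"), ("S1", "OTU_2"),
    ("S2", "OTU_1"), ("S2", "OTU_3"), ("S2", "OTU_3"),
]

samples, otu_table = build_otu_table(reads)

# Print as tab-separated text: header row of samples, then one row per OTU.
print("\t".join(["OTU"] + samples))
for otu, row in otu_table.items():
    print("\t".join([otu] + [str(n) for n in row]))
```

An OTU that is absent from a sample gets a count of 0, so every row has one value per sample.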
Limited taxonomic resolution
OTU resolution depends on the 16S approach, which has limits in distinguishing bacteria at the species level. For example, Escherichia coli and Shigella spp. share almost identical 16S rRNA gene sequences.
Alternative approaches have been developed to achieve higher resolution, up to strain level, by considering larger or complete sets of genes:
Multilocus sequence typing (MLST): compares 5-10 housekeeping genes (sub-species resolution)
Shotgun metagenomic sequencing: identifies all genes of a strain present in a sample (strain-level resolution)
Gene sequence alignment tools, used for 16S OTU clustering
OTU clustering versus using exact 16S sequence variants
For taxonomic profiling of whole genome shotgun sequencing (WGS) datasets, see