1) get OTU clusters of similar sequences (97% identity)
2) select one representative sequence for each OTU
3) annotate OTU clusters with known taxonomy (treat OTU representative sequences without database hits as "unknown")
4) create a table (biom format) containing the OTU abundance of all samples and taxonomy information for further analysis

Get annotated OTUs (16S)

Results:
otu_table_mc2_w_tax.biom   OTU abundance table of all samples, including taxonomy
rep_set.fna                representative sequences for each OTU cluster
rep_set.tre                phylogenetic tree based on representative OTU sequences

# install sequence clustering algorithm: usearch
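Before picking OTUs it can help to confirm that the clustering program is actually reachable. The following is a minimal sketch, assuming QIIME 1 looks for the USEARCH 6.1 binary on the PATH under the name usearch61 (the name passed to -m in this workflow). The helper require_tool is a hypothetical name used only for illustration.

```shell
# Report whether a required executable can be found on the PATH.
# require_tool is an illustrative helper, not part of QIIME.
require_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1" >&2
        return 1
    fi
}

# Assumption: the USEARCH 6.1 binary is exposed as "usearch61".
require_tool usearch61 || echo "install USEARCH 6.1 and make it available as 'usearch61' before OTU picking"
```

If the tool is reported as missing, the pick_open_reference_otus.py call with -m usearch61 is not expected to work until the binary is installed.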
# set location of reference sequence file "97_otus.fasta"
REFSEQS=/path/to/my/qiime1/lib/python2.7/site-packages/qiime_default_reference/gg_13_8_otus/rep_set/97_otus.fasta

# create settings file: allow taxonomic annotation of sequences in reverse orientation
echo 'pick_otus:enable_rev_strand_match True' > otu_settings.txt

# run OTU picking with usearch61
pick_open_reference_otus.py -i $PWD/SEQ/seqs.fna -o $PWD/OTU/ -r $REFSEQS -s 0.1 -m usearch61 -p $PWD/otu_settings.txt

Get OTU & read count per sample
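A minimal sketch of how these counts could be obtained from the result files, assuming the output directory $PWD/OTU/ used in the pick_open_reference_otus.py call and that the biom command-line tool shipped with QIIME 1 is on the PATH. The helper count_fasta_records, the variables REP_SET and BIOM_TABLE, and the summary file name otu_table_summary.txt are illustrative choices, not part of the original workflow.

```shell
# Count FASTA records in a file (each record starts with a '>' header line).
count_fasta_records() {
    grep -c '^>' "$1"
}

# Assumed locations: the -o directory of the OTU picking run.
REP_SET="${REP_SET:-$PWD/OTU/rep_set.fna}"
BIOM_TABLE="${BIOM_TABLE:-$PWD/OTU/otu_table_mc2_w_tax.biom}"

# Number of representative sequences, one per OTU cluster.
if [ -f "$REP_SET" ]; then
    echo "representative sequences: $(count_fasta_records "$REP_SET")"
fi

# Write a per-sample summary of the OTU table (only if biom is installed).
if command -v biom >/dev/null 2>&1 && [ -f "$BIOM_TABLE" ]; then
    biom summarize-table -i "$BIOM_TABLE" -o "$PWD/OTU/otu_table_summary.txt"
fi
```

The summary written by biom summarize-table is a plain-text file that can be inspected with any text viewer. Both steps are skipped when the corresponding input file is not found.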