Fastq quality scoring
The clc_quality_trim program uses an offset value of 64 by default for Illumina data (fastq). You will need to know what version of the Illumina pipeline was used on your original data and set the appropriate offset accordingly, using the -f option. The offset values for standard formats, which are also used in the CLC Workbench, are:
- NCBI/Sanger & Illumina Pipeline 1.8 and later: 33
- Illumina Pipeline 1.2 and earlier: 64
- Illumina Pipeline 1.3 and 1.4: 64
- Illumina Pipeline 1.5 to 1.7: 64
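If the pipeline version is not known, the quality lines themselves can give a hint. The following is a minimal Python sketch, not part of the CLC tools: the function names decode_qualities and guess_offset are hypothetical, and it assumes plain four-line fastq records (no wrapped sequence or quality lines). It only tells the two common offsets, 33 and 64, apart, and returns None when the data are ambiguous, so treat the result as a hint rather than a replacement for checking the pipeline version.

```python
def decode_qualities(quality_line, offset):
    """Convert a fastq quality string to integer scores by subtracting the offset."""
    return [ord(ch) - offset for ch in quality_line]


def guess_offset(fastq_lines):
    """Heuristically guess the quality offset (33 or 64) of four-line fastq records.

    Returns None when there are no quality lines or the evidence is ambiguous.
    """
    lowest = None
    highest = None
    for i, line in enumerate(fastq_lines):
        if i % 4 != 3:  # only every fourth line of a record holds qualities
            continue
        codes = [ord(ch) for ch in line.rstrip("\r\n")]
        if not codes:
            continue
        lowest = min(codes) if lowest is None else min(lowest, min(codes))
        highest = max(codes) if highest is None else max(highest, max(codes))
    if lowest is None:
        return None
    if lowest < 59:
        # ASCII codes below 59 do not occur in the 64-based Illumina encodings.
        return 33
    if highest > 74:
        # Codes above 'J' (74, i.e. Q41 at offset 33) are unusual for offset 33.
        return 64
    return None  # every character fits both encodings


if __name__ == "__main__":
    record = ["@read1", "ACG", "+", "I5+"]
    print(guess_offset(record))               # offset suggested by the data
    print(decode_qualities(record[3], 33))    # scores under that offset
```

To use it on a real file, pass an open file handle, for example guess_offset(open("smallfile.fastq")), and supply the result to the -f option.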
For example, the following command sets a minimum quality value of 10 (-c 10) and an offset of 33 (-f 33), with a maximum tolerance of 10% bad bases. For each read, the program returns the longest region that fulfills these criteria. Reads with no region that meets the criteria are discarded.
quality_trim -r smallfile.fastq -c 10 -f 33 -o smallfile_trimmed.fasta
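To make the "longest region" criterion concrete, here is a rough Python sketch of one way to read it: find the longest stretch of a read in which at most 10% of the bases fall below the quality cutoff. This is an illustration only and is not CLC's implementation, which is not described here; the function name longest_passing_region and its parameters are hypothetical, and the qualities are assumed to be already decoded to integers using the correct offset.

```python
def longest_passing_region(qualities, cutoff=10, max_bad_fraction=0.1):
    """Return (start, end) of the longest slice qualities[start:end] in which
    the share of scores below `cutoff` is at most `max_bad_fraction`.

    Returns None if no slice qualifies (such a read would be discarded).
    Brute force over all start positions: O(n^2), acceptable for short reads.
    """
    n = len(qualities)
    best = None
    for start in range(n):
        bad = 0
        for end in range(start + 1, n + 1):
            if qualities[end - 1] < cutoff:
                bad += 1
            length = end - start
            if bad <= max_bad_fraction * length:
                # Keep the first region found among equally long candidates.
                if best is None or length > best[1] - best[0]:
                    best = (start, end)
    return best


if __name__ == "__main__":
    scores = [2, 2, 30, 30, 30, 30, 2, 30, 30, 30, 30, 30, 30, 2, 2]
    region = longest_passing_region(scores)
    print(region)  # (start, end) indices of the retained region, or None
```

In this reading, a single low-quality base inside a long good stretch does not split the region, while low-quality ends are trimmed away.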