Trimming

Two types of trimming are available: quality trimming and adapter trimming.

Quality trimming

Raw reads are trimmed to remove low-quality nucleotides. The method is described at http://resources.qiagenbioinformatics.com/manuals/clcgenomicsworkbench/current/index.php?manual=Quality_trimming.html.

For LightSpeed Fastq to Germline Variants, the default quality limit used for trimming is 0.05.
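To illustrate how a quality limit can drive trimming, the following is a minimal sketch of a running-sum (modified Mott style) trimmer. It assumes Phred-scaled quality scores and is not taken from the product; the linked manual page remains the authoritative description. The function names and the example scores are hypothetical.

```python
def error_probability(phred):
    """Convert a Phred quality score to an error probability."""
    return 10 ** (-phred / 10)


def quality_trim(qualities, limit=0.05):
    """Return (start, end) of the retained region, with end exclusive.

    Each base contributes (limit - p_error) to a running sum. The sum is
    reset to zero whenever it becomes negative. The retained region starts
    after the last reset before the maximum and ends at the maximum.
    This is an illustrative sketch, not the product's implementation.
    """
    running = 0.0
    best = 0.0
    start = 0                      # first base after the most recent reset
    best_start, best_end = 0, 0
    for i, q in enumerate(qualities):
        running += limit - error_probability(q)
        if running < 0:
            running = 0.0
            start = i + 1
        elif running > best:
            best = running
            best_start, best_end = start, i + 1
    return best_start, best_end


# Low-quality bases at both ends are removed, the high-quality core is kept.
print(quality_trim([2, 2, 30, 30, 30, 30, 2, 2], limit=0.05))
```

A lower limit penalizes each base more strongly, so the retained region tends to become shorter as the limit decreases.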

Ensuring high base quality is important for somatic variant calling. The default quality limit is therefore set to 0.01, corresponding to a base quality of 20, in LightSpeed Fastq to Somatic Variants and LightSpeed Fastq to Somatic Variants Tumor Normal. This has minimal impact on high-quality reads but can lead to markedly shorter reads, and hence decreased coverage, when using lower-quality reads.
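The relationship between a quality limit and a base quality can be checked with a short calculation. This sketch assumes the standard Phred scale, Q = -10 * log10(p), where p is the error probability; the function names are hypothetical.

```python
import math


def limit_to_phred(limit):
    """Phred base quality corresponding to an error-probability limit."""
    return -10 * math.log10(limit)


def phred_to_limit(phred):
    """Error-probability limit corresponding to a Phred base quality."""
    return 10 ** (-phred / 10)


for limit in (0.05, 0.01):
    print(f"limit {limit} -> base quality {limit_to_phred(limit):.1f}")
```

Under this assumption, a limit of 0.01 corresponds to a base quality of 20, and a limit of 0.05 corresponds to a base quality of about 13.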

Adapter trimming

The algorithm can trim adapter sequences from mapped paired-end reads. For each read in a pair, any read sequence that extends beyond the 5' end of the other read in the pair is considered adapter sequence and is trimmed.
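The rule above can be sketched for a simplified case. The example assumes ungapped alignments, a forward-strand read paired with a reverse-strand read, and sequences stored in reference orientation; the names MappedRead and trim_read_through are hypothetical and do not reflect the product's implementation.

```python
from dataclasses import dataclass


@dataclass
class MappedRead:
    sequence: str   # bases in reference (forward-strand) orientation
    ref_start: int  # 0-based reference position of the leftmost base

    @property
    def ref_end(self):
        # Exclusive end, valid only under the ungapped-alignment assumption.
        return self.ref_start + len(self.sequence)


def trim_read_through(fwd, rev):
    """Trim each mate where it extends beyond the 5' end of the other.

    The 5' end of the forward mate is its leftmost reference position;
    the 5' end of the reverse mate is its rightmost reference position.
    Returns (kept_fwd, removed_fwd, kept_rev, removed_rev).
    """
    # Forward mate: bases at or beyond rev.ref_end are read-through.
    keep_fwd = max(0, min(len(fwd.sequence), rev.ref_end - fwd.ref_start))
    # Reverse mate: bases left of fwd.ref_start are read-through.
    cut_rev = max(0, min(len(rev.sequence), fwd.ref_start - rev.ref_start))
    return (
        fwd.sequence[:keep_fwd],
        fwd.sequence[keep_fwd:],
        rev.sequence[cut_rev:],
        rev.sequence[:cut_rev],
    )


# A 30 bp fragment sequenced with 40 bp reads: 10 bases of read-through each.
fwd = MappedRead("A" * 30 + "C" * 10, ref_start=100)
rev = MappedRead("G" * 10 + "A" * 30, ref_start=90)
print(trim_read_through(fwd, rev))
```

When the fragment is longer than both reads, neither read extends past the other's 5' end and nothing is removed.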

If consensus sequences can be calculated from the adapter sequences removed from R1 and/or R2, these are included in the report. Each reported consensus sequence starts with the first removed base and continues until a consensus can no longer be calculated with confidence.
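One way to picture such a consensus is a per-position majority vote over the removed sequences, stopping at the first position where support is too weak. The sketch below is an assumption about how this could work, not the documented method; the function name and the min_depth and min_fraction thresholds are hypothetical.

```python
from collections import Counter


def adapter_consensus(removed, min_depth=3, min_fraction=0.8):
    """Build a consensus from removed sequences, each starting at the
    first removed base. Stops at the first position that has too few
    sequences or too little agreement (hypothetical thresholds)."""
    consensus = []
    longest = max((len(s) for s in removed), default=0)
    for pos in range(longest):
        # Count bases only from sequences long enough to cover this position.
        column = Counter(s[pos] for s in removed if len(s) > pos)
        depth = sum(column.values())
        base, count = column.most_common(1)[0]
        if depth < min_depth or count / depth < min_fraction:
            break
        consensus.append(base)
    return "".join(consensus)


removed = ["AGATCGGAAG", "AGATCGG", "AGATC", "AGATCGGA", "AGAT"]
print(adapter_consensus(removed))
```

In this example the consensus ends once fewer than three sequences cover a position, even though the remaining sequences still agree.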

Trimmed reads that are shorter than a defined length threshold after adapter trimming can optionally be removed.
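As a minimal illustration of such a length filter, the sketch below keeps only reads that meet a minimum length. The function name is hypothetical, and treating a read exactly at the threshold as kept is an assumption; how the mate of a discarded read is handled is not covered here.

```python
def drop_short_reads(reads, min_length):
    """Keep reads whose length is at least min_length (assumed inclusive)."""
    return [read for read in reads if len(read) >= min_length]


print(drop_short_reads(["ACGTACGT", "ACG", "ACGTA"], min_length=5))
```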