Trim Sequences
CLC Cancer Research Workbench offers a number of ways to trim your sequence reads prior to assembly and mapping, including adapter trimming, quality trimming and length trimming. For each original read, the regions of the sequence to be removed for each type of trimming operation are determined independently according to choices made in the trim dialogs. The types of trim operations that can be performed are:
- Quality trimming based on quality scores
- Ambiguity trimming to trim off e.g. stretches of Ns
- Adapter trimming
- Base trim to remove a specified number of bases from the 3' or 5' end of the reads
- Length trimming to remove reads shorter or longer than a specified threshold
At each end of the original read, only the trim operation that removes the largest region is performed. The other trim operations are ignored, as they would just remove part of the same region.
Note that this may occasionally expose an internal region in a read that has now become subject to trimming. In such cases, trimming may have to be done more than once.
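The rule described above can be illustrated with a minimal Python sketch. This is not the Workbench's implementation: the function names (ambiguity_trim, base_trim, combine_trims) are hypothetical, and the ambiguity trim is simplified to stripping leading and trailing Ns. The sketch only shows how trim regions are computed independently on the original read and how the largest region at each end is the one that takes effect.

```python
from typing import Callable, List, Tuple

# Hypothetical model: each trim operation inspects the ORIGINAL read and
# returns (n5, n3), the number of bases it would remove from each end.
TrimOp = Callable[[str], Tuple[int, int]]


def ambiguity_trim(read: str) -> Tuple[int, int]:
    """Simplified ambiguity trim: count leading and trailing Ns."""
    n5 = len(read) - len(read.lstrip("N"))
    n3 = len(read) - len(read.rstrip("N"))
    return n5, n3


def base_trim(n5: int, n3: int) -> TrimOp:
    """Base trim: always remove a fixed number of bases from each end."""
    return lambda read: (n5, n3)


def combine_trims(read: str, ops: List[TrimOp]) -> str:
    """Apply only the largest trim region at each end of the read."""
    regions = [op(read) for op in ops]  # each computed independently
    cut5 = max((r[0] for r in regions), default=0)
    cut3 = max((r[1] for r in regions), default=0)
    if cut5 + cut3 >= len(read):
        return ""  # nothing is left of the read
    return read[cut5:len(read) - cut3]


# Ambiguity trim proposes (2, 1), base trim proposes (1, 3);
# the larger region wins at each end, so 2 and 3 bases are removed.
first = combine_trims("NNACGTACGTN", [ambiguity_trim, base_trim(1, 3)])

# Removing one base exposes Ns that were internal in the original read,
# so a second trim pass is needed to remove them.
exposed = combine_trims("ANNCGT", [ambiguity_trim, base_trim(1, 0)])
second_pass = combine_trims(exposed, [ambiguity_trim])
```

In the second example, the Ns are not at the end of the original read, so the ambiguity trim finds nothing to remove in the first pass. They only become trimmable once the base trim has removed the leading base, which is why trimming may have to be run more than once.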
The result of the trim is a list of sequences that have passed the trim (referred to as the trimmed list below) and, optionally, a list of the sequences that have been discarded and a summary report. The original data will not be changed.
Adapters
If you are working with sequences that still have adapters present, they can be trimmed using the Trim Sequences tool provided in the "NGS Core tools" folder in the toolbox.
Illumina Adapters
If you have Illumina sequencing data that were generated with the new adapter sequences and have not been trimmed, or have been trimmed incompletely, the adapter sequences can be removed within the CLC Cancer Research Workbench. To do this, use the Trim Sequences tool, which is available in the Toolbox in the "Tools" section under Preparing Raw Data, together with the Illumina adapter sequences that can be found here:
http://support.illumina.com/downloads/illumina-customer-sequence-letter.html
To start trimming:
Toolbox | Tools | Preparing Raw Data | Trim Sequences
This opens a dialog where you can add sequences or sequence lists. If you add several sequence lists, each list will be processed separately and you will get a list of trimmed sequences for each input sequence list.
When the sequences are selected, click Next.