Adapter trimming
Clicking Next lets you specify the adapter trimming settings (figure 22.2).
Figure 22.2: Trimming your sequencing data for adapter sequences.
Select a trim adapter list that defines the adapters to use (see Creating a new Trim adapter list for how to create one).
You can specify whether the adapter trimming should be performed in Color space. This option is only available for sequencing data imported using the SOLiD import. When trimming in color space, the Smith-Waterman alignment is done using colors rather than bases. The adapter sequence is still entered in base space, and the Workbench infers the color codes from it. Note that the scoring thresholds apply to the color space alignment: a perfect match of 10 bases gets a score of 9, because 10 bases are represented by 9 color residues. Learn more about color space.
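The length difference between base space and color space can be illustrated with a minimal sketch. It assumes the commonly described SOLiD di-base encoding, in which each color is the XOR of the 2-bit codes of two adjacent bases. The function name `to_color_space` and the example adapter are hypothetical and not part of the Workbench.

```python
# 2-bit base codes commonly used to describe SOLiD di-base encoding (assumption).
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def to_color_space(bases: str) -> str:
    """Encode each pair of adjacent bases as one color (0-3) via XOR of their codes."""
    codes = [BASE_CODE[b] for b in bases.upper()]
    return "".join(str(a ^ b) for a, b in zip(codes, codes[1:]))


adapter = "ACGTTGCAAC"            # hypothetical 10-base adapter
colors = to_color_space(adapter)  # one color per adjacent base pair
print(colors, len(adapter), len(colors))
```

Because every color describes a transition between two neighboring bases, a sequence of n bases yields n - 1 colors, which is why a 10-base perfect match can align at most 9 color residues.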
Checking the Search on both strands checkbox searches both the plus and the minus strand for the adapter sequence. Note! If a match is found on the reverse strand, the Trim action will reverse complement the read before trimming and output the trimmed reverse complement. This option is intended for the removal of multiplexing barcodes and primers.
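The strand handling described above can be sketched as follows. This is an illustration only: it uses an exact substring search as a stand-in for the Smith-Waterman alignment the Workbench performs, and the names `reverse_complement` and `orient_read` are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def orient_read(read: str, adapter: str) -> tuple[str, bool]:
    """Return the read in the orientation where the adapter matches.

    The second element is True when the match was on the reverse strand,
    in which case the returned read is the reverse complement of the input.
    Exact matching is a simplification of the real alignment step.
    """
    if adapter in read:
        return read, False
    flipped = reverse_complement(read)
    if adapter in flipped:
        return flipped, True
    return read, False  # no match on either strand


oriented, was_flipped = orient_read("GGGTACGTAA", "ACGTAC")
print(oriented, was_flipped)
```

Trimming would then be applied to the oriented read, so reads matched on the reverse strand come out as trimmed reverse complements.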
Below the settings you will find a preview listing the results of trimming with the current settings on 1000 reads from the input file (reads 1001-2000 when the read file is long enough). This gives quick feedback on how changes in the parameters affect the trimming, so you do not have to run the full analysis several times to identify a good parameter set. The following information is shown:
- Name. The name of the adapter.
- Matches found. Number of matches found based on the strand and alignment score settings.
- Reads discarded. The number of reads that will be completely discarded. A read is discarded either because it is completely trimmed (when the Action is set to Remove adapter and the match is found at the 3' end of the read), or because the Action is set to Discard when found or Discard when not found.
- Nucleotides removed. The number of nucleotides that are trimmed. This includes both the nucleotides from reads that are discarded and the nucleotides from the parts of reads that are trimmed off.
- Avg. length. The average length of the reads that are retained (excluding the ones that are discarded).
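How the preview figures relate to each other can be sketched with a small calculation for a single adapter. The `TrimResult` record and the `summarize` function are hypothetical and only mirror the definitions in the list above; they do not reflect how the Workbench computes the preview internally.

```python
from dataclasses import dataclass


@dataclass
class TrimResult:
    original_length: int  # read length before trimming
    trimmed_length: int   # read length after trimming; 0 means the read was discarded
    matched: bool         # whether the adapter was found in this read


def summarize(results: list[TrimResult]) -> dict:
    """Compute the four preview values for one adapter."""
    retained = [r.trimmed_length for r in results if r.trimmed_length > 0]
    return {
        "matches_found": sum(r.matched for r in results),
        "reads_discarded": sum(r.trimmed_length == 0 for r in results),
        # Counts nucleotides from discarded reads as well as trimmed-off parts.
        "nucleotides_removed": sum(r.original_length - r.trimmed_length for r in results),
        # The average is taken over retained reads only.
        "avg_length": sum(retained) / len(retained) if retained else 0.0,
    }


preview = summarize([
    TrimResult(50, 50, False),  # no match, read kept as-is
    TrimResult(50, 30, True),   # 20 nucleotides trimmed off
    TrimResult(50, 0, True),    # read discarded entirely
])
print(preview)
```

In this example the discarded read contributes all 50 of its nucleotides to the removed count but is left out of the average length.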