Running the duplicate reads removal

The tool is found in the Toolbox:

        Toolbox | Resequencing Analysis | Remove Duplicate Mapped Reads

This opens a dialog where you can select mapping results in reads track format. Clicking Next allows you to set the threshold parameter, as displayed in figure 27.44.

Image removemappedduplicatereads-step2
Figure 27.44: Setting the stringency for merging similar reads.

The parameter is explained in detail in Algorithm details and parameters.

Clicking Next will reveal the output options. The main output is a list of the reads that remain after the duplicates have been removed. In addition, you can get the following output:

List of duplicate sequences: These are the sequences that have been removed.
Report: This is a brief summary report with the number of reads that have been removed (see an example in figure 27.45).
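
To illustrate how the three outputs relate to each other, the following is a minimal Python sketch, not the tool's actual algorithm. It assumes a hypothetical Read record and treats two reads as duplicates only when their start position, strand and sequence match exactly. The tool itself merges similar reads according to the stringency threshold, which this sketch does not attempt to reproduce.

```python
from collections import namedtuple

# Hypothetical minimal read record for illustration; not a Workbench data type.
Read = namedtuple("Read", ["name", "start", "strand", "sequence"])


def split_duplicates(reads):
    """Split reads into (kept, duplicates, report).

    Assumption: reads are duplicates only if start, strand and sequence
    match exactly. The first read seen for each key is kept.
    """
    seen = set()
    kept, duplicates = [], []
    for read in reads:
        key = (read.start, read.strand, read.sequence)
        if key in seen:
            duplicates.append(read)   # would go to "List of duplicate sequences"
        else:
            seen.add(key)
            kept.append(read)         # would go to the main output
    # Summary counts, analogous to the optional report
    report = {
        "total_reads": len(reads),
        "kept_reads": len(kept),
        "removed_reads": len(duplicates),
    }
    return kept, duplicates, report


reads = [
    Read("r1", 100, "+", "ACGTACGT"),
    Read("r2", 100, "+", "ACGTACGT"),  # exact duplicate of r1
    Read("r3", 100, "-", "ACGTACGT"),  # same start, opposite strand: kept
    Read("r4", 250, "+", "TTGACCAA"),
]
kept, duplicates, report = split_duplicates(reads)
print(report)
```

In this sketch, every input read ends up in exactly one of the two lists, so the kept and removed counts in the report always add up to the total.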

Note! The Remove Duplicate Mapped Reads tool may be run before or after local realignment. The order in which these two tools are run should make little, if any, difference.

Image duplicate-mapped-reads-removal-report
Figure 27.45: Summary statistics on the duplicate mapped reads.