QC for Read Mapping

The QC for Read Mapping tool creates a detailed report on a read mapping or a de novo assembly.

To create a detailed mapping report:

        Toolbox | Resequencing Analysis (Image resequencing) | Quality Control (Image quality_control_closed_16_h_p) | QC for Read Mapping (Image proteinreport)

This opens a dialog where you can select mapping results (Image contig) / (Image multicontig) / (Image read_track_16_n_p), simple contigs from a de novo assembly, or RNA-Seq analysis results (Image rnaseq).

In the next wizard step, "Set contig group" (figure 27.11), you can set thresholds for grouping long and short contigs. Thresholds can be specified only when the input consists of simple contigs or stand-alone mappings generated by a tool that works with de novo assemblies. The options are disabled and greyed out if you are working with a read mapping or an RNA-Seq analysis mapping.

Image contig_report_step2
Figure 27.11: Parameters for mapping reports.

The grouping is used to show statistics (e.g., number of contigs, mean length) for the contigs in each group. Note that by default the de novo assembly tool in the CLC Genomics Workbench only reports contigs longer than 200 bp (this can be changed when running the assembly).
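As a rough illustration of per-group statistics, the following Python sketch computes the number of contigs and their mean length for two groups. It is not the Workbench's implementation. The function name `contig_group_stats`, the rule that a group contains every contig at or above a minimum length, the threshold values, and the contig lengths are all assumptions made for the example.

```python
from statistics import mean


def contig_group_stats(lengths, min_length):
    """Summarize contigs whose length is at least min_length.

    The 'at least min_length' rule is an assumption for this sketch,
    not the documented grouping rule of the tool.
    """
    group = [n for n in lengths if n >= min_length]
    return {
        "count": len(group),
        "mean_length": mean(group) if group else 0.0,
        "total_length": sum(group),
    }


# Made-up contig lengths in bp, for illustration only.
lengths = [250, 480, 1200, 3500, 900]

# Hypothetical thresholds; in the tool they are set in the wizard.
short_stats = contig_group_stats(lengths, min_length=200)
long_stats = contig_group_stats(lengths, min_length=1000)

print("short group:", short_stats)
print("long group:", long_stats)
```

With these example values, all five contigs fall in the group with the lower threshold, and only the two longest contigs fall in the group with the higher threshold.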

In the last dialog (figure 27.12), you can check "Create table with statistics for each mapping" to create a table showing detailed statistics for each reference sequence. For de novo results the contigs act as reference sequences, so the table contains one row per contig.

Image contig_report_step3
Figure 27.12: Result handling options.

The first section of the detailed report is a summary of the statistics: reference count, type, total reference length, GC content in %, total read count, mean read length, and total read length.
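To make the summary values concrete, here is a minimal Python sketch that derives most of them from a list of reference sequences and a list of read lengths. It only illustrates the arithmetic and is not how the Workbench computes the report. The function name `mapping_summary`, the dictionary keys, and the input data are hypothetical, and the "type" field is left out.

```python
def mapping_summary(references, read_lengths):
    """Compute summary values from reference sequences and read lengths.

    references: list of nucleotide strings (hypothetical input format).
    read_lengths: list of read lengths in bp.
    """
    total_ref_len = sum(len(seq) for seq in references)
    # Count G and C bases, ignoring case.
    gc_bases = sum(
        seq.upper().count("G") + seq.upper().count("C") for seq in references
    )
    total_read_len = sum(read_lengths)
    read_count = len(read_lengths)
    return {
        "reference_count": len(references),
        "total_reference_length": total_ref_len,
        "gc_percent": 100.0 * gc_bases / total_ref_len if total_ref_len else 0.0,
        "total_read_count": read_count,
        "mean_read_length": total_read_len / read_count if read_count else 0.0,
        "total_read_length": total_read_len,
    }


# Made-up example data, for illustration only.
summary = mapping_summary(["ACGTGGCC", "ATAT"], [100, 150, 50])
for name, value in summary.items():
    print(f"{name}: {value}")
```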

The rest of the report, as well as the optional statistics table, is described in the following sections.