Target Region Coverage Analysis

The Target Region Coverage Analysis tool makes it easy to evaluate and compare multiple samples with respect to a given coverage metric. The tool takes as input one or more per-region statistics tracks generated by QC for Targeted Sequencing and outputs a target region track providing statistics across the analyzed samples. In addition, an overlay annotation track (for example a gene track) can be provided to obtain a higher-level summary, where target regions are grouped based on overlap, and coverage statistics are calculated for each group.

The QC for Targeted Sequencing tool is described in the CLC Genomics Workbench manual: https://resources.qiagenbioinformatics.com/manuals/clcgenomicsworkbench/current/index.php?manual=QC_Targeted_Sequencing.html.

Running the tool

To launch Target Region Coverage Analysis, go to:

        Toolbox | Quality Control | Target Region Coverage Analysis

In the first dialog (figure 24.24), select one or more per-region statistics tracks produced by QC for Targeted Sequencing. All selected tracks must be based on the same target region track.

Figure 24.24: Select one or more per-region statistics tracks.

The next dialog allows you to configure the settings for this tool, as shown in figure 24.25 and described below.

Figure 24.25: Settings of the Target Region Coverage Analysis tool.


Output from Target Region Coverage Analysis

Two outputs are produced from the Target Region Coverage Analysis tool:

Target region coverage track

The target region coverage track includes the following annotations:

Annotation coverage track

The annotation coverage track provides combined statistics for target regions overlapping the same annotation region. For example, if the target regions correspond to exons and a gene track is selected as the annotation track, all exons within a gene are combined and statistics are reported per gene. For each sample, the metric values from overlapping target regions are combined into a single metric value. The selected metric dictates how values are combined: Min coverage values are combined by taking the minimum, Max coverage values by taking the maximum, and Mean coverage and GC% values as a weighted average, where each target region is weighted by its length. Median coverage values are combined by calculating the median of the per-region values. Note that this is different from calculating the median of all base position coverage values contained in the set of target regions.
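The combination rules can be illustrated with a minimal Python sketch. This is not the tool's implementation: the function name combine_metric, the (length, value) tuple layout, and the example numbers are all assumptions made for illustration. Only the metric names and the combination rules are taken from the description above.

```python
from statistics import median


def combine_metric(metric, regions):
    """Combine per-region values for one sample into a single value.

    `regions` is a list of (length_in_bases, value) tuples, one per target
    region overlapping the same annotation region (hypothetical layout).
    """
    values = [value for _, value in regions]
    if metric == "Min coverage":
        return min(values)
    if metric == "Max coverage":
        return max(values)
    if metric in ("Mean coverage", "GC%"):
        # Weighted average: each target region is weighted by its length.
        total_length = sum(length for length, _ in regions)
        return sum(length * value for length, value in regions) / total_length
    if metric == "Median coverage":
        # Median of the per-region values, not of all base positions.
        return median(values)
    raise ValueError(f"Unsupported metric: {metric}")


# Made-up example: three exons of one gene as (length, metric value).
exons = [(100, 10.0), (300, 30.0), (100, 50.0)]
weighted_mean = combine_metric("Mean coverage", exons)

# Made-up per-base coverage for two regions, to show why the median of
# per-region medians can differ from the median over all positions.
region_a = [10, 10, 10]
region_b = [20, 40, 40, 40, 40]
median_of_medians = combine_metric(
    "Median coverage",
    [(len(region_a), median(region_a)), (len(region_b), median(region_b))],
)
median_of_positions = median(region_a + region_b)
```

With these made-up numbers, the median of the per-region medians is 25.0, whereas the median over all eight base positions is 30.0, which illustrates the caveat about Median coverage.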

The annotation coverage track includes the following annotations: