The Copy Number Variant Detection tool

To start the Copy Number Variant Detection tool, click:

        Toolbox | Resequencing Analysis | Copy Number Variant Detection

Select the case read mapping and click Next.

You are now presented with choices regarding the data to use in the CNV prediction method, as shown in figure 26.22.

Image cnv_detection_step1
Figure 26.22: The first step of the CNV detection tool.

Target regions track
An annotation track containing the regions targeted in the experiment must be chosen. This track must not contain overlapping regions, or regions made up of several intervals, because the algorithm is designed to operate on simple genomic regions.
Control mappings
You must specify one or more read mappings, which the algorithm will use to create a baseline. For the best results, the controls should be matched to the case sample with respect to the most important experimental parameters, such as gender and technology. If you use non-matched controls, the CNVs reported by the algorithm may be less accurate.
Gene track
Optional: If you wish, you can provide a gene track, which will be used to produce gene-level output as well as CNV-level output.
Ignore non-specific matches
If checked, the algorithm will ignore any non-specifically mapped reads when counting the coverage in the targeted positions. Note: If you are interested in predicting CNVs in repetitive regions, this box should be unchecked.
Ignore broken pairs
If checked, the algorithm will ignore any broken paired reads when counting the coverage in the targeted positions.
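
The requirement that the target regions track must not contain overlapping regions can be checked before running the tool. The following is a minimal Python sketch of such a pre-check, not part of the tool itself. It assumes the regions have been exported as (chromosome, start, end) tuples with half-open coordinates (start inclusive, end exclusive); the function name find_overlapping_targets is hypothetical.

```python
from collections import defaultdict


def find_overlapping_targets(regions):
    """Return (chromosome, earlier_interval, later_interval) for each overlap.

    Assumption: regions is an iterable of (chromosome, start, end) tuples
    with half-open coordinates, so intervals that merely touch do not overlap.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in regions:
        by_chrom[chrom].append((start, end))

    overlaps = []
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        prev_start, prev_end = intervals[0]
        for start, end in intervals[1:]:
            if start < prev_end:
                overlaps.append((chrom, (prev_start, prev_end), (start, end)))
            # Keep comparing against the interval that reaches furthest right.
            if end > prev_end:
                prev_start, prev_end = start, end
    return overlaps


# Hypothetical example data: the first two chr1 targets overlap.
targets = [
    ("chr1", 100, 200),
    ("chr1", 150, 250),
    ("chr1", 300, 400),
    ("chr2", 100, 200),
]
print(find_overlapping_targets(targets))
# prints [('chr1', (100, 200), (150, 250))]
```

An empty result means that no two targets on the same chromosome overlap under the stated coordinate assumption.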

Image cnv_detection_step2
Figure 26.23: The second step of the CNV detection tool.

Click Next to set the parameters related to the target-level and region-level CNV detection, as shown in figure 26.23.

Threshold for significance
P-values lower than the threshold for significance will be considered "significant". The higher you set this value, the more CNVs will be predicted.
Minimum fold change, absolute value
You must specify the minimum fold change for a CNV call. If the absolute value of the fold change of a CNV is less than the value specified in this parameter, the CNV will be filtered from the results, even if it is otherwise statistically significant. For example, if a minimum fold change of 1.5 is chosen, the adjusted coverage of the CNV in the case sample must be at least 1.5 times higher or at least 1.5 times lower than the coverage in the baseline for it to pass the filtering step. If you do not want to filter on the fold change, enter 0.0 in this field. If your sample purity is less than 100%, you must take that into account when you adjust the fold change cutoff. This is described in more detail in section 26.8.1. Note: this value is used to filter the Region-level CNV track. The Target-level CNV track will always include full information for all targets.
Low coverage cutoff
If the average coverage of a target is below this value, the target is considered "low coverage". Low coverage targets are not used to set up the statistical models, and no p-values are calculated for them in the target-level CNV prediction.
Graining level
The graining level is used for the region-level CNV prediction. Coarser graining levels produce longer CNV calls and less noise, and the algorithm will run faster. However, smaller CNVs consisting of only a few targets may be missed at a coarser graining level.
  • Coarse: prefers CNVs consisting of many targets. The algorithm is most sensitive to CNVs spanning over 10 targets. This is the recommended setting if you expect large-scale deletions or insertions, and want a minimal false positive rate.
  • Intermediate: prefers CNVs consisting of an intermediate number of targets. The algorithm is most sensitive to CNVs spanning 5 or more targets. This is the recommended setting if you expect CNVs of intermediate size.
  • Fine: prefers CNVs consisting of fewer targets. The algorithm is most sensitive to CNVs spanning 3 or more targets. This is the recommended setting if you want to detect CNVs that span just a few targets, but the false positive rate may be increased.
Note: The CNV sizes listed above are meant as general guidelines, and are not to be interpreted as hard rules. Finer graining levels will produce larger CNVs when the signals for this are sufficiently clear in the data. Similarly, the coarser graining levels will also be able to predict shorter CNVs under some circumstances, although with a lower sensitivity.

Enhance single-target sensitivity
All of the graining levels assume that a CNV spans more than one target. If you are also interested in very small CNVs that affect down to a single target in your data, check the 'Enhance single-target sensitivity' box. This will increase the sensitivity of detection of very small CNVs, and has the greatest effect in the case of the coarser graining levels. Note however that these small CNV calls are much more likely to be false positives. If this box is unchecked, only larger CNVs supported by several targets will be reported, and the false positive rate will be lower.
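
The significance threshold, minimum fold change, and low coverage cutoff described above amount to simple comparisons. The Python sketch below illustrates that arithmetic only. It is not the tool's implementation, and all function and parameter names are hypothetical. It assumes positive coverage values, and it expresses a loss as a fold change of at least 1 so that gains and losses are compared symmetrically.

```python
def is_significant(p_value, threshold):
    """P-values lower than the threshold count as significant."""
    return p_value < threshold


def is_low_coverage(average_coverage, cutoff):
    """Targets with average coverage below the cutoff count as low coverage."""
    return average_coverage < cutoff


def passes_fold_change_filter(case_coverage, baseline_coverage, min_fold_change):
    """True if the case coverage is at least min_fold_change times higher
    or lower than the baseline coverage.

    Assumption: both coverages are positive, already adjusted values.
    """
    if case_coverage <= 0 or baseline_coverage <= 0:
        raise ValueError("coverage values must be positive")
    higher = max(case_coverage, baseline_coverage)
    lower = min(case_coverage, baseline_coverage)
    # Absolute fold change is always >= 1, so a cutoff of 0.0 filters nothing.
    return higher / lower >= min_fold_change


# With a minimum fold change of 1.5:
print(passes_fold_change_filter(160, 100, 1.5))  # gain of 1.6x
print(passes_fold_change_filter(60, 100, 1.5))   # loss of about 1.67x
print(passes_fold_change_filter(120, 100, 1.5))  # gain of only 1.2x
```

The first two calls return True and the third returns False. Because the absolute fold change computed this way is never below 1, entering 0.0 as the minimum disables the filter, which matches the behaviour described for that field.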

Click Next to choose the results to output (see figure 26.24). In this step, you can create an algorithm report by checking the Create algorithm report box. You can also output results for every target in your input by checking the Create target-level CNV track box.

Image cnv_detection_savestep
Figure 26.24: Specifying whether an algorithm report and a target-level CNV track should be created.

When finished with the settings, click Next to start the algorithm.