Filter Against Control Reads

Running the variant caller separately on a case sample and a control sample, and then filtering away the variants found in the control data set, does not always give a satisfactory result, because many variants present in the control sample may not have been called. This is often due to a lack of read coverage in the corresponding regions or to overly stringent parameter settings. Instead of calling variants in the control sample, the Filter against Control Reads tool can therefore be used to check the control reads directly and remove variants that are also supported there from the set of candidate variants identified in the case sample.

        Toolbox | Resequencing | Compare Variants | Filter against Control Reads

The variant track from the case sample must be used as input. After clicking Next, you are asked to specify the number of reads in the control data set that must support the variant allele for it to count as a match (see figure 25.63). All variants for which at least this number of control reads show the particular allele are filtered away in the result track.

Please note that variants that have no coverage in the mapped control reads will also be reported. You can identify them by looking for a value of 0 in the column 'Control coverage'.

The following annotations will be added to each variant not found in the control data set:

Control count
For each allele, the number of control reads supporting the allele.
Control coverage
Read coverage in the control data set at the position where the allele has been identified in the case data set.
Control frequency
Percentage of reads supporting the allele in the control sample.
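
The relationship between the three annotations can be illustrated with a small Python sketch. This is not the tool's implementation: the function name `control_annotations` is hypothetical, and the value reported for 'Control frequency' when the control coverage is 0 is an assumption made here (0.0), since the text above does not state how the tool handles that case.

```python
def control_annotations(control_count: int, control_coverage: int) -> dict:
    """Derive the three control annotations for one allele (illustrative only)."""
    if control_count < 0 or control_coverage < 0:
        raise ValueError("counts must be non-negative")
    if control_count > control_coverage:
        raise ValueError("supporting reads cannot exceed coverage")

    # Control frequency is the percentage of control reads supporting the allele.
    # Assumption: with no control coverage, report 0.0 rather than divide by zero.
    if control_coverage == 0:
        frequency = 0.0
    else:
        frequency = 100.0 * control_count / control_coverage

    return {
        "Control count": control_count,
        "Control coverage": control_coverage,
        "Control frequency": frequency,
    }


# One supporting read out of 40 control reads at the position.
example = control_annotations(1, 40)
print(example["Control frequency"])
```

In this sketch, a variant with no coverage in the control reads gets 0 in all three fields, so 'Control coverage' is the field that distinguishes "not covered" from "covered but allele not seen".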

Figure 25.63: The control reads data set.

The filter option sets the threshold that determines which variants are kept. In the dialog shown in figure 25.63, the threshold is set to two. This means that a variant is kept if it is found in one or fewer of the control reads, and filtered away if two or more control reads support it.
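
The threshold rule can be sketched in Python as follows. This is only an illustration of the logic described above, not the tool's actual code: the `Variant` record, the function `filter_against_control`, and the example positions and counts are all hypothetical.

```python
from typing import List, NamedTuple


class Variant(NamedTuple):
    position: int
    allele: str
    control_count: int      # control reads supporting the allele
    control_coverage: int   # control reads covering the position


def filter_against_control(variants: List[Variant],
                           min_control_reads: int = 2) -> List[Variant]:
    """Keep variants supported by fewer than min_control_reads control reads."""
    return [v for v in variants if v.control_count < min_control_reads]


candidates = [
    Variant(1204, "A", 0, 0),    # no control coverage: kept and reported
    Variant(3310, "T", 1, 35),   # one control read: below threshold, kept
    Variant(5872, "G", 2, 40),   # two control reads: at threshold, removed
    Variant(9015, "C", 12, 38),  # clearly present in control, removed
]

kept = filter_against_control(candidates, min_control_reads=2)
for v in kept:
    print(v.position, v.allele, v.control_count, v.control_coverage)
```

With the threshold of two used in the dialog, only the first two hypothetical candidates remain. The first of these has a control coverage of 0, which is the case to inspect manually, since the absence of supporting reads there reflects missing data rather than evidence that the variant is absent from the control.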