Run the Identify Variants (TAS) workflow
- To run the Identify Variants (TAS) template workflow, go to:
Workflows | Template Workflows | Biomedical Workflows | Targeted Amplicon Sequencing | Somatic Cancer | Identify Variants (TAS)
- Select the sequencing reads from the sample that should be analyzed (figure 20.27).
Figure 20.27: Please select all sequencing reads from the sample to be analyzed.
If several samples should be analyzed, the tool has to be run in batch mode. This is done by checking "Batch" and selecting the folder that holds the data you wish to analyze.
- In the Target regions dialog you specify the target regions for your application (figure 20.28). The variant calling will be restricted to these regions.
Figure 20.28: Select the track with the targeted regions from your experiment.
- In the next dialog, you have to select which reference data set should be used to identify variants (figure 20.29).
Figure 20.29: Choose the relevant reference Data Set to identify variants in your sample.
- In the next wizard step (figure 20.30) you can specify the parameters for variant detection.
Figure 20.30: Specify the parameters for variant detection.
- In the next wizard step (figure 20.31) you specify the parameters for the QC reporting on the targeted regions.
Figure 20.31: Specify minimum coverage for the QC reporting on the targeted regions.
- In the last wizard step you can check the selected settings by clicking on the button labeled Preview All Parameters.
In the Preview All Parameters wizard you can only check the settings, and if you wish to make changes you have to use the Previous button from the wizard to edit parameters in the relevant windows.
- Choose to Save your results and click Finish.
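The effect of the target regions selected in the steps above can be pictured with a small example. This is only a conceptual sketch and not how the Workbench implements the restriction: the names `Region`, `Variant` and `restrict_to_targets` are hypothetical, and coordinates are assumed to be 1-based and inclusive.

```python
from typing import List, NamedTuple


class Region(NamedTuple):
    """Hypothetical target region (1-based, inclusive coordinates assumed)."""
    chrom: str
    start: int
    end: int


class Variant(NamedTuple):
    """Hypothetical minimal variant record."""
    chrom: str
    position: int
    ref: str
    alt: str


def in_target(variant: Variant, regions: List[Region]) -> bool:
    """Return True if the variant position lies inside any target region."""
    return any(
        r.chrom == variant.chrom and r.start <= variant.position <= r.end
        for r in regions
    )


def restrict_to_targets(variants: List[Variant], regions: List[Region]) -> List[Variant]:
    """Keep only the variants that fall within the target regions."""
    return [v for v in variants if in_target(v, regions)]


# Example data, invented for illustration only.
regions = [Region("chr1", 100, 200)]
variants = [
    Variant("chr1", 150, "A", "T"),  # inside the target region
    Variant("chr1", 250, "G", "C"),  # same chromosome, outside the region
    Variant("chr2", 150, "C", "G"),  # different chromosome
]
kept = restrict_to_targets(variants, regions)
```

In this example only the first variant is kept, because it is the only one located within a target region.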
Output from the Identify Variants (TAS) workflow
The Identify Variants (TAS) workflow produces five different types of output:
- Read Mapping: The mapped sequencing reads. The reads are shown in different colors depending on their orientation, whether they are single reads or paired reads, and whether they map unambiguously (see http://resources.qiagenbioinformatics.com/manuals/clcgenomicsworkbench/current/index.php?manual=Coloring_mapped_reads.html).
- Target Regions Coverage: The target regions coverage track shows the coverage of the targeted regions. Detailed information about coverage and read count can be found in the table format, which can be opened by pressing the table icon found in the lower left corner of the View Area.
- Target Regions Coverage Report: The report consists of a number of tables and graphs that in different ways provide information about the targeted regions.
- Identified Variants and Indels Indirect Evidence: A variant track containing the variants identified by the Low Frequency Variant Detection tool, and a variant track with the indels inferred from indirect evidence by the Structural Variant Caller. The variants can be shown in track format or in table format. When holding the mouse over the detected variants in the Track List, a tooltip appears with information about the individual variants. You will have to zoom in on the variants to be able to see the detailed tooltip.
- Genome Browser View Identify Variants: A collection of tracks presented together. Shows the annotated variant track together with the human reference sequence, genes, transcripts, coding regions, the mapped reads, the identified variants, and the structural variants (see figure 20.32).
It is important that you do not delete any of the produced files individually as some of the outputs are linked to other outputs. If you would like to delete the outputs, please always delete all of them at the same time.
First have a look at the mapping report to see if the coverage is sufficient in the regions of interest (e.g. > 30x). Furthermore, check that at least 90% of the reads are mapped to the human reference sequence. In case of a targeted experiment, also check that the majority of the reads map to the targeted regions.
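If many samples are analyzed in batch mode, the checks above can be written down as a small script. The following is a minimal sketch and not part of the Workbench: the class `MappingSummary` and the function `qc_warnings` are hypothetical, the numbers are assumed to be copied by hand from the reports, and "majority" is interpreted here as more than half of the mapped reads.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MappingSummary:
    """Hypothetical container for values read off the reports by hand."""
    total_reads: int
    mapped_reads: int
    reads_in_targets: int
    min_target_coverage: float


def qc_warnings(
    summary: MappingSummary,
    min_coverage: float = 30.0,         # example threshold from the text
    min_mapped_fraction: float = 0.90,  # "at least 90% of reads are mapped"
) -> List[str]:
    """Return one warning per failed check; an empty list means all passed."""
    if summary.total_reads <= 0:
        raise ValueError("total_reads must be positive")

    warnings = []

    # Coverage in the regions of interest should exceed the threshold.
    if summary.min_target_coverage <= min_coverage:
        warnings.append("coverage in regions of interest is too low")

    # At least the given fraction of all reads should be mapped.
    if summary.mapped_reads / summary.total_reads < min_mapped_fraction:
        warnings.append("too few reads mapped to the reference")

    # Assumption: "majority" means more than half of the mapped reads.
    on_target = (
        summary.reads_in_targets / summary.mapped_reads
        if summary.mapped_reads > 0
        else 0.0
    )
    if on_target <= 0.5:
        warnings.append("most mapped reads fall outside the targeted regions")

    return warnings


# Invented example values, for illustration only.
good_sample = MappingSummary(1000, 950, 800, 45.0)
poor_sample = MappingSummary(1000, 800, 300, 20.0)
```

Calling `qc_warnings(good_sample)` returns an empty list, while `qc_warnings(poor_sample)` returns one warning for each of the three checks. Adjust the thresholds to the requirements of your own experiment.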
Afterwards, please open the Genome Browser View file (see figure 20.32).
The Genome Browser View includes the track of identified variants in context to the human reference sequence, genes, transcripts, coding regions, targeted regions and mapped sequencing reads.
Figure 20.32: The Genome Browser View allows you to inspect the identified variants in the context of the human genome.
Open the variant track as a table to see information about all identified variants (see figure 20.33).
Figure 20.33: Genome Browser View with an open track table to inspect identified variants more closely in the context of the human genome.