Introduction to the Identify QIAseq DNA Variants workflows
The Identify QIAseq DNA Variants template workflows are optimized for somatic or germline applications using Illumina or Ion Torrent reads.
Two types of panels are available for QIAseq Targeted DNA analysis: QIAseq Targeted DNA panels and QIAseq Targeted DNA Pro panels. The read structure differs between the two types of panels, so it is important to choose the correct workflow to allow proper trimming and UMI grouping of the reads. Panel IDs for QIAseq Targeted DNA applications start with DHS or CDHS, whereas panel IDs for QIAseq Targeted DNA Pro applications start with PHS or CPHS.
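The prefix rule can be sketched as a small lookup. This is only an illustration and not part of the workflows: the function name and the example panel IDs are hypothetical, and only the four prefixes come from the text above.

```python
def panel_type_from_id(panel_id: str) -> str:
    """Map a QIAseq panel ID to its panel type using the ID prefix."""
    normalized = panel_id.strip().upper()
    # Prefixes as stated in the text: PHS/CPHS for Pro panels, DHS/CDHS otherwise.
    if normalized.startswith(("PHS", "CPHS")):
        return "QIAseq Targeted DNA Pro"
    if normalized.startswith(("DHS", "CDHS")):
        return "QIAseq Targeted DNA"
    raise ValueError(f"Unrecognized panel ID prefix: {panel_id!r}")


# Hypothetical panel IDs, used only to exercise the function.
for example_id in ("DHS-0001", "CDHS-0001", "PHS-0001", "CPHS-0001"):
    print(example_id, "->", panel_type_from_id(example_id))
```

An ID with any other prefix raises an error rather than defaulting to one panel type, since choosing the wrong workflow affects trimming and UMI grouping.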
The workflows handling the two types of QIAseq panels are very similar, but default tool settings and the order of tools in the variant filtering cascades differ.
- General differences between QIAseq DNA and QIAseq DNA Pro analysis workflows:
  - A number of settings in the two tools Remove and Annotate with Unique Molecular Index and Trim Reads differ, as they have been set up to handle reads from the relevant type of QIAseq panel appropriately.
  - In Pro workflows, an additional base after the primer is unaligned.
  - QIAseq DNA panels are designed against hg19, whereas QIAseq DNA Pro panels are designed against hg38. Consequently, using default settings, reads are mapped to hg19 or hg38, as relevant. In Pro workflows, it is possible to mask regions that are potentially false duplications using the GenomeReferenceConsortium_masking_hg38_no_alt_analysis_set masking track during read mapping. Read about the masking track here: http://genomeref.blogspot.com/2021/07/one-of-these-things-doest-belong.html.
- Differences between Illumina QIAseq DNA and Illumina QIAseq DNA Pro analysis workflows:
  - In QIAseq DNA workflows, the minimum read length after trimming is set to 20. This has been increased to 40 in the QIAseq DNA Pro workflows.
  - The filtering cascades used for germline variant filtering vary widely between QIAseq DNA and QIAseq DNA Pro analysis workflows. Whereas the QIAseq DNA workflow has an extensive series of filtering steps, the QIAseq DNA Pro workflow has a relatively simple filtering cascade.
- Differences between Ion Torrent QIAseq DNA and Ion Torrent QIAseq DNA Pro analysis workflows:
  - In QIAseq DNA workflows, the mismatch cost and the insertion/deletion open and extend costs in Map Reads to References are 2, 6, 1, respectively. These have been increased to 6, 8, 2, respectively, in the QIAseq DNA Pro workflows.
  - In QIAseq DNA workflows, the Minimum supporting consensus fraction in Create UMI Reads from Grouped Reads is 0.0. This has been increased to 0.5 in the QIAseq DNA Pro workflows.
  - In workflows for somatic variant calling, the variant frequency in Remove False Positives is set to 0.5 in the QIAseq DNA workflow and 2 in the QIAseq DNA Pro workflow.
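For quick comparison, the numeric defaults listed above can be collected in a lookup table. The sketch below is a summary aid only: it does not read or change settings in the Workbench, and the dictionary keys and function names are hypothetical labels chosen for this example. The Remove False Positives value is placed in the Ion Torrent group, following the order of the list above.

```python
# Reference genome per panel type, as stated in the general differences.
REFERENCE_GENOME = {"QIAseq DNA": "hg19", "QIAseq DNA Pro": "hg38"}

# Default values quoted in the list above. Key names are hypothetical labels,
# not identifiers used by the Workbench.
DEFAULTS = {
    "Illumina": {
        "minimum_read_length_after_trimming": {"QIAseq DNA": 20, "QIAseq DNA Pro": 40},
    },
    "Ion Torrent": {
        "mismatch_cost": {"QIAseq DNA": 2, "QIAseq DNA Pro": 6},
        "indel_open_cost": {"QIAseq DNA": 6, "QIAseq DNA Pro": 8},
        "indel_extend_cost": {"QIAseq DNA": 1, "QIAseq DNA Pro": 2},
        "minimum_supporting_consensus_fraction": {"QIAseq DNA": 0.0, "QIAseq DNA Pro": 0.5},
        # Applies to the somatic workflows only.
        "remove_false_positives_variant_frequency": {"QIAseq DNA": 0.5, "QIAseq DNA Pro": 2},
    },
}


def default_for(platform: str, setting: str, panel_type: str):
    """Return the documented default for one platform, setting, and panel type."""
    return DEFAULTS[platform][setting][panel_type]


def print_comparison(platform: str) -> None:
    """Print the QIAseq DNA and QIAseq DNA Pro defaults side by side."""
    for setting, values in DEFAULTS[platform].items():
        print(f"{setting}: DNA={values['QIAseq DNA']}, Pro={values['QIAseq DNA Pro']}")


print_comparison("Ion Torrent")
```

Settings that the list describes only qualitatively, such as the UMI and trimming options or the order of the filtering cascades, are left out because no values are given for them.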
In the following, Identify QIAseq DNA and Identify QIAseq DNA Pro workflows are described together and are only mentioned specifically when there is a relevant difference.
To support QIAseq Targeted DNA analysis, the following workflows are available:
- Identify QIAseq DNA Somatic Variants (Illumina)
- Identify QIAseq DNA Somatic Variants (Ion Torrent)
- Identify QIAseq DNA Germline Variants (Illumina)
- Identify QIAseq DNA Germline Variants (Ion Torrent)
- Identify QIAseq DNA Somatic and Germline Variants from Tumor Normal Pair (Illumina)
To support QIAseq Targeted DNA Pro analysis, the following workflows are available:
- Identify QIAseq DNA Pro Somatic Variants (Illumina)
- Identify QIAseq DNA Pro Somatic Variants (Ion Torrent)
- Identify QIAseq DNA Pro Germline Variants (Illumina)
- Identify QIAseq DNA Pro Germline Variants (Ion Torrent)
Note that the Identify QIAseq DNA Somatic and Germline Variants from Tumor Normal Pair (Illumina) workflow differs from the other QIAseq DNA workflows by calling both somatic and germline variants in the same workflow. It is described separately in Identify QIAseq DNA Somatic and Germline Variants from Tumor Normal Pair (Illumina).
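The workflow names listed above follow a regular pattern of panel type, application, and sequencing technology. As a minimal sketch, assuming that pattern holds for the names exactly as listed, the names can be composed as follows. The function and constant names are hypothetical and the code does not interact with the Workbench.

```python
PANEL_TYPES = ("QIAseq DNA", "QIAseq DNA Pro")
APPLICATIONS = ("Somatic", "Germline")
PLATFORMS = ("Illumina", "Ion Torrent")

# The tumor normal workflow is only listed for QIAseq DNA panels and Illumina reads.
TUMOR_NORMAL_WORKFLOW = (
    "Identify QIAseq DNA Somatic and Germline Variants from Tumor Normal Pair (Illumina)"
)


def workflow_name(panel_type: str, application: str, platform: str) -> str:
    """Compose the name of a single-sample template workflow."""
    if panel_type not in PANEL_TYPES:
        raise ValueError(f"Unknown panel type: {panel_type!r}")
    if application not in APPLICATIONS:
        raise ValueError(f"Unknown application: {application!r}")
    if platform not in PLATFORMS:
        raise ValueError(f"Unknown platform: {platform!r}")
    return f"Identify {panel_type} {application} Variants ({platform})"


# All nine workflows named in the lists above.
ALL_WORKFLOWS = [
    workflow_name(panel, application, platform)
    for panel in PANEL_TYPES
    for application in APPLICATIONS
    for platform in PLATFORMS
] + [TUMOR_NORMAL_WORKFLOW]

for name in ALL_WORKFLOWS:
    print(name)
```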
Somatic/germline specificity: For somatic variant detection, the template workflow uses the Low Frequency Variant Detection tool, a variant caller that does not base its statistical model on a bi-allelic assumption. This variant caller will thus declare a site heterozygous if it detects more than one allele at that site, even if one of the alleles is detected at very low frequency and later filtered out. For germline applications, the workflows use the Fixed Ploidy Variant Detection tool. This variant caller has higher precision than the Low Frequency Variant Detection tool, particularly at low to moderate levels of coverage (<30x). At high levels of coverage (>100x), the Fixed Ploidy Variant Detection tool will exhibit low sensitivity for variants with allele frequencies far from what is expected for germline variants (that is, 50% or 100%). For more information about the variant callers, please see: http://resources.qiagenbioinformatics.com/manuals/clcgenomicsworkbench/current/index.php?manual=Low_Frequency_Variant_Detection.html and http://resources.qiagenbioinformatics.com/manuals/clcgenomicsworkbench/current/index.php?manual=Fixed_Ploidy_Variant_Detection.html.
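The choice of variant caller, and what "far from what is expected for germline variants" means in practice, can be illustrated with a short sketch. This is not the statistical model used by either tool. It only records which caller each type of workflow uses and measures how far an observed allele frequency lies from the nearest germline expectation of 50% or 100%. All names in the code are hypothetical.

```python
# Variant caller used by each type of workflow, as described above.
VARIANT_CALLER = {
    "somatic": "Low Frequency Variant Detection",
    "germline": "Fixed Ploidy Variant Detection",
}

# Allele frequencies (in percent) expected for germline variants.
GERMLINE_EXPECTED_PERCENT = (50.0, 100.0)


def distance_from_germline_expectation(frequency_percent: float) -> float:
    """Return the distance (in percentage points) to the nearest expected frequency."""
    if not 0.0 <= frequency_percent <= 100.0:
        raise ValueError("Allele frequency must be between 0 and 100 percent")
    return min(abs(frequency_percent - expected) for expected in GERMLINE_EXPECTED_PERCENT)


# Illustrative frequencies only: a near-heterozygous variant and a low frequency one.
for frequency in (48.0, 5.0):
    distance = distance_from_germline_expectation(frequency)
    print(f"{frequency}% is {distance} percentage points from the nearest germline expectation")
```

A variant observed at 5% lies far from both expected frequencies. This is the kind of variant for which the somatic workflows rely on the Low Frequency Variant Detection tool instead.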
Illumina/Ion Torrent specificity: The filtering strategy differs between the workflows for the two sequencing technologies. Among other differences, the workflow for Ion Torrent data includes an extra step that removes non-SNV variants that are likely due to artifacts.
In each case, the parameter values applied as defaults have been optimized for high sensitivity and specificity when detecting variants.
The following description applies to the Identify QIAseq DNA (Pro) Variants template workflows optimized for calling either somatic or germline variants:
The QIAseq DNA workflows use the Reference Data set QIAseq DNA Panels hg19 whereas the QIAseq DNA Pro workflows use QIAseq DNA Pro Panels hg38. Before starting one of the workflows for the first time, open the Reference Data Manager and select and download the relevant reference data set if you have not already done so.