Analyze QIAseq DNA and QIAseq DNA Pro
The template workflows described here can be used for analyzing DNA reads generated using a QIAseq Targeted DNA or QIAseq Targeted DNA Pro panel.
The following workflows are available:
- For analysis of QIAseq Targeted DNA data:
- Analyze QIAseq DNA Germline (Illumina)
- Analyze QIAseq DNA Germline (Ion Torrent)
- Analyze QIAseq DNA Somatic (Illumina)
- Analyze QIAseq DNA Somatic (Ion Torrent)
- For analysis of QIAseq Targeted DNA Pro data:
- Analyze QIAseq DNA Pro Germline (Illumina)
- Analyze QIAseq DNA Pro Germline (Ion Torrent)
- Analyze QIAseq DNA Pro Somatic (Illumina)
- Analyze QIAseq DNA Pro Somatic (Ion Torrent)
In the sections that follow, the workflows are referenced individually only where relevant differences occur.
The workflows include all necessary steps for processing and analyzing the reads:
- Target coverage statistics are generated using QC for Targeted Sequencing.
- UMIs are removed using Remove and Annotate with Unique Molecular Index.
- Reads are trimmed using Trim Reads.
- Reads are mapped to the human reference genome using Map Reads to Reference.
- UMI reads are created using Calculate Unique Molecular Index Groups and Create UMI Reads from Grouped Reads.
- A guidance track is generated from the mapped reads using Structural Variant Caller.
- An improved mapping is obtained by realigning the mapped reads using the guidance track and Local Realignment.
- Primers are trimmed in the improved mapping using Trim Primers of Mapped Reads.
- Variants are detected in the relevant target regions from the improved mapping using Low Frequency Variant Detection for somatic workflows and Fixed Ploidy Variant Detection for germline workflows.
- The variants are annotated with various information, such as overlap with repeat/homopolymer regions and genes, and are subsequently filtered to remove likely artifacts using Filter on Custom Criteria.
- CNVs are optionally detected from the improved mapping using Copy Number Variant Detection (Targeted).
- A TMB score is optionally calculated using Calculate TMB Score.
- MSI status is optionally detected using Detect MSI Status.
- Regional ploidy and loss-of-heterozygosity are optionally detected using Detect Regional Ploidy.
- A homologous recombination deficiency score is optionally calculated using Calculate HRD Score.
- A summary report is created using Create Sample Report.
- For somatic workflows, a report is optionally prepared for uploading to QCI Interpret using Prepare QCI Interpret Upload.
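As a reading aid, the core steps above can be written down as plain data. The following Python sketch is illustrative only: it is not a CLC API, and the names `CORE_STEPS`, `variant_caller`, and `steps_for` are hypothetical. It lists the tools in the order they appear above, which need not be the exact execution order inside the workflow, and it records which variant detection tool the list associates with each application. Optional steps (CNV, TMB, MSI, LOH, HRD, QCI Interpret) are left out.

```python
# Illustrative only: plain data mirroring the step list above, not a CLC API.
CORE_STEPS = [
    "QC for Targeted Sequencing",
    "Remove and Annotate with Unique Molecular Index",
    "Trim Reads",
    "Map Reads to Reference",
    "Calculate Unique Molecular Index Groups",
    "Create UMI Reads from Grouped Reads",
    "Structural Variant Caller",
    "Local Realignment",
    "Trim Primers of Mapped Reads",
]


def variant_caller(application: str) -> str:
    """Return the variant detection tool listed for 'somatic' or 'germline'."""
    callers = {
        "somatic": "Low Frequency Variant Detection",
        "germline": "Fixed Ploidy Variant Detection",
    }
    try:
        return callers[application.lower()]
    except KeyError:
        raise ValueError(f"unknown application: {application!r}") from None


def steps_for(application: str) -> list:
    """Core steps followed by variant detection, filtering, and reporting."""
    return CORE_STEPS + [
        variant_caller(application),
        "Filter on Custom Criteria",
        "Create Sample Report",
    ]


if __name__ == "__main__":
    for step in steps_for("somatic"):
        print(step)
```

Calling `steps_for("germline")` gives the same list with Fixed Ploidy Variant Detection in place of Low Frequency Variant Detection.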
Launching the workflow
To run these workflows, go to:
Workflows | Template Workflows | Biomedical Workflows | QIAseq Sample Analysis | QIAseq DNA workflows
and select:
Analyze QIAseq DNA Somatic/Germline (Illumina/Ion Torrent)
Analyze QIAseq DNA Pro Somatic/Germline (Illumina/Ion Torrent)
See Launching workflows individually and in batches for general information.
Options can be configured in the following wizard steps:
- Choose where to run. If you are connected to a CLC Server via the CLC Workbench, you will be asked where you would like to run the analysis. We recommend that you run the analysis on a CLC Server when possible.
- Specify workflow path. The following paths can be configured:
- Analysis Options. Specify whether to:
- Analyze sample: Detect variants and optionally perform additional analyses described below.
- Create CNV control: Create coverage table from a normal sample.
- Analyze sample and create CNV control: Perform both.
- Detect CNVs. Detect copy number variants. This requires CNV controls (i.e. normal samples).
- Calculate TMB, for DNA Somatic workflows. Calculate a tumor mutation burden score. This is only recommended for panels covering at least 667 kb [Vega et al., 2021]. Only the DHS-8800Z Human TMB and MSI Panel meets this specification.
- Detect MSI, for somatic workflows. Detect microsatellite instability status. This requires an MSI baseline generated from normal samples.
This is only recommended for data generated using the following panels:
- DHS-8800Z Human TMB and MSI Panel
- PHS-001Z Breast Cancer Research Panel
- PHS-002Z Colorectal Cancer Research Panel
- PHS-101Z Breast Cancer Focus Panel
- PHS-102Z Colorectal Cancer Focus Panel
- PHS-202Z Hereditary Colorectal Cancer Panel
- PHS-205Z Hereditary Pancreatic Cancer Panel
- PHS-3000Z Comprehensive Cancer Research Panel
- PHS-3100Z Comprehensive Cancer Focus Panel
- PHS-3200Z Comprehensive Hereditary Cancer Research Panel
- Detect LOH, for DNA Pro Somatic (Illumina) workflows. Detect regional ploidy and loss-of-heterozygosity. This requires that CNVs are also detected.
This is only recommended for data generated using the following panels:
- PHS-004Z Brain Cancer Research Panel
- PHS-104Z Brain Cancer Focus Panel
- PHS-3000Z Comprehensive Cancer Research Panel
- PHS-3100Z Comprehensive Cancer Focus Panel
- Detect LOH and HRD, for DNA Somatic (Illumina) workflows. Detect regional ploidy and loss-of-heterozygosity, and calculate a homologous recombination deficiency score. This requires that CNVs are also detected.
- Prepare for QCI Interpret, for somatic workflows. Prepare a report that can later be uploaded to QCI Interpret or QCI Interpret Translational.
- Select Reads. Select the input reads. When analyzing more than one sample at a time, check Batch in the lower left corner of the dialog.
- Specify reference data handling. Select the relevant Reference Data Set; see Reference data management for details. For QIAseq Targeted DNA workflows, QIAseq DNA Panels hg19 or QIAseq TMB Panels hg38 will be pre-selected, depending on the panel, whereas for QIAseq Targeted DNA Pro workflows, QIAseq DNA Pro Panels hg38 will be pre-selected.
- Configure batching, if running the workflow in batch mode. Define the batch units.
- Batch overview, if running the workflow in batch mode. Verify that the batching is as intended.
- Target regions. Choose the relevant target regions from the drop-down list. If the data was produced using a panel that is not available in the list, see QIAseq custom panels.
- Target primers. Choose the relevant primers from the drop-down list.
- Mispriming events. Optionally select a mispriming events track to improve primer trimming. The track can be generated using Identify Mispriming Events.
- Gene-pseudogene track. Optionally select a track with gene-pseudogene pairs to improve primer trimming.
- MSI baseline, if detecting MSI.
Select an MSI baseline. If none is provided, MSI detection will not be performed.
For best results, the baseline should match the sample in key experimental parameters, such as sequencing technology. For more details and a list of baselines available in the Reference Data Manager, see Detect MSI Status.
- Map Reads to Reference. Configure masking. By default, regions defined by the Genome Reference Consortium are excluded to remove false duplications, including one involving U2AF1. This step is available for QIAseq DNA Pro workflows only.
- QC for Targeted Sequencing. Configure the Minimum coverage. Note that the default value for this tool depends on the application chosen (somatic or germline).
- Copy Number Variant Detection (Targeted), if detecting CNVs.
Specify Controls for detecting CNVs. If none are provided, CNV detection will not be performed.
For best results, the controls should match the sample in key experimental parameters, such as sequencing technology and, when targets include regions on the X and/or Y chromosomes, gender. For more details, see Copy Number Variant Detection.
- Calculate TMB Score, if calculating a TMB score. Specify whether the report should provide a TMB status based on thresholds; see Calculate TMB Score for more details. Only reports containing a TMB status can be uploaded to QCI Interpret.
- Filter on Custom Criteria. Configure how variants should be filtered. The default options are optimized for samples with relatively high quality and coverage, and provide a strong balance between sensitivity and precision. Additional filters or adjusted thresholds might be needed for lower-quality or lower-coverage samples, when a different sensitivity-precision balance is wanted, or when detecting low-frequency variants. See Filter on Custom Criteria for details on how to configure the options.
Note that selected default values vary by technology (Illumina/Ion Torrent), application (somatic/germline), and panel type (Targeted DNA/Targeted DNA Pro).
- Create Sample Report. Select relevant summary items and configure thresholds for quality control. These are included in the quality control section of the sample report. The default values are appropriate for most data sets, but may need to be adjusted.
- Prepare QCI Interpret Upload, if preparing a QCI Interpret report. Specify sample name, subject ID, and project.
- Result handling. Choose whether workflow result metadata and/or a log should be saved.
- Save location for new elements. Choose where to save the data, and press Finish to start the analysis.
Note that reads that span the origin of the MT chromosome are not trimmed by Trim Primers of Mapped Reads when running the Identify QIAseq DNA Variants template workflows on data from the DHS-105Z panel.
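The dependencies between the workflow path options described above can be summarized in a small pre-flight check. This is a sketch under stated assumptions, not part of the CLC software: the function `check_options` and all of its parameter names are hypothetical, and the checks only restate what the wizard step descriptions say (CNV detection needs controls, MSI detection needs a baseline, LOH detection needs CNV detection, and TMB is only recommended for panels covering at least 667 kb).

```python
# Illustrative only: hypothetical option names, not CLC Workbench settings.
TMB_MIN_PANEL_KB = 667  # recommended minimum panel size stated in the text


def check_options(detect_cnvs=False, cnv_controls=0,
                  detect_msi=False, msi_baseline=False,
                  detect_loh=False,
                  calculate_tmb=False, panel_size_kb=None):
    """Return warnings for option combinations the text advises against."""
    warnings = []
    if detect_cnvs and cnv_controls == 0:
        warnings.append("No CNV controls: CNV detection will not be performed.")
    if detect_msi and not msi_baseline:
        warnings.append("No MSI baseline: MSI detection will not be performed.")
    if detect_loh and not detect_cnvs:
        warnings.append("LOH detection requires that CNVs are also detected.")
    if (calculate_tmb and panel_size_kb is not None
            and panel_size_kb < TMB_MIN_PANEL_KB):
        warnings.append(
            f"TMB is only recommended for panels of at least "
            f"{TMB_MIN_PANEL_KB} kb (panel covers {panel_size_kb} kb)."
        )
    return warnings


if __name__ == "__main__":
    # Example: LOH requested without CNV detection, TMB on a small panel.
    for message in check_options(detect_loh=True,
                                 calculate_tmb=True, panel_size_kb=500):
        print(message)
```

An empty list means none of the documented prerequisites are missing; it does not say whether a given panel is on the recommended lists for MSI or LOH detection.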
Launching using the QIAseq Panel Analysis Assistant
The workflows are also available in the QIAseq Panel Analysis Assistant under Targeted DNA and Targeted DNA Pro.
