Analyze QIAseq Hybrid Capture DNA Somatic (with UMI)
The Analyze QIAseq Hybrid Capture DNA Somatic (with UMI) (Illumina) template workflow can be used for analyzing DNA reads generated using the QIAseq Multimodal DNA/RNA Library Kit and a hybrid capture panel such as QIAseq Exome, QIAseq xHYB Human, QIAseq xHYB CGP DNA, or panels from a third-party provider.
The workflow is suitable for analyzing hybrid capture data with UMIs. Workflows for analyzing other DNA data types generated using the QIAseq Multimodal DNA/RNA Library Kit are described elsewhere in the manual:
- WGS data without UMIs: Analyze QIAseq Somatic WGS (Illumina).
- WGS data with UMIs: Analyze QIAseq Multimodal DNA Library Kit Somatic WGS (with UMI).
- Hybrid capture data without UMIs: Analyze QIAseq Hybrid Capture DNA (Illumina).
The workflow includes all necessary steps for processing and analyzing the reads:
- Target coverage statistics are generated using QC for Targeted Sequencing.
- UMIs are removed using Remove and Annotate with Unique Molecular Index.
- Reads are trimmed using Trim Reads.
- Reads are mapped to the human reference genome using Map Reads to Reference.
- UMI reads are created using Calculate Unique Molecular Index Groups and Create UMI Reads from Grouped Reads.
- A guidance track is generated from the mapped reads using Structural Variant Caller.
- An improved mapping is obtained by realigning the mapped reads using the guidance track and Local Realignment.
- Variants are detected in the relevant target regions from the improved mapping using Low Frequency Variant Detection.
- The variants are annotated with various information, such as overlap with repeat/homopolymer regions and genes, and are subsequently filtered to remove likely artifacts using Filter on Custom Criteria.
- CNVs are optionally detected from the improved mapping using Copy Number Variant Detection (Targeted).
- Fusions are optionally detected using Detect Fusion Genes from DNA.
- A TMB score is optionally calculated using Calculate TMB Score.
- MSI status is optionally detected using Detect MSI Status.
- A summary report is created using Create Sample Report.
- A report is optionally prepared for uploading to QCI Interpret using Prepare QCI Interpret Upload.
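The idea behind the UMI steps above can be illustrated with a toy sketch. This is not the algorithm used by Calculate Unique Molecular Index Groups or Create UMI Reads from Grouped Reads. It is a simplified illustration that assumes reads are grouped by exact mapping start and exact UMI sequence, that all reads in a group have the same length, and that the consensus is a simple per-position majority vote. The function names and the input tuple layout are hypothetical.

```python
from collections import Counter, defaultdict


def consensus(seqs):
    """Majority base per column; assumes all sequences have equal length."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))


def collapse_by_umi(reads):
    """Group reads by (chrom, start, umi) and build one consensus read per group.

    reads: iterable of (chrom, start, umi, sequence) tuples (hypothetical layout).
    Returns a dict mapping each group key to (group size, consensus sequence).
    """
    groups = defaultdict(list)
    for chrom, start, umi, seq in reads:
        groups[(chrom, start, umi)].append(seq)
    return {key: (len(seqs), consensus(seqs)) for key, seqs in groups.items()}


reads = [
    ("chr1", 100, "ACGTAC", "TTGACCA"),
    ("chr1", 100, "ACGTAC", "TTGACCA"),
    ("chr1", 100, "ACGTAC", "TTGTCCA"),  # one read with a likely sequencing error
    ("chr1", 100, "GGATCC", "TTGACCA"),  # different UMI: a different original molecule
]
umi_reads = collapse_by_umi(reads)
for key, (size, seq) in umi_reads.items():
    print(key, size, seq)
```

In this sketch, the three reads sharing a UMI collapse into a single consensus read in which the isolated mismatch is outvoted, while the read with a different UMI is kept as a separate molecule.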
Launching the workflow
To run this workflow, go to:
Workflows | Template Workflows | Biomedical Workflows | QIAseq Sample Analysis | QIAseq DNA Workflows | Analyze QIAseq Hybrid Capture DNA Somatic (with UMI) (Illumina)
See Launching workflows individually and in batches for general information.
Options can be configured in the following wizard steps:
- Choose where to run. If you are connected to a CLC Server via the CLC Workbench, you will be asked where you would like to run the analysis. We recommend that you run the analysis on a CLC Server when possible.
- Specify workflow path. The following paths can be configured:
  - Sample type. Specify whether the sample corresponds to FFPE or intact gDNA.
  - Analysis Options. Specify whether to:
    - Analyze sample: Detect variants and optionally perform the additional analyses described below.
    - Create CNV control: Create a coverage table from a normal sample.
    - Analyze sample and create CNV control: Perform both.
  - Detect fusions. Skipping this may reduce workflow execution time.
  - Detect CNVs. Detect copy number variants. This requires CNV controls (i.e. normal samples).
    For the QIAseq xHYB CGP DNA panel and the QIAseq xHYB CGP DNA Fusion MSI panel, a CNV coverage table based on 13 different normal samples without X and Y regions is available in the Reference Data Manager. Note that the CNV control corresponding to the QIAseq xHYB CGP DNA Fusion MSI panel is a demo element that has not been optimized.
  - Calculate TMB. Calculate a tumor mutation burden score. This is only recommended for panels covering at least 667 kb [Vega et al., 2021].
  - Detect MSI. Determine the microsatellite instability status. This requires an MSI baseline generated from normal samples.
    For the QIAseq xHYB CGP DNA panel, an MSI baseline based on 30 normal samples and 176 loci from msisensor2 is available in the Reference Data Manager.
  - Prepare for QCI Interpret. Prepare a report that can later be uploaded to QCI Interpret or QCI Interpret Translational.
- Select Reads. Select the input reads. When analyzing more than one sample at a time, check Batch in the lower left corner of the dialog.
- Specify reference data handling.
  - Select the QIAseq DNA Hybrid Capture and WGS hg38 Reference Data Set if the data was hybrid captured using the QIAseq Exome or QIAseq xHYB Human panels.
    Note that this option is available only when the workflow is not detecting fusions, CNVs, or MSI, and is not calculating TMB.
  - Select the QIAseq DNA xHYB CGP hg38 Reference Data Set if the data was hybrid captured using the QIAseq xHYB CGP DNA panel or the QIAseq xHYB CGP DNA Fusion MSI panel.
- Configure batching, if running the workflow in batch mode. Define the batch units.
- Batch overview, if running the workflow in batch mode. Verify that the batching is as intended.
- Target regions. Choose the relevant target regions from the drop down list. If the data was produced using a panel that is not available in the list, see QIAseq custom panels.
- CNV controls. Specify the controls to use for CNV detection.
  For best results, the controls should match the sample in key experimental parameters, such as sequencing technology and, when targets include regions on the X and/or Y chromosomes, gender. For more details, see Copy Number Variant Detection.
- QC for Targeted Sequencing. Configure the Minimum coverage.
- Low Frequency Variant Detection. Lower the Minimum frequency (%) if variants should be detected at lower frequencies. Note that the value should be no more than half of the frequency specified when filtering variants.
- Filter Variants. Configure how variants should be filtered. The default options are optimized for samples with relatively high quality and coverage, providing a strong balance between sensitivity and precision. Additional filters or adjusted thresholds might be needed for lower-quality or lower-coverage samples, when seeking a different sensitivity-precision balance, or when detecting low-frequency variants. See Filter on Custom Criteria for details on how to configure the options.
- Filter Fusions on Support and Filter Fusions on Gene Distance and Direction. Fusion filters have been tuned using samples of relatively high quality and coverage to provide the best possible sensitivity and precision. Additional filtering may be needed, or filtering values may need to be adjusted, when working with low quality/coverage samples or when seeking a different balance between sensitivity and precision.
- Calculate TMB Score, if calculating a TMB score. Specify whether the report should provide a TMB status based on thresholds. See Calculate TMB Score for more details. Only reports containing a TMB status can be uploaded to QCI Interpret.
- Create Sample Report. Select relevant summary items and configure thresholds for quality control. These are included in the quality control section of the sample report. The default values are appropriate for most data sets, but may need to be adjusted.
- Prepare QCI Interpret Upload, if preparing a QCI Interpret report. Specify sample name, subject ID, and project.
- Result handling. Choose whether workflow result metadata and/or a log should be saved.
- Save location for new elements. Choose where to save the data, and press Finish to start the analysis.
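Two of the numeric recommendations in the wizard steps above can be checked before launching the workflow: the panel size recommended for TMB calculation (at least 667 kb) and the relation between the Minimum frequency (%) used for detection and the frequency used when filtering variants (detection value no more than half of the filter value). The following is a minimal sketch that runs outside the CLC software and does not use any CLC API. It assumes that 1 kb equals 1,000 bp, that the target regions are available as half-open (chrom, start, end) intervals such as those read from a BED file, and that total merged target size is an acceptable proxy for the region the panel covers. All function names are hypothetical.

```python
TMB_MIN_PANEL_BP = 667_000  # 667 kb, assuming 1 kb = 1,000 bp


def panel_size_bp(intervals):
    """Total bases covered by half-open (chrom, start, end) intervals.

    Overlapping or touching intervals on the same chromosome are merged
    so that no base is counted twice.
    """
    by_chrom = {}
    for chrom, start, end in intervals:
        by_chrom.setdefault(chrom, []).append((start, end))

    total = 0
    for spans in by_chrom.values():
        spans.sort()
        cur_start, cur_end = spans[0]
        for start, end in spans[1:]:
            if start <= cur_end:
                cur_end = max(cur_end, end)  # extend the current merged span
            else:
                total += cur_end - cur_start
                cur_start, cur_end = start, end
        total += cur_end - cur_start
    return total


def tmb_recommended(intervals):
    """True if the merged target size reaches the 667 kb recommendation."""
    return panel_size_bp(intervals) >= TMB_MIN_PANEL_BP


def detection_frequency_ok(min_detection_pct, filter_pct):
    """True if the detection frequency is at most half of the filter frequency."""
    return min_detection_pct <= filter_pct / 2


# Made-up target regions for illustration only.
targets = [("chr1", 0, 400_000), ("chr1", 350_000, 500_000), ("chr2", 0, 200_000)]
print(panel_size_bp(targets))             # 700000
print(tmb_recommended(targets))           # True
print(detection_frequency_ok(1.0, 2.0))   # True
print(detection_frequency_ok(2.0, 3.0))   # False
```

Merging overlapping targets before summing matters for panels with overlapping probes or regions, since a naive sum of interval lengths would overstate the panel size.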
Launching using the QIAseq Panel Analysis Assistant
The workflow is also available in the QIAseq Panel Analysis Assistant under xHYB CGP.
Subsections
- Output from Analyze QIAseq Hybrid Capture DNA Somatic (with UMI)
- Compare ID SNP variants across samples
