Perform QIAseq Multimodal Panel Analysis with TMB and MSI
The Perform QIAseq Multimodal Panel Analysis TMB and MSI (Illumina) template workflow can be used for analyzing DNA and/or RNA reads generated using the QIAseq Multimodal Pan Cancer panel (UHS-5000Z or QHS-5000Z). The workflow extends the Perform QIAseq Multimodal Panel Analysis (Illumina) template workflow to calculate a tumor mutational burden (TMB) score and to assess microsatellite instability (MSI) status.
Note that the QIAseq Multimodal Panels are designed against genome build hg19 for the DNA panel and hg38 for the RNA panel. BED files are provided in the respective genome build. However, the template workflow requires that reference data for both DNA and RNA is for the same genome build. The two QIAseq Multimodal Reference Data Sets provided by the Reference Data Manager are for genome build hg38, where the reference data for the DNA panel has been converted to hg38 as described below.
For custom panels, the DNA panel BED file needs to be imported against hg19, after which it should be converted to hg38 using the tool Convert Annotation Track Coordinates. If many regions are lost during conversion, reads that would otherwise have mapped to the lost target regions may be discarded. To avoid this, use a copy of the template workflow that contains only the analysis of the DNA reads, and run it using the BED file imported against hg19.
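To judge whether "many regions are lost", the target regions before and after conversion can be compared outside the Workbench. The following is a minimal sketch, not part of the workflow. It assumes that both tracks have been exported as tab-separated BED files and that each region carries a unique name in the fourth column that is preserved by the conversion. The function names are hypothetical.

```python
from typing import Iterable, List, Set


def region_names(bed_lines: Iterable[str]) -> Set[str]:
    """Collect region names (BED column 4) from the lines of a BED file.

    Assumption: every target region has a unique name in column 4.
    """
    names = set()
    for line in bed_lines:
        line = line.strip()
        # Skip empty lines and the optional BED header/comment lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        if len(fields) >= 4:
            names.add(fields[3])
    return names


def lost_regions(hg19_lines: Iterable[str], hg38_lines: Iterable[str]) -> List[str]:
    """Return names present in the hg19 BED but absent after conversion."""
    return sorted(region_names(hg19_lines) - region_names(hg38_lines))


def fraction_lost(hg19_lines: List[str], hg38_lines: List[str]) -> float:
    """Fraction of named hg19 regions that did not survive the conversion."""
    original = region_names(hg19_lines)
    if not original:
        return 0.0
    return len(original - region_names(hg38_lines)) / len(original)


if __name__ == "__main__":
    # Tiny illustrative input; real files would be read with open(...).
    hg19 = ["chr1\t100\t200\tT1", "chr1\t300\t400\tT2", "chr2\t10\t50\tT3"]
    hg38 = ["chr1\t150\t250\tT1", "chr2\t20\t60\tT3"]
    print(lost_regions(hg19, hg38))
    print(fraction_lost(hg19, hg38))
```

If the reported fraction is high, running a DNA-only copy of the workflow against hg19, as described above, may be the safer choice. What counts as "high" depends on the panel and is not defined here.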
Launching the workflow
To run this workflow, go to:
Workflows | Template Workflows | Biomedical Workflows | QIAseq Sample Analysis | Other QIAseq workflows | Perform QIAseq Multimodal Panel Analysis TMB and MSI (Illumina)
For general information about launching workflows, see https://resources.qiagenbioinformatics.com/manuals/clcgenomicsworkbench/current/index.php?manual=Launching_workflows_individually_in_batches.html
Options can be configured in the following dialogs:
- Choose where to run. If you are connected to a CLC Server via the CLC Workbench, you will be asked where you would like to run the analysis. We recommend that you run the analysis on a CLC Server when possible.
- Specify workflow path. Select whether you want to analyze DNA and/or RNA reads.
- Select DNA/RNA Reads. Select the reads. When analyzing more than one sample at a time, check the Batch checkbox in the lower left corner of the dialog.
- Specify reference data handling. Select the QIAseq Multimodal Pan Cancer hg38 Reference Data Set. See Reference data management for details.
- Configure batching. If running the workflow in Batch mode, you will be asked to define the batch units. See https://resources.qiagenbioinformatics.com/manuals/clcgenomicsworkbench/current/index.php?manual=Running_workflows_in_batch_mode.html for details.
- MSI baseline. A default MSI baseline from the Reference Data Set is provided for this workflow, but it is for demonstration purposes only and will not give the true microsatellite instability status. We recommend generating the MSI baseline from samples that are sequenced under the same lab conditions as the multimodal samples (see Generate MSI Baseline).
- Map Reads to Reference.
Configure masking.
By default, the GenomeReferenceConsortium_masking_hg38_no_alt_analysis_set masking track is selected, containing the regions defined by the Genome Reference Consortium, which serve primarily to remove false duplications, including one affecting the gene U2AF1. Changing the masking mode from 'No masking' to 'Exclude annotated' excludes these regions.
- QC for Target Sequencing. Specify the minimum coverage needed on all positions in a target for it to be considered covered. For somatic calling, we recommend setting this to at least 100x.
- Copy Number Variant Detection (Targeted). Specify a control mapping against which the coverage pattern in your sample will be compared in order to call CNVs. If you do not specify a control mapping, or if the target regions file contains fewer than 50 regions, the Copy Number Variation analysis will not be carried out.
- Remove False Positives (filter on allele frequency).
Specify the minimum frequency of detected variants.
The frequency cutoff is the only open parameter in this workflow. The workflow can detect variants down to 1% frequency. Setting the cutoff lower than 1% will not produce variants at lower frequencies, because the variant caller in this workflow initially calls variants only down to 1%. Further adjustments need to be made by opening a copy of the workflow.
Optionally calculate TMB status based on a low and a high threshold. Note that the default values of 10 and 15 have been chosen based on internal benchmark analyses and should be set according to the samples analyzed.
- Detect and Refine Fusion Genes.
Configure the following options as needed:
- Detect exon skippings
- Detect novel exon boundaries
- Detect novel exon boundaries in both genes
- Gene filter action
- Genes for filtering (tracks)
- Fusion filter action
- Fusions for filtering (tables)
For details about the elements used by default in 'Genes for filtering (tracks)' and 'Fusions for filtering (tables)', see https://resources.qiagenbioinformatics.com/manuals/clcgenomicsworkbench/current/index.php?manual=Exclude_lists.html.
For general details about fusion detection, see https://resources.qiagenbioinformatics.com/manuals/clcgenomicsworkbench/current/index.php?manual=Detect_Refine_Fusion_Genes.html.
- Result handling. Choose whether workflow result metadata and/or a log should be saved.
- Save location for new elements. Choose where to save the data, and press Finish to start the analysis.
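Several of the options above behave like simple threshold rules. The sketch below restates three of them in Python to make the logic explicit. It is an illustration under stated assumptions, not the Workbench implementation. The function names, the status labels ("low", "intermediate", "high") and the handling of scores that fall exactly on a threshold are assumptions. Only the 1% calling floor, the default TMB thresholds of 10 and 15, and the 50-region minimum come from the descriptions above.

```python
# Variant calling in this workflow is initially done down to 1% (see above).
VARIANT_CALLER_FLOOR_PERCENT = 1.0


def effective_min_frequency(requested_percent: float) -> float:
    """Lowest variant frequency that can actually appear in the output.

    A requested cutoff below the caller's 1% floor has no effect.
    """
    return max(requested_percent, VARIANT_CALLER_FLOOR_PERCENT)


def tmb_status(score: float, low: float = 10.0, high: float = 15.0) -> str:
    """Classify a TMB score against a low and a high threshold.

    Assumption: scores on a threshold fall into the upper category;
    the labels are illustrative only.
    """
    if low > high:
        raise ValueError("low threshold must not exceed high threshold")
    if score < low:
        return "low"
    if score < high:
        return "intermediate"
    return "high"


def cnv_detection_runs(has_control_mapping: bool, n_target_regions: int) -> bool:
    """CNV detection needs a control mapping and at least 50 target regions."""
    return has_control_mapping and n_target_regions >= 50


if __name__ == "__main__":
    print(effective_min_frequency(0.5))
    print(tmb_status(12.0))
    print(cnv_detection_runs(True, 49))
```

As noted above, the default TMB thresholds come from internal benchmark analyses, so the values passed as `low` and `high` should reflect the samples being analyzed.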
Launching using the QIAseq Panel Analysis Assistant
The workflow is also available in the QIAseq Panel Analysis Assistant under Multimodal Panels.