Analyze QIAseq xHYB NTM-ID Panel Data (Human host)

The Analyze QIAseq xHYB NTM-ID Panel Data (Human host) template workflow detects and types species from the Mycobacteriaceae family. It is suitable for analysis of samples from human hosts generated with the QIAseq xHYB NTM-ID Panel, and can detect the presence of both Mycobacterium tuberculosis and Non-Tuberculosis Mycobacteria (NTM) by targeting the 65-kDa heat shock protein (hsp65) gene.

If the QIAseq xHYB NTM-ID Panel was used in conjunction with the QIAseq xHYB Mycobacterium tuberculosis Panel, use the template workflow Analyze QIAseq xHYB Mycobacterium Tuberculosis Panel Data (Human host) instead.

To analyze samples not from human hosts, you can create a copy of the workflow and edit it to fit your specific application (see Template workflows). Since the workflow element QC for Targeted Sequencing is relevant for human data only, you should delete it. In addition, if a host genome is not relevant for your application, you can remove the "Host reference" input from the Find Best References using Read Mapping step.
Once the workflow copy is customized, you can install it to make it available under the Workflows menu (see Workflow installation).

QIAGEN Reference Data Set

The QIAseq xHYB NTM-ID Panel Reference Data Set contains reference data relevant for this template workflow. It includes a non-redundant reference database of the hsp65 gene, used for detection and typing of Mycobacteriaceae. Like the template workflow, the reference data set is designed for human samples, and additionally contains a human host reference and an annotation track of human control gene regions.

Any data in the QIAseq xHYB NTM-ID Panel set that has not already been downloaded can be downloaded when launching the workflow. The data can also be downloaded and managed using the Reference Data Manager, which is opened by clicking on the Manage Reference Data button in the Toolbar. Click on the QIAGEN Sets Reference Data Library tab in the Reference Data Manager and search for the set by entering terms from its name in the search field.

For analysis of samples not from human hosts: If a non-human host is relevant for your application, you can download a host genome using Download Custom Microbial Reference Database.

The workflow analysis

The raw reads are trimmed for low quality, read-through adapter sequences, and G homopolymers. If a Trim adapter list is supplied, these adapters will also be trimmed.
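As an illustration of the G homopolymer trimming mentioned above, the following is a minimal Python sketch that removes a trailing run of G bases from a read. It is not the trimming tool used by the workflow. The function name trim_trailing_g and the min_run threshold are hypothetical and chosen only for this example.

```python
import re

def trim_trailing_g(read: str, min_run: int = 10) -> str:
    """Remove a trailing run of G bases if it is at least `min_run` long.

    `min_run` is a hypothetical threshold for illustration only; it is not
    a documented setting of the workflow. Only uppercase bases are handled.
    """
    match = re.search(r"G+$", read)
    if match and len(match.group()) >= min_run:
        # Keep everything before the start of the trailing G run.
        return read[:match.start()]
    return read
```

A read ending in a long run of G bases is shortened, while a read ending in only a few G bases is returned unchanged.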

Trimmed reads are mapped to the references of Mycobacteriaceae hsp65 genes and the human host reference simultaneously using Find Best References using Read Mapping. Due to the high level of similarity between hsp65 genes from different Mycobacteriaceae species, the reads are mapped with stringent mapping parameters.

This results in an initial set of hsp65 reads and possible references. If more than one possible reference is detected for the sample reads, the analysis will try to refine the references by looking only at non-ambiguous reads mapping to this subset of the references. This helps to resolve false positive species calls that result from the high level of similarity within the target gene.
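The idea of refining candidate references using only non-ambiguous reads can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the algorithm implemented by Find Best References using Read Mapping. The input structure (a mapping from read name to the set of references the read maps to equally well), the function name refine_references, and the min_unique_reads threshold are all hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set

def refine_references(read_hits: Dict[str, Set[str]],
                      min_unique_reads: int = 1) -> List[str]:
    """Keep candidate references supported by non-ambiguous reads.

    `read_hits` maps each read name to the set of candidate references it
    maps to equally well (hypothetical input format). A read is treated as
    non-ambiguous when it maps to exactly one candidate reference.
    """
    unique_counts: Counter = Counter()
    for refs in read_hits.values():
        if len(refs) == 1:
            # The read supports exactly one reference.
            unique_counts[next(iter(refs))] += 1
    return sorted(ref for ref, count in unique_counts.items()
                  if count >= min_unique_reads)
```

In this sketch, a reference that is only ever hit by reads shared with other references receives no non-ambiguous support and is dropped, which mirrors how refinement can remove false positive calls among highly similar hsp65 sequences.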

While the detected species may contain a "variant" name (e.g. "Mycobacterium tuberculosis variant bovis"), be advised that the hsp65 gene is usually specific enough for species level typing only, not for strain level typing. For mixed infections involving more than one Mycobacteriaceae species, the lower detection limit is 3% abundance relative to the most abundant species.
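The 3% relative detection limit can be illustrated with a short Python sketch. It assumes that abundance is approximated by the number of reads assigned to each species, which is an assumption made for this example and not a statement of how the workflow computes abundance. The function name species_above_limit is hypothetical.

```python
from typing import Dict

def species_above_limit(read_counts: Dict[str, int],
                        min_fraction: float = 0.03) -> Dict[str, float]:
    """Return species whose abundance relative to the most abundant
    species is at least `min_fraction` (3% by default).

    Abundance is approximated by read counts here (an assumption).
    """
    if not read_counts:
        return {}
    top = max(read_counts.values())
    if top == 0:
        return {}
    return {species: count / top
            for species, count in read_counts.items()
            if count / top >= min_fraction}
```

For example, with 1000 reads assigned to the most abundant species, a second species with 50 reads (5%) would be above the limit, while a third species with 20 reads (2%) would fall below it.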

After reference refinement, all of the hsp65 reads will be re-mapped to the final refined list of references, and the detected species and read mapping statistics are output in the report.

The human control gene regions are used for QC for Targeted Sequencing. The QIAseq xHYB NTM-ID Panel contains probes for these regions as an indicator of successful hybrid capture.

Launching the workflow

The Analyze QIAseq xHYB NTM-ID Panel Data (Human host) workflow is available at:

        Workflows | Template Workflows | Microbial Workflows | QIAseq Analysis | Analyze QIAseq xHYB NTM-ID Panel Data (Human host)

Launch the workflow and step through the wizard.

  1. Select the sequence list(s) containing the sample reads. If selecting multiple inputs from different samples, check the Batch option (see Running workflows in batch mode).
  2. Select a reference data set or select "Use specified data elements". The latter runs the workflow using default elements, which can be viewed by clicking the "workflow roles" text just above the option.
  3. If Batch was checked in step 1, choose whether batch units should be defined based on organization of the input data, or by provided metadata. In the next step, review the batch units resulting from your selections above.
  4. If your reads contain adapters, add an appropriate Trim adapter list. Click Next.
  5. The parameters for filtering references can be changed (figure 2.34). This might be necessary if the expected Mycobacteriaceae species is present in the sample at a very low abundance. The default settings are expected to work in most cases. For more information about the filters, see Find Best References using Read Mapping.
  6. In the "Create Sample Report" step, various summary items have been set. These are guidelines to help evaluate the quality of the results (see Create Sample Report). Thresholds can be changed if the defaults are too stringent for the input samples.
  7. Finally, select a location to save outputs to.

Image ntm_filter_refs
Figure 2.34: Parameters for filtering references can be changed.

Workflow outputs and how to interpret them

The outputs provided by the workflow are:

The Typing Report is the main output of the workflow. This allows for easy overview of the analysis results, both in terms of quality control and detected Mycobacteriaceae for the sample. An example of the report can be seen in figure 2.35.

Image ntm_report
Figure 2.35: An example report from the Analyze QIAseq xHYB NTM-ID Panel Data (Human host) workflow.

The report contains the following sections:

The "QC for Mycobacteriaceae mapping" table report contains the following columns: