Calculate TMB Score

Calculate TMB Score takes a variant track and the set of regions to focus on, and calculates a TMB score, i.e. the number of variants per 1 million bases.

It is recommended that target regions with a coverage lower than 100X are discarded before running the tool. To do so, a workflow including the tools Create Mapping Graph and Identify Graph Threshold Area can be used to generate a target region file only containing target regions with at least 100X coverage (see figure 8.1).

Image tmbprewf
Figure 8.1: Workflow to discard low coverage target regions.

The Calculate TMB Score tool currently considers only SNVs and discards variants of any other type. First, it filters variants, keeping only those that lie within exons inside the regions of interest (ROIs) and outside the masking regions. It then successively applies various quality, germline and non-synonymous filters before calculating the TMB score as the number of remaining somatic variants divided by the length of the ROI minus the length of the masking regions, with both lengths expressed in megabases (Mb).
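The arithmetic of the final step can be illustrated with a minimal Python sketch. This is not the tool's implementation: the function name `tmb_score` and its parameters are hypothetical, and the sketch assumes the variant count passed in has already been through all of the filters described above.

```python
def tmb_score(somatic_variant_count, roi_length_bp, masked_length_bp=0):
    """Return the number of somatic variants per megabase of unmasked ROI.

    All names here are illustrative assumptions, not part of the tool.
    Lengths are given in base pairs and converted to megabases.
    """
    effective_length_bp = roi_length_bp - masked_length_bp
    if effective_length_bp <= 0:
        raise ValueError("The ROI must be longer than the masked regions.")
    effective_length_mb = effective_length_bp / 1_000_000
    return somatic_variant_count / effective_length_mb


# 12 somatic variants over a 1.25 Mb ROI with 0.05 Mb masked
# gives 12 / 1.2 = 10 variants per Mb.
print(tmb_score(12, 1_250_000, 50_000))
```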

To run Calculate TMB Score, go to:

        Toolbox | Biomedical Genomics Analysis (Image biomedical_folder_closed_16_n_p) | Oncology Score Estimation (Image oncology_tools_folder_closed_16_h_p) | Calculate TMB Score (Image calculate_tmb_16_h_p)

The tool takes a variant track as input.

In the next dialog, tracks relevant to the analysis are specified (figure 8.2):

Only variants inside target regions and exons, and not within regions annotated on the masking track, are considered when calculating the TMB score.

Image calculatetmb
Figure 8.2: Specifying tracks and parameters for calculating a TMB status.

In addition, it is possible to enable the calculation of a TMB status based on a low and a high threshold. The status will then appear as an additional item in the TMB report. The default values of 10 and 15, respectively, have been chosen based on internal benchmark analyses of lung cancer cell lines and different tissue cancer samples. Given the lack of standardization of methods and the heterogeneity of tumor mutation burden across tumor types, it is difficult to establish universal cutoff values. Thresholds should be set according to the samples analyzed.

In the next dialog (figure 8.3), it is mandatory to provide a variant database of known germline variants as an input for filtering germline variants.

Image calculatetmb1
Figure 8.3: Specifying tracks and parameters for calculating a TMB score.

The parameters that can be configured are as follows:

Note that TMB filtering parameters are set conservatively. This is because for panels of about 1 Mb in size, a single false positive variant may increase the TMB score substantially.

The tool outputs a track of filtered somatic variants, i.e., the variants that remained after filtering and were included in the TMB score calculation. However, the main output is a report that includes filtering statistics and the calculated TMB score. It will also include a TMB status if the option was enabled (as shown in figure 8.4). By default, the TMB status is considered low if the TMB score is lower than 10, intermediate if the TMB score is between 10 and 15, and high if the TMB score is larger than 15. It is important to point out again that different cancer types have different somatic mutational loads, and thresholds should be set according to the samples analyzed.
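The default status classification can be sketched as follows. The function name `tmb_status` is hypothetical, and the treatment of scores exactly equal to a threshold (counted here as intermediate) is an assumption based on the wording above, not a confirmed detail of the tool.

```python
def tmb_status(score, low_threshold=10.0, high_threshold=15.0):
    """Classify a TMB score as 'low', 'intermediate' or 'high'.

    Assumption: scores equal to either threshold count as intermediate.
    """
    if low_threshold > high_threshold:
        raise ValueError("The low threshold must not exceed the high threshold.")
    if score < low_threshold:
        return "low"
    if score > high_threshold:
        return "high"
    return "intermediate"


for example_score in (4.2, 12.0, 21.7):
    print(example_score, tmb_status(example_score))
```

Passing other values for the two thresholds mirrors the recommendation to adapt them to the samples analyzed.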

Image calculatetmbreport
Figure 8.4: A TMB report where the option to detect TMB status was enabled with default threshold values.

In addition, the report lists the length of the target regions, counts of various types of variants, and a value describing the tumor mutational burden calculated as the number of mutations per Mb. The quality filter statistics summarize how many variants were removed by each of the filters applied by the tool, along with the frequency distributions of input and somatic variants.

The TMB score is reported together with a TMB confidence based on the size of the target regions included in the TMB score calculation, i.e., those with a coverage of at least 100X. TMB confidence is low if fewer than 900,000 bp of target regions have sufficient coverage, high if more than 1,000,000 bp of target regions have been included in the calculation, and intermediate between these two values. Note that reports generated from target region files from which low coverage regions were not excluded may wrongly display a high confidence.
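The confidence rule can be expressed as a short sketch. The function name `tmb_confidence` is hypothetical, and counting sizes of exactly 900,000 bp or 1,000,000 bp as intermediate is an assumption drawn from the wording above.

```python
def tmb_confidence(covered_target_length_bp):
    """Classify TMB confidence from the total length (in bp) of target
    regions that have at least 100X coverage.

    Assumption: lengths equal to either cutoff count as intermediate.
    """
    if covered_target_length_bp < 900_000:
        return "low"
    if covered_target_length_bp > 1_000_000:
        return "high"
    return "intermediate"


for length_bp in (750_000, 950_000, 1_300_000):
    print(length_bp, tmb_confidence(length_bp))
```

The input must be the length of the sufficiently covered regions only. Passing the length of an unfiltered target region file reproduces the misleading high confidence described above.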

This report can be used together with the Combine Reports tool (see http://resources.qiagenbioinformatics.com/manuals/clcgenomicsworkbench/current/index.php?manual=Combine_Reports.html).