Generate MSI Baseline

The Generate MSI Baseline tool can be used to generate microsatellite instability (MSI) baseline tracks that are used when running Detect MSI Status. The tool is available from the Tools menu:

        Tools | Biomedical Genomics Analysis | Oncology Score Estimation | Generate MSI Baseline

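The tool can generate a baseline either from the loci provided in an MSI loci track, or by scanning target regions or the whole genome for candidate microsatellite loci.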

The tool requires at least five read mappings from microsatellite stable (MSS) samples as input. For a reliable baseline, we recommend at least 30 samples: the Detect MSI Status tool uses a Z-test to compare a test sample to the baseline, and the baseline mean and standard deviation that a Z-test relies on are poorly estimated from only a few samples.
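To make the role of the baseline samples concrete, the following is a minimal sketch of a per-locus Z-score, assuming each sample is summarized by a single number per locus (for example, a mean locus length). The function name z_score, the choice of summary value, and the example numbers are hypothetical; the exact statistic used by Detect MSI Status is not described here.

```python
from statistics import mean, stdev


def z_score(test_value, baseline_values):
    """Z-score of a test-sample value against baseline values for one locus.

    Hypothetical helper for illustration only; not the tool's implementation.
    """
    if len(baseline_values) < 2:
        raise ValueError("need at least two baseline values")
    mu = mean(baseline_values)
    sigma = stdev(baseline_values)  # sample standard deviation (n - 1)
    if sigma == 0:
        raise ValueError("baseline has zero variance at this locus")
    return (test_value - mu) / sigma


# Hypothetical per-sample summary values at one locus from five MSS samples
baseline = [20.0, 21.0, 19.0, 20.0, 20.0]
z = z_score(23.0, baseline)
print(f"z = {z:.2f}")
```

With only five baseline samples, a single atypical sample shifts both the mean and the standard deviation noticeably, which is one way to see why a larger baseline gives more stable Z-scores.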

The following options can be adjusted (figure 8.7):

Image generate_msi_baseline_wizard
Figure 8.7: Parameters for Generate MSI Baseline.

If Scan target regions or whole genome is selected, the tool first identifies a list of candidate microsatellite loci, which are otherwise provided in the MSI loci track. Subsequently, the tool extracts all reads overlapping the loci and analyzes the locus length by identifying the flanking signatures in the reads. See Detect MSI Status for more details about how the locus length is determined. Finally, the loci are filtered in three steps. A locus is removed if:

Generate MSI Baseline outputs an MSI baseline track and a report summarizing the loci in the baseline (figure 8.8).
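The locus-length step described earlier, where flanking signatures are identified in reads that overlap a locus, can be illustrated with a simplified sketch. It assumes exact string matching of two flank sequences and ignores sequencing errors, strand, and alignment; the function name locus_length, the flank sequences, and the reads are hypothetical. See Detect MSI Status for how the tool actually determines the locus length.

```python
from collections import Counter


def locus_length(read, left_flank, right_flank):
    """Number of bases between the two flanks, or None if the read
    does not contain both flanks in order (hypothetical helper)."""
    start = read.find(left_flank)
    if start == -1:
        return None
    repeat_start = start + len(left_flank)
    end = read.find(right_flank, repeat_start)
    if end == -1:
        return None
    return end - repeat_start


# Hypothetical flanks and reads around a mononucleotide (poly-A) locus
LEFT, RIGHT = "ACGTG", "CTGCA"
reads = [
    LEFT + "A" * 10 + RIGHT,                # spans the locus, length 10
    "TT" + LEFT + "A" * 9 + RIGHT + "GG",   # spans the locus, length 9
    LEFT + "A" * 6,                         # right flank missing: not usable
]

lengths = [locus_length(r, LEFT, RIGHT) for r in reads]
# Tally the observed lengths, skipping reads that do not span the locus
distribution = Counter(n for n in lengths if n is not None)
print(distribution)
```

Only reads containing both flanks contribute to the length distribution; reads that end inside the repeat cannot be used, because their locus length is unknown.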

Undesired loci can be manually removed from the baseline track by:

Image generate_msi_baseline_report
Figure 8.8: MSI baseline report obtained by scanning target regions for microsatellite sites.

The report contains a summary section with the number of unfiltered and filtered loci. The Total number of loci is either the number of loci in the provided MSI loci track, or the number of loci initially identified by scanning.

The loci table contains all loci in the baseline, with the following columns: