Detect Regional Ploidy
Detect Regional Ploidy takes as input a target-level CNV track and a somatic variant track, and predicts ploidy states including loss-of-heterozygosity (LOH).
The tool jointly optimizes the sample purity (see footnote 10.1) together with a Hidden Markov Model (HMM). It then predicts the most likely ploidy state for each locus, where a locus is either a CNV target or a somatic SNP assumed to be heterozygous in normal cells. Ploidy states are predicted from the relative log ratio (RLR) for CNV targets and from the B-allele frequency (BAF) for SNPs. Neighboring loci that share the same ploidy state are subsequently merged into contiguous regions, each of which is assigned an LOH status.
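The merging of neighboring loci into regions can be pictured with a short Python sketch. This is an illustration under assumptions, not the tool's implementation: the `Locus` and `Region` types, the `merge_loci` function, and the state labels `"normal"` and `"loss"` are hypothetical names chosen for the example.

```python
from itertools import groupby
from typing import NamedTuple


class Locus(NamedTuple):
    chrom: str
    pos: int
    state: str  # hypothetical ploidy-state label


class Region(NamedTuple):
    chrom: str
    start: int
    end: int
    state: str
    n_loci: int


def merge_loci(loci):
    """Merge runs of neighboring loci that share chromosome and state."""
    ordered = sorted(loci, key=lambda locus: (locus.chrom, locus.pos))
    regions = []
    # groupby only groups consecutive items, so each group is one run
    # of adjacent loci with the same chromosome and state.
    for (chrom, state), run in groupby(ordered, key=lambda l: (l.chrom, l.state)):
        run = list(run)
        regions.append(Region(chrom, run[0].pos, run[-1].pos, state, len(run)))
    return regions


loci = [
    Locus("chr1", 100, "normal"),
    Locus("chr1", 200, "normal"),
    Locus("chr1", 300, "loss"),
    Locus("chr1", 400, "loss"),
    Locus("chr2", 50, "loss"),
]
regions = merge_loci(loci)
for region in regions:
    print(region)
```

In this sketch a region never spans two chromosomes, and a single locus with a differing state forms its own one-locus region.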
To run the tool, go to:
Tools | Resequencing Analysis | Variant Detection | Detect Regional Ploidy
The following options can be configured (figure 10.6):
- Somatic variants. A variant track with somatic variants detected from the same sample as the CNVs.
- Known variants. A variant track including SNPs assumed to be heterozygous in normal cells:
  - A SNP database (e.g., dbSNP common), if Variant database is selected.
    The variants must be annotated with population-level frequencies using at least one of the following annotations: alleleFreqs, CAF, TOPMED, or FREQ.
    To speed up the analysis, we recommend filtering the database to the target regions using Filter Based on Overlap.
  - Germline variants detected from a matched normal sample, if Germline variants is selected.
- Minimum sample purity. The lowest purity allowed during optimization. Note that it can be difficult to determine whether a sample has very low purity or contains only a few CNV events.
- Transition factor. A parameter used by the HMM to control changes in ploidy state between neighboring loci. Higher values make a state change less likely. For a given value, the probability of a state change increases with the distance between loci. With the default value of 100, two loci 1 Mb apart have roughly a 98% chance of sharing the same state.
- Decoding method. The HMM method used to predict ploidy states, either Viterbi or Posterior.
- Normalize coverage. When checked, coverage is normalized using a factor derived from B-allele frequencies. Checking this option is recommended for small panels, where many targets may be affected by CNV events. The normalization factor range may need adjustment for such samples.
  - Minimum factor. The minimum value the normalization factor can have.
  - Maximum factor. The maximum value the normalization factor can have.
- Centromeres. An annotation track containing the centromeric regions. Loci overlapping these regions are discarded.
- Remove outliers. When checked, CNV targets with extreme RLRs and variants with extreme B-allele frequencies are discarded before loci are merged into regions.
- Minimum loci count. Regions originating from fewer loci than this are discarded.
- Maximum distance (Mb). Regions are extended to the chromosome start/end, the centromere, or a neighboring region when the distance (in megabases) is smaller than this.
- Minimum length (Mb). Regions shorter than this (in megabases) are discarded.
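The two discard rules, Minimum loci count and Minimum length (Mb), can be sketched as a simple filter. This is a minimal illustration under assumptions: the `Region` type, the `filter_regions` function, the threshold values, and the definition of region length as `end - start` are all hypothetical, and the Maximum distance (Mb) extension step is not modeled.

```python
from typing import NamedTuple


class Region(NamedTuple):
    chrom: str
    start: int   # assumed 0-based start, in base pairs
    end: int     # assumed exclusive end, in base pairs
    n_loci: int  # number of loci the region was merged from


def filter_regions(regions, min_loci_count, min_length_mb):
    """Keep regions that meet both the loci-count and the length threshold."""
    min_length_bp = min_length_mb * 1_000_000
    return [
        r for r in regions
        if r.n_loci >= min_loci_count and (r.end - r.start) >= min_length_bp
    ]


regions = [
    Region("chr1", 0, 5_000_000, 40),          # long, many loci: kept
    Region("chr1", 6_000_000, 6_200_000, 30),  # only 0.2 Mb: discarded
    Region("chr2", 0, 10_000_000, 3),          # only 3 loci: discarded
]
# Threshold values are illustrative, not the tool's defaults.
kept = filter_regions(regions, min_loci_count=10, min_length_mb=1.0)
print(kept)
```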
Figure 10.6: Options for Detect Regional Ploidy.
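The difference between the two Decoding method choices can be illustrated on a toy HMM. Viterbi decoding returns the single most likely sequence of states, whereas posterior decoding picks the most probable state at each locus individually. The sketch below is a generic textbook version of both methods in plain Python. The two-state model, its probabilities, and the function names are assumptions made for the example and do not reflect the states, emissions, or transition model used by the tool.

```python
def viterbi(init, trans, emit):
    """Most likely state path; emit[t][s] = P(observation t | state s)."""
    n = len(init)
    score = [init[s] * emit[0][s] for s in range(n)]
    back = []
    for e in emit[1:]:
        prev = score
        # Best predecessor for each state s at this step.
        ptr = [max(range(n), key=lambda r: prev[r] * trans[r][s]) for s in range(n)]
        score = [prev[ptr[s]] * trans[ptr[s]][s] * e[s] for s in range(n)]
        back.append(ptr)
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]


def posterior(init, trans, emit):
    """Per-locus most probable state via the forward-backward algorithm."""
    n = len(init)
    fwd = [[init[s] * emit[0][s] for s in range(n)]]
    for e in emit[1:]:
        prev = fwd[-1]
        fwd.append([e[s] * sum(prev[r] * trans[r][s] for r in range(n)) for s in range(n)])
    bwd = [[1.0] * n]
    for e in reversed(emit[1:]):
        nxt = bwd[0]
        bwd.insert(0, [sum(trans[s][r] * e[r] * nxt[r] for r in range(n)) for s in range(n)])
    # The product fwd * bwd is proportional to the marginal state probability.
    return [max(range(n), key=lambda s: f[s] * b[s]) for f, b in zip(fwd, bwd)]


# Toy model: two states with "sticky" transitions (all values are made up).
init = [0.5, 0.5]
trans = [[0.9, 0.1], [0.1, 0.9]]
emit = [
    [0.9, 0.1], [0.9, 0.1], [0.4, 0.6], [0.9, 0.1],
    [0.1, 0.9], [0.1, 0.9], [0.1, 0.9],
]
viterbi_path = viterbi(init, trans, emit)
posterior_path = posterior(init, trans, emit)
print(viterbi_path)
print(posterior_path)
```

In this example both methods smooth over the third observation, which weakly favors state 1, because a brief state change is penalized by the transition probabilities. The two methods can disagree on less clear-cut data. For long sequences, a practical implementation would work in log space or rescale the probabilities to avoid numerical underflow.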
Footnotes
- 10.1 Purity: the proportion of cells in the sample that are tumor-derived.