Detect Regional Ploidy

The Detect Regional Ploidy tool is designed to detect regional ploidy levels including loss-of-heterozygosity (LOH) from targeted research resequencing experiments.

The tool takes a target-level CNV events annotation track (from a CNV tool), a somatic variant track, and either a germline variant track or a track of known segregating variants. A centromere track can optionally be supplied.

To run the Detect Regional Ploidy tool, go to:

        Toolbox | Resequencing Analysis | Variant Detection | Detect Regional Ploidy

Select the CNV target-level annotation track generated by a CNV tool and click Next.

You are now presented with choices regarding LOH detection.

Regional ploidy estimation

The algorithm implemented in the Detect Regional Ploidy tool is inspired by the following paper:

Based on coverage ratios and the allele ratios of putative heterozygous germline variants, the tool detects targets and regions affected by loss-of-heterozygosity events. The tool can handle both matched tumor-normal data and unpaired tumor data. In both cases, variants that are assumed to be heterozygous in normal tissue have to be identified.

Tumor-normal pairs: For matched tumor normal data, a track with somatic variants and a track with germline variants will be used. The variants used to detect LOH are simply the somatic variants overlapping heterozygous germline variants.

Tumor only: For unpaired tumor data, a somatic variant track and a database of known segregating variants are used (typically dbSNP common). The variants used in LOH calculation are the somatic variants overlapping the variants in the database.
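The variant selection described for the two modes can be sketched as follows. This is a minimal illustration only: the `Variant` record, its fields, and the function `select_loh_variants` are hypothetical and do not correspond to the tool's track format, and "overlapping" is assumed here to mean the same chromosome and position.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    """Hypothetical, simplified variant record (not the tool's track format)."""
    chrom: str
    pos: int
    ref: str
    alt: str
    frequency: float      # allele frequency in percent
    zygosity: str = ""    # e.g. "Heterozygous"; empty when unknown


def select_loh_variants(somatic, reference, require_heterozygous):
    """Return the somatic variants that overlap the reference variants.

    Tumor-normal pairs: `reference` is the germline track and
    `require_heterozygous` is True, so only heterozygous germline
    variants are used.
    Tumor only: `reference` is a database of known segregating variants
    and `require_heterozygous` is False.
    Assumption: two variants overlap when chromosome and position match.
    """
    positions = {
        (v.chrom, v.pos)
        for v in reference
        if not require_heterozygous or v.zygosity == "Heterozygous"
    }
    return [v for v in somatic if (v.chrom, v.pos) in positions]


somatic = [
    Variant("chr1", 100, "A", "G", 71.4),
    Variant("chr1", 200, "C", "T", 12.0),
]
germline = [
    Variant("chr1", 100, "A", "G", 50.0, "Heterozygous"),
    Variant("chr1", 200, "C", "T", 100.0, "Homozygous"),
]
print(select_loh_variants(somatic, germline, require_heterozygous=True))
```

With the matched germline track, only the variant at position 100 is kept, because the germline variant at position 200 is homozygous.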

The model operates with a number of ploidy states, which are characterized by their numbers of paternal and maternal alleles (Table 11.1). The state together with the tumor purity (the percentage of cells in the sample originating from the tumor) determines the expected coverage ratio and the expected allele frequencies of the heterozygous variants. As an example, if a normal diploid sample would yield 200 reads, then a sample with purity 50% and copy-number 1 (deletion) would yield 150 reads (50%*200 + 50%*100). That means the coverage ratio is 150/200 = 75%. Table 11.2 shows the expected coverage ratios for different states and purities.
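The coverage arithmetic above generalizes to any purity and copy number. The following is a small sketch that follows the worked example and reproduces the values in Table 11.2; the names `COPY_NUMBER` and `expected_coverage_ratio` are illustrative and are not part of the tool.

```python
# Total copy number of each ploidy state, as listed in Table 11.1.
COPY_NUMBER = {
    "Bi-allelic deletion": 0,
    "Deletion": 1,
    "Diploid": 2,
    "Uniparental disomy": 2,
    "Duplication": 3,
    "WGD": 4,
}


def expected_coverage_ratio(purity, copy_number):
    """Expected coverage relative to a normal diploid sample.

    Normal cells (fraction 1 - purity) carry 2 copies of the region,
    tumor cells (fraction purity) carry `copy_number` copies.
    `purity` is a fraction between 0 and 1.
    """
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must be between 0 and 1")
    return ((1.0 - purity) * 2 + purity * copy_number) / 2.0


# Print the row of Table 11.2 for a purity of 50%.
for state, copies in COPY_NUMBER.items():
    print(f"{state}: {expected_coverage_ratio(0.5, copies):.1%}")
```

For purity 50% and a deletion, the function gives 0.75, matching the 150/200 = 75% computed in the example.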

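The state together with tumor purity also determines the expected allele frequencies of heterozygous variants. As an example, consider a sample with 60% purity where the cancer cells contain a deletion in a region with two alleles, A and B. If we take 100 cells:

The 40 normal cells each carry one copy of A and one copy of B, contributing 40 copies of A and 40 copies of B.
The 60 cancer cells each carry one copy of A and no copy of B (the allele lost in the deletion), contributing 60 copies of A.

In total there will be 100 copies of allele A and 40 copies of B, so the frequency of A will be 100 / (100 + 40) = 71.4%.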
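The same counting can be written as a short function. This is a sketch under the assumptions of the example above (normal cells carry one copy of each allele, tumor cells carry the allele counts of their ploidy state) and it reproduces the values in Table 11.3; `ALLELE_COUNTS` and `expected_allele_frequency` are illustrative names, not part of the tool.

```python
# (minor, major) allele copies in tumor cells, from the allele ratios in Table 11.1.
ALLELE_COUNTS = {
    "Bi-allelic deletion": (0, 0),
    "Deletion": (0, 1),
    "Diploid": (1, 1),
    "Uniparental disomy": (0, 2),
    "Duplication": (1, 2),
    "WGD": (2, 2),
}


def expected_allele_frequency(purity, minor, major):
    """Expected frequency of the more abundant allele of a variant that is
    heterozygous in normal tissue.

    Normal cells (fraction 1 - purity) carry one copy of each allele;
    tumor cells (fraction purity) carry `minor` and `major` copies.
    Returns None when no copies of the region remain in the sample.
    """
    major_copies = (1.0 - purity) * 1 + purity * major
    minor_copies = (1.0 - purity) * 1 + purity * minor
    total = major_copies + minor_copies
    if total == 0:
        return None  # 100% purity with a bi-allelic deletion: undefined
    return major_copies / total


minor, major = ALLELE_COUNTS["Deletion"]
print(f"{expected_allele_frequency(0.6, minor, major):.1%}")
```

For 60% purity and a deletion this yields 100 / 140, the 71.4% from the example. The undefined case corresponds to the empty cell in the last row of Table 11.3.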

The tool estimates the purity using a hidden Markov model (HMM), which is then used to predict the most probable state for each target.
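To give an idea of how an HMM can assign a state to each target, the sketch below runs Viterbi decoding over a series of coverage ratios for a known purity. It is not the tool's implementation: it uses coverage only (no allele frequencies), does not estimate purity, and the Gaussian emission model, the standard deviation `sd`, the transition probability `stay_prob` and the reduced state set are all assumptions made for illustration.

```python
import math

# Reduced, illustrative state set: state name -> total copy number.
STATES = {"Deletion": 1, "Diploid": 2, "Duplication": 3}


def log_gauss(x, mean, sd):
    """Log density of a normal distribution (assumed emission model)."""
    return -0.5 * ((x - mean) / sd) ** 2 - math.log(sd * math.sqrt(2 * math.pi))


def viterbi_states(coverage_ratios, purity, sd=0.05, stay_prob=0.9):
    """Most probable state path for a list of per-target coverage ratios."""
    if not coverage_ratios:
        return []
    names = list(STATES)
    # Expected coverage ratio of each state at the given purity.
    means = {s: ((1 - purity) * 2 + purity * STATES[s]) / 2 for s in names}
    log_stay = math.log(stay_prob)
    log_switch = math.log((1 - stay_prob) / (len(names) - 1))

    def transition(prev, cur):
        return log_stay if prev == cur else log_switch

    # Uniform start probabilities plus the emission of the first target.
    score = {
        s: -math.log(len(names)) + log_gauss(coverage_ratios[0], means[s], sd)
        for s in names
    }
    back_pointers = []
    for x in coverage_ratios[1:]:
        new_score, pointers = {}, {}
        for cur in names:
            best_prev = max(names, key=lambda p: score[p] + transition(p, cur))
            pointers[cur] = best_prev
            new_score[cur] = (
                score[best_prev] + transition(best_prev, cur)
                + log_gauss(x, means[cur], sd)
            )
        back_pointers.append(pointers)
        score = new_score

    # Trace back from the best final state.
    state = max(names, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back_pointers):
        state = pointers[state]
        path.append(state)
    return path[::-1]


print(viterbi_states([1.0, 1.0, 0.75, 0.76, 0.74, 1.0], purity=0.5))
```

At 50% purity the three targets with coverage ratios near 75% are assigned to the deletion state, in line with Table 11.2, while the remaining targets stay diploid.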


Table 11.1: Ploidy states with their allele ratio, total copy number and whether the state is considered loss-of-heterozygosity.
State                 Allele-ratio   Copy-number   Loss-of-heterozygosity
Bi-allelic deletion   0:0            0
Deletion              0:1            1             deletion LOH
Diploid               1:1            2
Uniparental disomy    0:2            2             copy-neutral LOH
Duplication           1:2            3
WGD                   2:2            4



Table 11.2: The expected coverage levels given tumor purity and the ploidy state.
Purity   Bi-allelic deletion   Deletion   Diploid   Uniparental disomy   Duplication   WGD
10.0%    90.0%                 95.0%      100.0%    100.0%               105.0%        110.0%
20.0%    80.0%                 90.0%      100.0%    100.0%               110.0%        120.0%
30.0%    70.0%                 85.0%      100.0%    100.0%               115.0%        130.0%
40.0%    60.0%                 80.0%      100.0%    100.0%               120.0%        140.0%
50.0%    50.0%                 75.0%      100.0%    100.0%               125.0%        150.0%
60.0%    40.0%                 70.0%      100.0%    100.0%               130.0%        160.0%
70.0%    30.0%                 65.0%      100.0%    100.0%               135.0%        170.0%
80.0%    20.0%                 60.0%      100.0%    100.0%               140.0%        180.0%
90.0%    10.0%                 55.0%      100.0%    100.0%               145.0%        190.0%
100.0%   0.0%                  50.0%      100.0%    100.0%               150.0%        200.0%



Table 11.3: The expected frequencies of variants that are heterozygous in the normal tissue given tumor purity and the ploidy state.
Purity   Bi-allelic deletion   Deletion   Diploid   Uniparental disomy   Duplication   WGD
10.0%    50.0%                 52.6%      50.0%     55.0%                52.4%         50.0%
20.0%    50.0%                 55.6%      50.0%     60.0%                54.5%         50.0%
30.0%    50.0%                 58.8%      50.0%     65.0%                56.5%         50.0%
40.0%    50.0%                 62.5%      50.0%     70.0%                58.3%         50.0%
50.0%    50.0%                 66.7%      50.0%     75.0%                60.0%         50.0%
60.0%    50.0%                 71.4%      50.0%     80.0%                61.5%         50.0%
70.0%    50.0%                 76.9%      50.0%     85.0%                63.0%         50.0%
80.0%    50.0%                 83.3%      50.0%     90.0%                64.3%         50.0%
90.0%    50.0%                 90.9%      50.0%     95.0%                65.5%         50.0%
100.0%   n/a                   100.0%     50.0%     100.0%               66.7%         50.0%


Limitations

Detect Regional Ploidy is designed for ploidy estimation on autosomal chromosomes. The underlying model does not take into account that the normal state of sex chromosomes in male samples is haploid, and hence it may misinterpret the detected allele frequencies and coverage ratios for these chromosomes. If the tool is used to estimate ploidy for sex chromosomes, the results should be carefully assessed.


