Structural variant detection

The LightSpeed Fastq to Germline Variants and LightSpeed Fastq to Somatic Variants tools can infer tandem duplications and inversions from unaligned ends during the realignment step.

This works by:

  1. Identifying breakpoints where multiple reads share a common unaligned end at the same position.
  2. Aligning the sequence at each identified breakpoint (the unaligned end and the upstream sequence) to the sequences at the other breakpoints and to the reference sequence upstream or downstream of the breakpoints, to find likely matches.
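Step 1 can be illustrated with a minimal sketch. This is not the tool's implementation: the tuple layout, the `min_reads` threshold, and the rule that unaligned ends count as "common" when each is a prefix of the longest one are all assumptions made for illustration. Sequences are assumed to be oriented so that they read away from the breakpoint.

```python
from collections import defaultdict


def find_breakpoints(unaligned_ends, min_reads=2):
    """Return positions where several reads share a common unaligned end.

    unaligned_ends: iterable of (chrom, pos, side, sequence) tuples
    (hypothetical representation; side is "left" or "right").
    min_reads: assumed minimum number of supporting reads.
    """
    # Collect the unaligned end sequences seen at each position and side.
    groups = defaultdict(list)
    for chrom, pos, side, seq in unaligned_ends:
        groups[(chrom, pos, side)].append(seq)

    breakpoints = {}
    for key, seqs in groups.items():
        if len(seqs) < min_reads:
            continue
        # Assumption: ends are "common" if every end is a prefix of the longest.
        longest = max(seqs, key=len)
        if all(longest.startswith(s) for s in seqs):
            breakpoints[key] = {"support": len(seqs), "sequence": longest}
    return breakpoints


ends = [
    ("chr1", 100, "right", "ACGT"),
    ("chr1", 100, "right", "ACGTTT"),
    ("chr1", 250, "right", "GG"),
]
result = find_breakpoints(ends)
```

Here only position 100 on chr1 qualifies, because the single unaligned end at position 250 is not shared by multiple reads.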

Tandem duplications can be detected from pairs of breakpoints that are within 1000 base pairs of each other. Tandem duplications are reported in the variant track, and those inferred only from unaligned ends are annotated with Yes in the variant track column Inferred from unaligned ends.
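The distance constraint on breakpoint pairs can be sketched as follows. The 1000 base pair limit comes from the text above; the `(chrom, pos)` representation, the function name, and treating the limit as inclusive are assumptions for illustration only.

```python
from itertools import combinations

MAX_DISTANCE = 1000  # base pairs, as stated in the text


def candidate_pairs(breakpoints, max_distance=MAX_DISTANCE):
    """Return pairs of breakpoints close enough to support a tandem duplication.

    breakpoints: list of (chrom, pos) tuples (hypothetical representation).
    The limit is treated as inclusive here, which is an assumption.
    """
    pairs = []
    for (c1, p1), (c2, p2) in combinations(sorted(breakpoints), 2):
        if c1 == c2 and abs(p2 - p1) <= max_distance:
            pairs.append(((c1, p1), (c2, p2)))
    return pairs


bps = [("chr1", 100), ("chr1", 900), ("chr1", 2500), ("chr2", 150)]
pairs = candidate_pairs(bps)
```

In this example only the breakpoints at positions 100 and 900 on chr1 form a candidate pair; the others are too far apart or on a different chromosome.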

When tandem duplications are inferred from unaligned ends, the allele count and coverage are estimated based on the breakpoint with the highest unaligned end count, assuming the following:

In situations where the assumptions are not met, such as for some targeted data where the reads are not evenly distributed, the count and coverage estimates may be inaccurate. The breakpoints used for inference are detected prior to realignment, so the unaligned ends cannot necessarily be found in the output read mapping, where the alignment of the reads may have changed.

Inversions can be detected from pairs of breakpoints on the same chromosome. The tool only reports the longest possible inversion when multiple breakpoints support similar inversions.

Default inversion detection requires breakpoint support from both sides of the breakpoint, but the option Lenient inversion detection allows detection of inversions where each breakpoint is only supported by reads from one side of the breakpoint. Lenient inversion detection can be relevant when analyzing targeted data. Enabling Lenient inversion detection can lead to detection of more false-positive inversions, and is also likely to increase the processing time. Identified inversions are reported in the inversions track.
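The difference between default and lenient detection can be sketched as a support check on each breakpoint of a candidate pair. This is an illustrative assumption about how the rule might be expressed, not the tool's code: the dictionary layout with "left" and "right" read counts and the function name are hypothetical.

```python
def inversion_supported(bp_a, bp_b, lenient=False):
    """Check whether a pair of breakpoints has enough support for an inversion.

    bp_a, bp_b: dicts with "left" and "right" supporting read counts
    (hypothetical representation).
    lenient: if True, one supported side per breakpoint is enough;
    otherwise both sides of each breakpoint must be supported.
    """
    def supported_sides(bp):
        return sum(1 for side in ("left", "right") if bp.get(side, 0) > 0)

    required = 1 if lenient else 2
    return supported_sides(bp_a) >= required and supported_sides(bp_b) >= required


one_sided = {"left": 4, "right": 0}
two_sided = {"left": 3, "right": 5}

default_call = inversion_supported(one_sided, two_sided)
lenient_call = inversion_supported(one_sided, two_sided, lenient=True)
```

With these inputs the default check rejects the pair because one breakpoint lacks support from its right side, while the lenient check accepts it. This also shows why the lenient setting can admit more false positives.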