Algorithmic details
- Generating a consensus: The consensus of the unaligned ends is calculated by simple alignment without gaps. Having created the consensus, we exclude the unaligned ends which differ by more than 20% from the consensus, and recalculate the consensus. This prevents 'spuriously' unaligned ends that extend longer than other unaligned ends from impacting the tail of the consensus unaligned end.
- Mapping of the consensus:
- 'Cross mapping': When mapping the consensus sequences against the reference sequence around other breakpoints we require that:
- The consensus is at least 16 bp long.
- The score of the alignment is at least 70% of the maximal possible score of the alignment.
- 'Aligning': When aligning the consensus sequences of two closely located breakpoints against each other we require that:
- The breakpoints are within a 100 bp distance of each other.
- The overlap in the alignment of the consensus sequences is at least 4 nucleotides long.
- 'Self-mapping': When mapping the consensus sequences of breakpoints against the reference sequence in a region around the breakpoint itself we require that:
- The consensus is at least 9 bp long.
- A match is found within a 400 bp window of the breakpoint.
- The score of the alignment is at least 90% of the maximal possible score of the alignment of the part of the consensus sequence that does not include the variant allele part.
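The consensus step and the mapping thresholds listed above can be sketched in Python as follows. This is a minimal illustration, not the tool's actual implementation. All function and parameter names are hypothetical. It assumes that the unaligned ends are oriented so that index 0 is the base next to the breakpoint, that "alignment without gaps" means a column-wise majority vote, and that "differ by more than 20%" refers to the mismatch fraction over the positions an end shares with the consensus. How alignment scores are computed and how the 400 bp window is positioned are not specified in the text, so both are passed in as inputs.

```python
from collections import Counter


def column_consensus(ends):
    """Gapless consensus: majority base per column, anchored at index 0."""
    length = max((len(e) for e in ends), default=0)
    bases = []
    for i in range(length):
        # Only ends long enough to cover column i get a vote.
        column = Counter(e[i] for e in ends if len(e) > i)
        bases.append(column.most_common(1)[0][0])
    return "".join(bases)


def mismatch_fraction(end, consensus):
    """Fraction of compared positions where the end differs (assumed metric)."""
    n = min(len(end), len(consensus))
    if n == 0:
        return 0.0
    return sum(a != b for a, b in zip(end, consensus)) / n


def refined_consensus(ends, max_diff=0.20):
    """Build a consensus, drop ends differing by more than 20%, rebuild."""
    first_pass = column_consensus(ends)
    kept = [e for e in ends if mismatch_fraction(e, first_pass) <= max_diff]
    return column_consensus(kept)


def passes_cross_mapping(consensus_len, score, max_score):
    """Consensus >= 16 bp and score >= 70% of the maximal possible score."""
    return consensus_len >= 16 and score >= 0.70 * max_score


def passes_aligning(breakpoint_distance, overlap_len):
    """Breakpoints within 100 bp and alignment overlap >= 4 nucleotides."""
    return breakpoint_distance <= 100 and overlap_len >= 4


def passes_self_mapping(consensus_len, match_in_window, score, max_score_excl_variant):
    """Consensus >= 9 bp, match inside the 400 bp window, score >= 90% of the
    maximal score for the part of the consensus excluding the variant allele."""
    return (
        consensus_len >= 9
        and match_in_window
        and score >= 0.90 * max_score_excl_variant
    )


# Made-up example: the long poly-T end is spurious and is dropped,
# so it no longer shapes the tail of the consensus.
example_ends = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTACGG", "TTTTTTTTTTTTTTTT"]
print(refined_consensus(example_ends))  # -> ACGTACGTACGG
```

In the example, the first-pass consensus is extended to 16 bases by the poly-T end. After that end is excluded for exceeding the 20% threshold, the recalculated consensus ends where the remaining reads end.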