Limitations
Data
LightSpeed is developed for and has been optimized on Illumina paired-end short-read sequencing data. Paired-end sequencing data from other platforms that use the same data structure and similar read lengths can be expected to perform equally well with LightSpeed unless the background error rate is markedly different. Analysis of other types of sequencing reads may not result in similar processing times or variant calls of equivalent quality. Reads longer than 800 base pairs cannot be processed.
Variant detection
The germline variant detection algorithm in LightSpeed is based on a model expecting diploid genomes. Therefore, LightSpeed cannot be expected to accurately detect germline variants in genomes with other ploidies. In addition, alternate ploidies of sex chromosomes are not considered in the variant detection algorithm.
Somatic variant detection with LightSpeed is possible for variants down to a variant allele frequency of 0.1%. Variants below this frequency will not be considered. However, to ensure high accuracy in variant calling, we recommend only calling variants down to a variant allele frequency of approximately 1%.
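The two somatic thresholds can be illustrated with a minimal Python sketch. This is not LightSpeed code: the function name `classify_vaf`, the constant names, and the category labels are hypothetical, and the variant allele frequency is assumed here to be the simple ratio of variant-supporting reads to total reads at a position.

```python
# Illustrative thresholds taken from the text above (names are hypothetical).
DETECTION_FLOOR = 0.001    # 0.1%: variants below this are not considered
RECOMMENDED_FLOOR = 0.01   # approx. 1%: recommended lower limit for calling


def classify_vaf(variant_reads: int, total_reads: int) -> str:
    """Place a candidate variant relative to the two frequency thresholds."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    # Assumption: VAF is the fraction of reads supporting the variant.
    vaf = variant_reads / total_reads
    if vaf < DETECTION_FLOOR:
        return "not considered"
    if vaf < RECOMMENDED_FLOOR:
        return "detectable, below recommended threshold"
    return "within recommended range"


if __name__ == "__main__":
    for variant_reads, total_reads in [(1, 2000), (5, 1000), (20, 1000)]:
        print(variant_reads, total_reads, classify_vaf(variant_reads, total_reads))
```

The middle category covers variants that can be detected but fall below the recommended limit of approximately 1%.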
Reference sequence
LightSpeed considers all chromosomes to be linear. Hence, for read mapping, circular chromosomes are linearized with position 1 starting at the junction of the chromosome. No reads will be mapped across the junction of circular chromosomes.
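To see which fragments are affected by linearization, the following Python sketch checks whether a sequence of a given length, starting at a given 1-based position, would extend past the end of the linearized chromosome and therefore cross the junction. The function name `spans_junction` and the chromosome length used in the example are hypothetical and only serve as illustration; this is not how LightSpeed itself handles such reads.

```python
def spans_junction(start: int, length: int, chrom_length: int) -> bool:
    """Return True if a fragment would cross the circular junction.

    Positions are 1-based on the linearized chromosome, where position 1
    is the first base after the junction and chrom_length is the last base
    before it.
    """
    if not 1 <= start <= chrom_length:
        raise ValueError("start must lie within the chromosome")
    if length < 1:
        raise ValueError("length must be positive")
    end = start + length - 1  # last base covered by the fragment
    return end > chrom_length


if __name__ == "__main__":
    chrom_length = 16_569  # hypothetical circular chromosome length
    print(spans_junction(100, 150, chrom_length))     # well inside the sequence
    print(spans_junction(16_500, 150, chrom_length))  # runs past the last base
```

Because fragments for which this check is true are not mapped across the junction, coverage close to both ends of the linearized sequence may be lower than elsewhere on the chromosome.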
UMI grouping
- The maximum number of reads used for creating a UMI consensus read is 100,000. Therefore, UMI groups with more than 100,000 reads will result in more than one UMI consensus read.
- LightSpeed UMI grouping requires that reads have similar mapping positions. In data from single primer extension protocols, such as many primer-based QIAseq protocols, read pairs representing the same DNA fragment with the same UMI sequence can originate from different primers. This can happen if primers in the same direction are located near each other, making it possible for a downstream primer to amplify a PCR product generated from an upstream primer. LightSpeed will not group reads originating from different primers.
- When UMIs are used to group reads, the UMI sequences are compared base by base. If an insertion or deletion is present near the beginning of a UMI sequence, this will likely prevent the reads from being grouped, because the bases after the insertion or deletion are shifted and most of them will no longer match.
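The effect of a base-by-base comparison can be shown with a short Python sketch that counts mismatching positions between two UMIs of equal length. The UMI sequences and the function name `count_mismatches` are made up for illustration, and the sketch does not reproduce LightSpeed's grouping logic or any mismatch tolerance it may apply. It only shows why a single deleted base near the start of a UMI costs far more than a single substituted base.

```python
def count_mismatches(umi_a: str, umi_b: str) -> int:
    """Count positions at which two equal-length UMIs differ."""
    if len(umi_a) != len(umi_b):
        raise ValueError("UMIs must have the same length")
    return sum(a != b for a, b in zip(umi_a, umi_b))


if __name__ == "__main__":
    umi = "ACGTTGCAAGCT"  # hypothetical 12-base UMI

    # One substitution at the second base: only that position differs.
    substituted = umi[:1] + "G" + umi[2:]

    # One deletion at the second base: every later base shifts left by one,
    # and the next base of the read (assumed here to be "G") fills the end.
    deleted = umi[:1] + umi[2:] + "G"

    print("substitution:", count_mismatches(umi, substituted))
    print("deletion:    ", count_mismatches(umi, deleted))
```

With the deletion, only positions that happen to hold the same base after the shift still match, so the mismatch count is much higher than for the substitution.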