Known limitations

clc_remove_duplicates has a limitation when duplicate reads represent more than one allele. The algorithm identifies duplicate reads to be removed, but it cannot distinguish sequencing errors from true variation between those reads. If a heterozygous SNP lies in a region covered by duplicate reads, you risk that only one of the alleles is represented in the data after running this tool.
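
The effect can be illustrated with a small toy model. The sketch below is not the clc_remove_duplicates algorithm; it assumes, purely for illustration, that reads sharing a start position and differing by at most a small number of mismatches are treated as duplicates, and that the first read seen is the one kept. The function name toy_remove_duplicates, the max_mismatches parameter, and the read data are all hypothetical.

```python
def hamming(a, b):
    """Count mismatching positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))


def toy_remove_duplicates(reads, max_mismatches=1):
    """Toy model only, NOT the clc_remove_duplicates algorithm.

    reads: list of (start, sequence) tuples.
    Assumption: reads with the same start and length that differ by at
    most max_mismatches count as duplicates; the first one seen is kept.
    """
    kept = []
    for start, seq in reads:
        is_duplicate = any(
            k_start == start
            and len(k_seq) == len(seq)
            and hamming(k_seq, seq) <= max_mismatches
            for k_start, k_seq in kept
        )
        if not is_duplicate:
            kept.append((start, seq))
    return kept


# Four reads at the same start position, two per allele of a
# heterozygous SNP (A or T at the fifth base).
reads = [
    (100, "ACGTACGTAC"),
    (100, "ACGTACGTAC"),
    (100, "ACGTTCGTAC"),
    (100, "ACGTTCGTAC"),
]

kept = toy_remove_duplicates(reads)
print(kept)  # only the A allele remains
```

Because the single-base difference is tolerated as a possible sequencing error, the reads carrying the second allele are discarded together with the true duplicates. Setting max_mismatches=0 in this toy model keeps one read per allele, but would then also keep every duplicate that carries a sequencing error.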