QIAGEN Bioinformatics Manuals

Paired data

For paired reads, we check whether different read pairs mapping to a given region have identical sequence over their first 10 base pairs. Pairs that do are marked as potential duplicate reads. (We have found that this approach works better than a statistical approach for paired data.) We then check the proposed list of duplicates for false positives, and we also check for closely related variants of the sequences we now believe to represent duplicate reads.

The algorithm also takes sequencing errors into account when filtering duplicates from paired data.
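The first step described above, flagging pairs whose leading bases match, can be illustrated with a minimal Python sketch. This is not the tool's actual implementation. The `ReadPair` type, the `region` label, and the `flag_potential_duplicates` function are hypothetical names introduced only for illustration, and the sketch assumes that the first 10 bases of both reads in a pair are compared. It does not attempt the later false-positive check, the search for closely related variants, or any handling of sequencing errors.

```python
from collections import defaultdict
from typing import NamedTuple

PREFIX_LEN = 10  # number of leading bases compared, as stated in the text


class ReadPair(NamedTuple):
    """Hypothetical container for one mapped read pair."""
    name: str
    region: str   # assumed label for the region the pair maps to
    first: str    # sequence of the first read in the pair
    second: str   # sequence of the second read in the pair


def flag_potential_duplicates(pairs, prefix_len=PREFIX_LEN):
    """Group pairs by region and leading bases; return groups of size > 1.

    Assumption: both reads of a pair must share their leading bases with
    another pair in the same region to be flagged. Exact matching only.
    """
    groups = defaultdict(list)
    for pair in pairs:
        key = (pair.region, pair.first[:prefix_len], pair.second[:prefix_len])
        groups[key].append(pair.name)
    # Only groups with more than one pair contain potential duplicates.
    return [names for names in groups.values() if len(names) > 1]


if __name__ == "__main__":
    example = [
        ReadPair("p1", "chr1:100-200", "ACGTACGTACGGG", "TTTTGGGGCCCCA"),
        ReadPair("p2", "chr1:100-200", "ACGTACGTACTTT", "TTTTGGGGCCAAA"),
        ReadPair("p3", "chr1:100-200", "ACGTACGTAGGGG", "TTTTGGGGCCCCA"),
    ]
    print(flag_potential_duplicates(example))
```

Grouping by a dictionary key keeps the comparison linear in the number of pairs. The groups returned here are only candidates; in the procedure described above they would still be checked for false positives.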
