Map Reads to Reference

Read mapping is a fundamental step in most applications of high-throughput sequencing data. The CLC Genomics Workbench includes read mapping in several other tools (such as the Map Reads to Contigs tool and RNA-Seq Analysis), but this chapter focuses on the core read mapping algorithm. The end of the chapter describes the read mapping reports and a tool for merging read mappings.

There are two versions of the core mapper: one for color space data and one for base space data. White papers with detailed benchmarks and descriptions of both algorithms are available.

In addition, the mapper has been improved to work with PacBio reads and reads longer than 500 bp. Before the Map Reads to Reference tool starts mapping the reads, it checks the input sequence list(s) to decide which mapping algorithm to use.

It is possible to mix sequence lists that have the read group "PacBio" with sequence lists that have a different read group in the same mapping. In this case, the appropriate mapping algorithm is applied to each sequence list.

In contrast, it is not possible to mix color space and base space data in the same mapping.
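
The two mixing rules can be illustrated with a short Python sketch. It is not the Workbench's implementation and is only meant to make the rules concrete. The `SequenceList` class, its fields, the `choose_algorithms` function, the algorithm labels "pacbio" and "default", and the example read group "Illumina" are all hypothetical names introduced for this illustration. The sketch also assumes that the read group alone decides the algorithm, which simplifies whatever checks the tool actually performs.

```python
from dataclasses import dataclass


@dataclass
class SequenceList:
    """Hypothetical stand-in for a Workbench sequence list."""
    name: str
    read_group: str    # e.g. "PacBio"
    color_space: bool  # True for color space data, False for base space


def choose_algorithms(inputs):
    """Return {list name: algorithm label} for a single mapping run."""
    # Color space and base space data cannot be combined in one mapping.
    if len({seq_list.color_space for seq_list in inputs}) > 1:
        raise ValueError("cannot mix color space and base space data")

    # Lists with different read groups may be combined; the algorithm is
    # chosen per list. The labels are illustrative, not real option names.
    return {
        seq_list.name: "pacbio" if seq_list.read_group == "PacBio" else "default"
        for seq_list in inputs
    }


# Example: one PacBio list and one list with another read group.
example = [
    SequenceList("long_reads", "PacBio", color_space=False),
    SequenceList("short_reads", "Illumina", color_space=False),
]
print(choose_algorithms(example))
# {'long_reads': 'pacbio', 'short_reads': 'default'}
```

Adding a color space list to `example` would raise a `ValueError`, mirroring the restriction that color space and base space data cannot share a mapping.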