Read Mapping
There are two programs within the CLC Assembly Cell for mapping reads to one or more reference sequences: clc_mapper for mapping in base space and clc_mapper_legacy for mapping in color space.
The aim of both programs is the same: to map reads to the area of a reference sequence that they are likely to have originated from. In both cases, the alignment quality threshold is given as two fractions: the minimum fraction of the read that must be aligned, and the minimum fraction of positions within that aligned part that must match. For example, the threshold may be set at 90% identity over 50% of the read length. A gapped alignment is always performed.
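The two-part threshold can be illustrated with a minimal Python sketch. It is not the mappers' implementation: the function name, its parameters, and the exact way identity is counted are assumptions made for illustration, using the 90% identity over 50% of the read length example from above as defaults.

```python
def passes_threshold(read_length, aligned_read_bases, matching_positions,
                     min_length_fraction=0.5, min_identity=0.9):
    """Return True if an alignment meets both parts of the threshold.

    Hypothetical helper for illustration only. It assumes identity is
    counted as matching positions divided by aligned read bases.
    """
    if read_length <= 0 or aligned_read_bases <= 0:
        return False
    # Fraction of the read that takes part in the alignment.
    length_fraction = aligned_read_bases / read_length
    # Fraction of the aligned part that matches the reference.
    identity = matching_positions / aligned_read_bases
    return length_fraction >= min_length_fraction and identity >= min_identity


# A 100-base read with 60 aligned bases, 55 of which match the reference.
print(passes_threshold(100, 60, 55))
# The same read with only 40 aligned bases, all of them matching.
print(passes_threshold(100, 40, 40))
```

In this sketch the first read is accepted because both fractions are met, while the second is rejected: its aligned part matches perfectly but covers less than half of the read.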
By default, read mapping is done with local alignment of reads to a set of reference sequences. The advantage of performing local alignment rather than global alignment is that the ends of a read are automatically left out of the alignment if they contain sufficiently many sequencing errors. This can also be beneficial if the ends of the reads contain vector contamination or adapter sequences.
An option exists to run global alignment instead of local alignment if this is desired.
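The difference between the two alignment modes can be sketched with textbook dynamic programming. The example below uses Smith-Waterman style scoring for local alignment and Needleman-Wunsch style scoring for global alignment. The function name, the scoring values, and the linear gap cost are assumptions chosen for illustration, and nothing here describes how the mappers compute alignments internally.

```python
def alignment_score(read, ref, local=True, match=1, mismatch=-2, gap=-3):
    """Score the best alignment of read against ref.

    Illustrative only: local=True gives Smith-Waterman style scoring,
    local=False gives Needleman-Wunsch style scoring. The scoring values
    are arbitrary assumptions, not the mappers' defaults.
    """
    rows, cols = len(read) + 1, len(ref) + 1
    score = [[0] * cols for _ in range(rows)]
    if not local:
        # Global alignment must pay for leading gaps.
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            same = read[i - 1] == ref[j - 1]
            diag = score[i - 1][j - 1] + (match if same else mismatch)
            up = score[i - 1][j] + gap
            left = score[i][j - 1] + gap
            cell = max(diag, up, left)
            if local:
                # Local alignment may restart anywhere, so scores never
                # drop below zero and poor read ends are simply skipped.
                cell = max(cell, 0)
                best = max(best, cell)
            score[i][j] = cell
    # Local: best cell anywhere. Global: the corner cell, end to end.
    return best if local else score[-1][-1]


# The first four bases of the read do not match the reference,
# e.g. because of sequencing errors or adapter sequence.
read = "TTTTACGTACGTAC"
ref = "GGGGACGTACGTAC"
print(alignment_score(read, ref, local=True))
print(alignment_score(read, ref, local=False))
```

With these assumed scores, the local alignment covers only the ten matching bases, whereas the global alignment is forced to include the four mismatching bases at the start of the read and scores lower as a result.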
In cases where memory consumption is an issue, clc_mapper_legacy can be used for base space mapping, as its memory consumption is scalable. However, we recommend using clc_mapper for base space mapping when possible, as it performs better in terms of both quality and speed.
Subsections
- Overview of base space mapping
- Circular references
- Saving and re-using reference index files
- Overview of color space mapping
- General information for both read mappers
- Running Read Mapping Analyses
- Mixed base space and color space mappings