Local realignment
Regions where the read mapping is likely to be improved through local realignment are identified and realigned. These are generally regions where reads do not align perfectly and the imperfect alignments are unlikely to be caused by sequencing errors. One example is a region where the read mapping contains long unaligned ends relative to the reference, which may represent long insertions or deletions.
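As an illustration only, the following Python sketch flags reference positions where several reads have long unaligned ends. It assumes that unaligned ends are recorded as soft clips (`S`) in SAM-style CIGAR strings; the function names and the `min_unaligned` and `min_reads` thresholds are hypothetical and are not taken from the tool itself.

```python
import re
from collections import Counter

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def unaligned_end_lengths(cigar):
    """Return (left, right) soft-clip lengths of a SAM-style CIGAR string."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Hard clips carry no sequence, so ignore them when looking at the ends.
    core = [(n, op) for n, op in ops if op != "H"]
    left = core[0][0] if core and core[0][1] == "S" else 0
    right = core[-1][0] if len(core) > 1 and core[-1][1] == "S" else 0
    return left, right

def candidate_positions(reads, min_unaligned=10, min_reads=2):
    """Return sorted breakpoint positions supported by enough unaligned ends.

    reads: iterable of (start, cigar), where start is the 0-based reference
    position of the first aligned base. Both thresholds are illustrative.
    """
    counts = Counter()
    for start, cigar in reads:
        left, right = unaligned_end_lengths(cigar)
        if left >= min_unaligned:
            counts[start] += 1
        if right >= min_unaligned:
            # Reference bases consumed by the aligned part of the read.
            ref_len = sum(int(n) for n, op in CIGAR_OP.findall(cigar)
                          if op in "MDN=X")
            counts[start + ref_len] += 1
    return sorted(p for p, c in counts.items() if c >= min_reads)
```

For example, three reads whose unaligned ends all meet at position 150 would mark that position as a candidate, while a fully aligned read contributes nothing.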
During local realignment, the following steps are performed for each identified region:
- A graph is built, containing nucleotide sequence paths corresponding to all reads as well as the reference. If significant unaligned end breakpoints are found in the original read mapping, the graph construction for that region may involve de novo assembly of the reads.
- The graph undergoes refinement where paths that are unlikely to contain variants relative to the reference path are removed, and additional variants such as structural variants inferred from indirect evidence may be added.
- Finally, every read that intersects the region of interest is realigned against the graph.
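The refinement and realignment steps above can be sketched in Python under strong simplifying assumptions. Here the graph is reduced to a plain list of candidate path sequences with the reference first, "unlikely" paths are approximated as alternate paths that are the best match for fewer than `min_support` reads, and realignment picks the path with the smallest edit distance to the read. The names `fit_distance`, `refine_paths`, `realign` and `min_support` are hypothetical, and this is not the algorithm the tool actually uses.

```python
def fit_distance(read, path):
    """Smallest edit distance between read and any substring of path."""
    prev = [0] * (len(path) + 1)  # the read may start anywhere on the path
    for i, r in enumerate(read, start=1):
        cur = [i] + [0] * len(path)
        for j, p in enumerate(path, start=1):
            cur[j] = min(prev[j - 1] + (r != p),  # match or mismatch
                         prev[j] + 1,             # read base left unmatched
                         cur[j - 1] + 1)          # path base skipped
        prev = cur
    return min(prev)  # the read may end anywhere on the path

def best_path(read, paths):
    """Best-fitting path; ties go to the earliest path (the reference)."""
    return min(paths, key=lambda path: fit_distance(read, path))

def refine_paths(reference, alternates, reads, min_support=2):
    """Keep the reference plus alternates that enough reads fit best."""
    paths = [reference] + list(alternates)
    support = {path: 0 for path in paths}
    for read in reads:
        support[best_path(read, paths)] += 1
    kept = [a for a in alternates if support[a] >= min_support]
    return [reference] + kept

def realign(reads, paths):
    """Assign every read to its best-fitting remaining path."""
    return {read: best_path(read, paths) for read in reads}
```

With a reference `AAACCCGGGTTTACGT`, an alternate path lacking `GGG` that two reads match exactly, and a second alternate matched by a single read, `refine_paths` keeps only the reference and the first alternate, and `realign` places the lone read back on the reference with one mismatch.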