The Correct PacBio Reads tool is a preprocessing step for SMRT sequencing reads with high error rates. Run it before assembling the reads with the De Novo Assemble PacBio Reads tool: correcting the reads first improves their quality and thereby yields a better assembly. Both tools are designed for assembly of microbial genomes and small eukaryotic genomes (for example, C. elegans).
SMRT sequencing technologies, as implemented by Pacific Biosciences™, have the potential to vastly improve the completeness of genome sequence assemblies, because read lengths often exceed the length of most repeats in the genome. A major obstacle is the high rate of sequencing errors (10-15%) in SMRT reads. A second obstacle is the presence of chimeric reads and of sequences derived from untrimmed adapters, which can be hard to recognize given the rate of errors and truncations. However, because sequencing errors are mostly random and reads are randomly sampled across the genome, it is possible to i) correct SMRT sequencing reads with the Correct PacBio Reads tool, provided coverage is sufficiently high, and ii) assemble the error-corrected reads into high-quality contigs with the De Novo Assemble PacBio Reads tool. Note that error correction is required only when performing de novo assembly using long reads. It is not necessary to correct PacBio reads that are used with the Join contigs tool with the "Use long reads" option selected.
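The dependence on coverage can be illustrated with a small calculation. The following is a minimal sketch, not the method used by the Correct PacBio Reads tool: it assumes that every read covering a position is wrong there independently with the same probability, and it ignores insertions, deletions and alignment effects. Under that assumption, a per-position vote can only pick a wrong base if at least half of the covering reads are wrong, so the binomial tail gives an upper bound on the consensus error. The function name `consensus_error_bound` and its parameters are hypothetical.

```python
from math import comb


def consensus_error_bound(error_rate: float, coverage: int) -> float:
    """Upper bound on the chance that a per-position vote picks a wrong base.

    Simplifying assumption (not taken from the tool): each of the `coverage`
    reads is wrong at this position independently with probability
    `error_rate`. The vote can only fail if at least half of the reads are
    wrong, so we sum that part of the binomial distribution.
    """
    if coverage < 1:
        raise ValueError("coverage must be at least 1")
    k_min = (coverage + 1) // 2  # smallest read count that is at least half
    return sum(
        comb(coverage, k) * error_rate**k * (1 - error_rate) ** (coverage - k)
        for k in range(k_min, coverage + 1)
    )


if __name__ == "__main__":
    # Compare the bound at a 15% per-read error rate for several coverages.
    for cov in (1, 5, 11, 21):
        print(cov, consensus_error_bound(0.15, cov))
```

With a single read the bound equals the raw error rate, and it shrinks as more reads cover the position, which is why correction is only feasible when coverage is sufficiently high.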
The Correct PacBio Reads tool takes raw PacBio reads as input and produces error-corrected reads as output. The overall strategy for correcting PacBio reads consists of the following four steps:
- Partition the reads into (long) seed reads and (shorter) correction reads.
- Map all correction reads to all seed reads.
- Detect and handle hairpin sequences (untrimmed adapters) and chimeras in the seed reads.
- For each seed read, compute a consensus sequence and output this sequence as a corrected read.
The longest reads are selected as seed reads, because they give the assembler the most information for resolving large repeats.
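The partitioning step can be sketched as follows. This is a hedged illustration only: the text above states that the longest reads become seed reads, but it does not say how the tool decides where to stop. The sketch assumes a cut-off based on a target seed coverage of an estimated genome size; the function `partition_reads` and the parameters `genome_size` and `seed_coverage` are hypothetical and do not correspond to documented tool options.

```python
from typing import Dict, List, Tuple


def partition_reads(
    read_lengths: Dict[str, int],
    genome_size: int,
    seed_coverage: float,
) -> Tuple[List[str], List[str]]:
    """Split reads into (seed reads, correction reads) by length.

    Reads are taken longest first as seed reads until their combined length
    reaches genome_size * seed_coverage (an assumed stopping rule); all
    remaining, shorter reads become correction reads.
    """
    target_bases = genome_size * seed_coverage
    # Longest first; ties are broken by read ID so the result is deterministic.
    ordered = sorted(read_lengths, key=lambda rid: (-read_lengths[rid], rid))

    seeds: List[str] = []
    correction: List[str] = []
    seed_bases = 0
    for read_id in ordered:
        if seed_bases < target_bases:
            seeds.append(read_id)
            seed_bases += read_lengths[read_id]
        else:
            correction.append(read_id)
    return seeds, correction


if __name__ == "__main__":
    lengths = {"read_a": 1000, "read_b": 5000, "read_c": 3000, "read_d": 500}
    seeds, correction = partition_reads(lengths, genome_size=1000, seed_coverage=6)
    print("seed reads:", seeds)
    print("correction reads:", correction)
```

In the subsequent steps, the correction reads would be mapped to the seed reads, and a consensus would be computed for each seed read.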