Error-correction report

In the last dialog of the wizard, you can choose to create a report of the results (see figure 14.5).

Image error_correction_report_lowq
Figure 14.5: The error-correction report is useful for evaluating the quality of the input data and the performance of the error-correction.

The report contains the following information for the input reads and the corrected reads:

Nucleotide distribution:
Fraction of the reads covered by each nucleotide, A, C, G and T.
Count:
The total number of reads.
Minimum, maximum, average, N50 and N90:
Read length statistics.
Total:
The total number of bases.
Read length distribution:
A graph showing the number of contigs of different lengths.

In addition to this, some statistics about the error correction are given:

Seed read length threshold
The length of the shortest seed read used as seed read - picked according to the Coverage percentage of reads to correct (see above).
Average correction coverage
The average coverage by correction reads on seed reads.
Hairpin splits
The number of splits performed due to putative untrimmed hairpin adapter sequences.
Chimeric splits
The number of splits performed due to putative chimeras.
Mismatches corrected
The number of mismatches that have been corrected in the output reads.
Insertions corrected
The number of insertions that have been corrected in the output reads.
Deletions corrected
The number of deletions that have been corrected in the output reads.
Errors per 100kb trimmed input read
The total number of errors (mismatches, insertions and deletions) that have been corrected per 100kb in the output reads.

Finally, the number and total size of the following elements are given: