Quality scores in the Illumina platform

When using the Illumina importer, you can select the quality score scheme applicable for your data. There are four options:

Further information about the FASTQ format, including quality score encoding, is available at http://en.wikipedia.org/wiki/FASTQ_format.

Small samples of three kinds of files are shown below. The names of the reads have no influence on the quality score format:

NCBI/Sanger Phred scores:

	@SRR001926.1 FC00002:7:1:111:750 length=36
	TTTTTGTAAGGAGGGGGGTCATCAAAATTTGCAAAA
	+SRR001926.1 FC00002:7:1:111:750 length=36
	IIIIIIIIIIIIIIIIIIIIIIIIIFIIII'IB<IH
	@SRR001926.7 FC00002:7:1:110:453 length=36
	TTATATGGAGGCTTTAAGAGTCATAGGTTGTTCCCC
	+SRR001926.7 FC00002:7:1:110:453 length=36
	IIIIIIIIIII:'III?=IIIIII+&III/3I8F/&
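As a minimal sketch of how such a quality line maps to numbers: the NCBI/Sanger scheme is commonly described as storing each Phred score as the ASCII character with code `score + 33`. The offset of 33 and the helper name `decode_phred33` are assumptions for illustration and are not taken from the importer itself.

```python
def decode_phred33(quality_line):
    """Turn a Sanger-style quality string into a list of integer Phred scores.

    Assumes the conventional ASCII offset of 33 (not stated in this section).
    """
    # Subtract the offset from each character's ASCII code.
    return [ord(ch) - 33 for ch in quality_line]


# Quality line of the first read in the sample above.
scores = decode_phred33("IIIIIIIIIIIIIIIIIIIIIIIIIFIIII'IB<IH")
print(scores)
```

Under this assumption, `I` (ASCII 73) decodes to 40 and `'` (ASCII 39) decodes to 6.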

Illumina Pipeline 1.2 and earlier (note the question mark at the end of the fourth line of the sample; it is one of the values that are unique to the old Illumina pipeline format):

	@SLXA-EAS1_89:1:1:672:654/1
	GCTACGGAATAAAACCAGGAACAACAGACCCAGCA
	+SLXA-EAS1_89:1:1:672:654/1
	cccccccccccccccccccc]c``cVcZccbSYb?
	@SLXA-EAS1_89:1:1:657:649/1
	GCAGAAAATGGGAGTGAAAATCTCCGATGAGCAGC
	+SLXA-EAS1_89:1:1:657:649/1
	ccccccccccbccbccb``cccbcccZcc`^bR^`

The special Solexa-scale quality scores are converted to Phred-scale using the following definitions of the two scales, where $p$ is the probability that the base call is wrong:

$ Q_{phred} = -10 \log_{10} p$

$ Q_{solexa} = -10 \log_{10} \frac{p}{1-p}$
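The two definitions above can be combined into a direct conversion: solving the Solexa formula for $p$ gives $p = 1 / (10^{Q_{solexa}/10} + 1)$, which is then inserted into the Phred formula. The following is a sketch of that arithmetic only, not the importer's actual code; the function name `solexa_to_phred` and the ASCII offset of 64 used in the usage lines are assumptions.

```python
import math


def solexa_to_phred(q_solexa):
    """Convert one Solexa-scale score to the Phred scale."""
    # Solve Q_solexa = -10 * log10(p / (1 - p)) for the error probability p.
    p = 1.0 / (10.0 ** (q_solexa / 10.0) + 1.0)
    # Apply Q_phred = -10 * log10(p).
    return -10.0 * math.log10(p)


# Assuming an ASCII offset of 64, the '?' character (ASCII 63) is Solexa score -1.
q = ord("?") - 64
print(q, round(solexa_to_phred(q), 2))
```

For high scores the two scales nearly coincide; they differ noticeably only at the low end, where Solexa scores can be negative while Phred scores cannot.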

A sample of the quality scores of the Illumina Pipeline 1.3 and 1.4:

	@HWI-E4_9_30WAF:1:1:8:178
	GCCAGCGGCGCAAAATGNCGGCGGCGATGACCTTC
	+HWI-E4_9_30WAF:1:1:8:178
	babaaaa\ababaaaaREXabaaaaaaaaaaaaaa
	@HWI-E4_9_30WAF:1:1:8:1689
	GATGGAGATCTCGACCTNATAGGTGCCCTCATCGG
	+HWI-E4_9_30WAF:1:1:8:1689
	aab`]_aaaaaaaaaa[ER`abaaa\aaaaaaaa[

Note that the data itself does not reveal that this sample is not in the Illumina Pipeline 1.2 and earlier format, since both formats use the same range of ASCII values.
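The range argument can be sketched as a small heuristic that looks only at the lowest ASCII value in the quality lines. The thresholds 59 and 64 follow the commonly documented ranges (Solexa scores starting at -5 with offset 64, Illumina 1.3 scores starting at 0 with offset 64) and are assumptions here, as is the function name `guess_quality_scheme`. The sketch does not describe how the importer behaves.

```python
def guess_quality_scheme(quality_lines):
    """Guess the quality scheme from the lowest ASCII value seen.

    Raises ValueError if no quality characters are supplied.
    """
    lowest = min(ord(ch) for line in quality_lines for ch in line)
    if lowest < 59:
        # Too low for any offset-64 scheme.
        return "NCBI/Sanger"
    if lowest < 64:
        # Values 59..63 correspond to negative Solexa scores.
        return "Illumina Pipeline 1.2 and earlier"
    # Nothing below 64 was seen, so the range alone cannot settle the scheme.
    return "ambiguous"


print(guess_quality_scheme(["IIIIIIIIIII:'III?=IIIIII+&III/3I8F/&"]))
print(guess_quality_scheme(["cccccccccccccccccccc]c``cVcZccbSYb?"]))
print(guess_quality_scheme([r"babaaaa\ababaaaaREXabaaaaaaaaaaaaaa"]))
```

The third call returns `ambiguous`, which is the situation described above: without characters below ASCII 64, the scheme has to be chosen by the user.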

To learn more about ASCII values, please see http://en.wikipedia.org/wiki/Ascii#ASCII_printable_characters.