SOLiD from Life Technologies

Choosing the SOLiD import will open the dialog shown in figure 6.7.

Figure 6.7: Importing data from SOLiD from Applied Biosystems.

The accepted file format is csfasta, which is the color space version of the fasta format. If you want to import quality scores, a qual file should also be provided. The reads in a csfasta file look like this:

All reads start with a T which specifies the right phasing of the color sequence.

If a read contains a . (dot), as in the last read in the example above, it means that the color call was ambiguous (this would have been an N in base space). In this case, the Workbench cuts off the rest of the read, since there is no way to know the right phase of the remaining colors in the read. If the read starts with a dot, it is not imported. If all reads start with a dot, a warning dialog is displayed. In the quality file, the equivalent value is -1, and this also causes the read to be clipped.

When the example above is imported into the Workbench, it looks as shown in figure 6.8.

Figure 6.8: Importing data from SOLiD from Applied Biosystems. Note that the fourth read is cut off so that the colors following the dot are not included.
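The clipping rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in this section, not the Workbench's actual implementation. The function name `clip_color_read` is hypothetical, and the sketch assumes that "starts with a dot" means the first color after the leading T is a dot.

```python
from typing import Optional


def clip_color_read(read: str) -> Optional[str]:
    """Clip a color-space read at its first ambiguous color call.

    Returns the read up to (not including) the first '.', or None
    if the very first color is a dot, in which case nothing usable
    remains and the read would not be imported.
    """
    colors = read[1:]          # everything after the leading base (e.g. T)
    dot = colors.find(".")
    if dot == 0:
        return None            # read starts with a dot: skip it
    if dot == -1:
        return read            # no ambiguous call: keep the whole read
    return read[: 1 + dot]     # keep the leading base plus colors before the dot


print(clip_color_read("T0120.12"))  # T0120
print(clip_color_read("T.0120"))    # None
```

The same cut point could be found in a qual file by looking for the first value of -1.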

For more information about color space, please see Color space.
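To illustrate why the leading base and the phase of the colors matter, here is a minimal Python sketch of color space decoding. It assumes the standard SOLiD two-base color code, in which each color describes the transition between two adjacent bases (0 keeps the base, 1 swaps A/C and G/T, 2 swaps A/G and C/T, 3 swaps A/T and C/G). The function name `decode_colors` is hypothetical, and the sketch assumes the read contains no dots.

```python
# With A=0, C=1, G=2, T=3, the standard color code is the XOR
# of the codes of two adjacent bases.
BASES = "ACGT"


def decode_colors(read: str) -> str:
    """Translate a color-space read such as 'T0120' into base space.

    The first character is the known leading base. It anchors the
    decoding but is not included in the returned sequence.
    """
    previous = BASES.index(read[0])
    decoded = []
    for color in read[1:]:
        previous ^= int(color)         # apply the transition to get the next base
        decoded.append(BASES[previous])
    return "".join(decoded)


print(decode_colors("T0120"))  # TGAA
print(decode_colors("A0120"))  # ACTT
```

The two calls use the same colors but a different leading base and yield different sequences. Each base depends on the one before it, so once a color is unknown, the colors that follow can no longer be decoded unambiguously.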

In addition to the native csfasta format used by SOLiD, you can also import data in fastq format. This is particularly useful for data downloaded from the Sequence Read Archive at NCBI. An example of a SOLiD fastq file is shown here with both quality scores and the color space encoding:

@SRR016056.1.1 AMELIA_20071210_2_YorubanCGB_Frag_16bit_2_51_130.1 length=50
+SRR016056.1.1 AMELIA_20071210_2_YorubanCGB_Frag_16bit_2_51_130.1 length=50
@SRR016056.2.1 AMELIA_20071210_2_YorubanCGB_Frag_16bit_2_51_223.1 length=50
+SRR016056.2.1 AMELIA_20071210_2_YorubanCGB_Frag_16bit_2_51_223.1 length=50

For all formats, compressed data in gzip format is also supported (.gz).
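If you want to inspect such files outside the Workbench, plain and gzip-compressed files can be read in the same way. The following is a small Python sketch using the standard `gzip` module. The helper name `open_reads` is hypothetical, and the choice to detect compression by the .gz extension is an assumption of this sketch.

```python
import gzip
from pathlib import Path
from typing import TextIO


def open_reads(path: str) -> TextIO:
    """Open a read file as text, decompressing on the fly if it ends in .gz."""
    if Path(path).suffix == ".gz":
        return gzip.open(path, "rt")   # "rt" = read as text rather than bytes
    return open(path, "r")
```

Both branches return a text file object, so the code that reads the lines does not need to know whether the input was compressed.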

The General options to the left are:

Click Next to adjust how to handle the results. We recommend choosing Save in order to save the results directly to a folder, since you will probably want to save them anyway before proceeding with your analysis. There is also an option to put the imported data into a separate folder. This can be handy for organizing subsequent analysis results and for batch processing.