Assemble sequences

This section describes how to assemble a number of sequence reads into a contig without using a reference sequence (a known sequence against which the other sequences can be compared; see Assemble to reference sequence).

Note! You can assemble a maximum of 10000 sequences at a time.

To assemble more sequences, you need the CLC Genomics Workbench (see http://www.qiagenbioinformatics.com/products/clc-genomics-workbench/).

To perform the assembly: Toolbox | Sequencing Data Analysis | Assemble Sequences

This will open a dialog where you can select sequences to assemble. If you already selected sequences in the Navigation Area, these will be shown in 'Selected Elements'. You can alter your choice of sequences to assemble, or add others, by using the arrows to move sequences between the Navigation Area and the 'Selected Elements' box. You can also add sequence lists.

When the sequences are selected, click Next. This will show the dialog in figure 18.6.

Figure 18.6: Setting assembly parameters.

This dialog gives you the following options for assembly:

Click Next if you wish to adjust how to handle the results. If not, click Finish.

When the assembly process has ended, a number of views will be shown, each containing a contig of two or more sequences that have been matched. If the number of contigs seems too high or too low, try again with a different Alignment stringency setting. Depending on the output options you chose above, the views will include trace files or only contig sequences. However, the contig is calculated in the same way no matter how it is displayed.
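To give a feel for why a stricter matching requirement can change the number of contigs, here is a toy sketch in Python. It is not the algorithm used by the Workbench: it is a simple greedy merge of exactly overlapping reads, and the `min_overlap` parameter is a hypothetical stand-in for a stringency setting. The reads and function names are made up for illustration.

```python
def overlap(a, b, min_overlap):
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 if the longest such overlap is shorter than min_overlap.
    """
    for n in range(min(len(a), len(b)), min_overlap - 1, -1):
        if n > 0 and a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_overlap):
    """Repeatedly merge the pair of sequences with the longest exact overlap.

    Toy illustration only: real assemblers align reads with mismatches and
    gaps, use quality scores, and consider reverse complements.
    """
    contigs = list(reads)
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_overlap)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            # No remaining pair overlaps by at least min_overlap bases.
            return contigs
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


# Hypothetical reads; neighbouring reads overlap by exactly four bases.
reads = ["ATGCGTAC", "GTACCTGA", "CTGATTAG"]

relaxed = greedy_assemble(reads, min_overlap=3)  # all three reads merge
strict = greedy_assemble(reads, min_overlap=5)   # nothing merges
print(len(relaxed), len(strict))
```

In this sketch, requiring a longer overlap leaves the three reads unmerged, while a shorter requirement joins them into one sequence. The same general trade-off is why rerunning the assembly with a different setting can produce more or fewer contigs.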

See View and edit contigs for details on how to use the resulting contigs.