Assemble sequences

This section describes how to assemble a number of sequence reads into a contig without the use of a reference sequence (a known sequence that can be used for comparison with the other sequences, see Assemble to reference sequence). To perform the assembly:

        Toolbox | Sequencing Data Analysis | Assemble Sequences

This will open a dialog where you can select sequences to assemble. If you already selected sequences in the Navigation Area, these will be shown in 'Selected Elements'. You can alter your choice of sequences to assemble, or add others, by using the arrows to move sequences between the Navigation Area and the 'Selected Elements' box. You can also add sequence lists.

Note! You can assemble a maximum of 2000 sequences at a time.

To assemble more sequences, use the De Novo Assembly tool under De Novo Sequencing in the Toolbox instead. This requires the CLC Genomics Workbench (see http://www.clcbio.com/genomics).
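
Before starting, it can be useful to check whether a data set fits within the 2000-sequence limit noted above. The following is a minimal sketch, not part of the Workbench: it assumes the reads are available as FASTA text, and the function names `count_fasta_records` and `fits_assemble_sequences` are hypothetical helpers introduced only for illustration.

```python
# Limit taken from the note above; everything else here is an illustrative assumption.
MAX_SEQUENCES = 2000


def count_fasta_records(fasta_text: str) -> int:
    """Count sequences in FASTA text by counting header lines (those starting with '>')."""
    return sum(1 for line in fasta_text.splitlines() if line.startswith(">"))


def fits_assemble_sequences(fasta_text: str, limit: int = MAX_SEQUENCES) -> bool:
    """Return True if the number of sequences does not exceed the limit."""
    return count_fasta_records(fasta_text) <= limit


if __name__ == "__main__":
    example = ">read1\nACGTACGT\n>read2\nTACGGATT\n"
    print(count_fasta_records(example))
    print(fits_assemble_sequences(example))
```

If the check fails, the data set is a candidate for the De Novo Assembly tool rather than Assemble Sequences.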

When the sequences are selected, click Next. This will show the dialog in figure 31.6.

Figure 31.6: Setting assembly parameters.

This dialog gives you the following options for assembly:

Click Next if you wish to adjust how to handle the results. If not, click Finish.

When the assembly process has ended, a number of views will be shown, each containing a contig of two or more sequences that have been matched. If the number of contigs seems too high or too low, try again with another Alignment stringency setting. Depending on your choice of output options above, the views will include trace files or only contig sequences. However, the contig is calculated in the same way no matter how it is displayed.
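
To give a feel for why a stricter or looser matching requirement changes the number of contigs, here is a toy sketch of joining two reads by their overlap. It is not the algorithm used by the Workbench, and the `min_overlap` parameter is a hypothetical stand-in that does not correspond to any actual setting in the dialog; real assemblers also tolerate mismatches and gaps, which this example does not.

```python
from typing import Optional


def longest_overlap(left: str, right: str, min_overlap: int = 3) -> int:
    """Length of the longest suffix of `left` that exactly equals a prefix of `right`.

    Returns 0 if no overlap of at least `min_overlap` bases exists.
    """
    shortest = min(len(left), len(right))
    # Try the longest candidate first so the first hit is the best one.
    for size in range(shortest, max(min_overlap, 1) - 1, -1):
        if left[-size:] == right[:size]:
            return size
    return 0


def merge_reads(left: str, right: str, min_overlap: int = 3) -> Optional[str]:
    """Merge two reads into one contig, or return None if they do not overlap enough."""
    size = longest_overlap(left, right, min_overlap)
    if size == 0:
        return None
    return left + right[size:]


if __name__ == "__main__":
    # The reads share the four bases "TACG".
    print(merge_reads("ACGTACG", "TACGGA", min_overlap=3))  # merged into one contig
    print(merge_reads("ACGTACG", "TACGGA", min_overlap=5))  # too strict: not merged
```

With the looser requirement the two reads form a single contig; with the stricter one they stay separate, which is the kind of effect to expect when the number of contigs changes between runs.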

See View and edit contigs for how to use the resulting contigs.