De Novo Assemble Metagenome
Before assembly, adapters should be removed from the reads. This can be done using Trim Reads (http://resources.qiagenbioinformatics.com/manuals/clcgenomicsworkbench/current/index.php?manual=Trim_Reads.html). If adapters are left in place, the assembler may try to join regions that are not biologically relevant, which makes the assembly slow and can yield misleading results.
Quality trimming before assembly is generally not necessary, as the assembler itself should weed out or correct low-quality regions. However, trimming low-quality regions may reduce the amount of memory needed for the de novo assembly, which can be an advantage when working with large datasets.
To run the De Novo Assemble Metagenome tool:
Toolbox | Microbial Genomics Module | Metagenomics | De Novo Assemble Metagenome
Select the sequence lists or single sequences to assemble.
Set assembly parameters (figure 4.1).
- Minimum contig length: Contigs shorter than this length will not be reported. For very complex datasets containing reads from many closely related species, the assembler will often produce shorter contigs. In such cases, it is recommended to set a lower threshold so that contigs cover a larger proportion of the metagenome. Conversely, for metagenomes of low complexity, it is often wise to set a higher threshold in order to avoid duplication.
- Execution mode
  - Fast: The assembler is iterated once with a predefined word size ().
  - Longer contigs: The assembler is iterated three times with increasing word size ( ). Each iteration takes the contigs from the previous iteration as input, together with the input reads.
- Perform scaffolding: If selected, the assembler attempts to join contigs using paired-end information as the last step of the assembly process. Because scaffolding requires paired-end information, this option is disabled for single-end sequences.
Figure 4.1: Setting parameters for the assembly.
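The effect of the Minimum contig length threshold can be illustrated with a minimal Python sketch. This is not the tool's implementation; the function name `filter_contigs`, the contig names, and the example lengths are hypothetical and serve only to show how raising or lowering the threshold changes which contigs are reported.

```python
from typing import Dict


def filter_contigs(contigs: Dict[str, str], min_length: int) -> Dict[str, str]:
    """Return only the contigs that are at least min_length bases long.

    Illustrative only: contigs below the threshold are dropped,
    mirroring "Contigs below this length will not be reported".
    """
    return {name: seq for name, seq in contigs.items() if len(seq) >= min_length}


# Hypothetical contigs (names and sequences are made up for the example).
contigs = {
    "contig_1": "ACGT" * 300,  # 1200 bases
    "contig_2": "ACGT" * 50,   # 200 bases
    "contig_3": "ACGT" * 125,  # 500 bases
}

# A higher threshold, as might suit a low-complexity metagenome.
strict = filter_contigs(contigs, min_length=1000)

# A lower threshold, as might suit a very complex metagenome.
relaxed = filter_contigs(contigs, min_length=200)

print(sorted(strict))   # names of contigs kept with the higher threshold
print(sorted(relaxed))  # names of contigs kept with the lower threshold
```

With the higher threshold only the longest contig is kept, whereas the lower threshold retains all three and therefore covers more of the (hypothetical) metagenome.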
Select Create report to create a summary report containing statistics on input reads and output contigs.
The De Novo Assemble Metagenome output
The De Novo Assemble Metagenome tool will output a list of contigs. If Perform scaffolding was selected, scaffolds will appear at the bottom of the contig list as scaffold_1, scaffold_2, etc.
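If the contig list is exported for downstream processing, scaffolds can be told apart from contigs by the scaffold_ name prefix described above. The following Python sketch assumes the sequences have already been loaded into a dictionary of name to sequence; the helper names `n50` and `summarize`, the contig names, and the choice of statistics are assumptions made for illustration and do not reproduce the contents of the tool's report.

```python
from typing import Dict, List


def n50(lengths: List[int]) -> int:
    """Return the N50: the largest length L such that sequences of
    length >= L together cover at least half of the total length."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0  # empty input


def summarize(sequences: Dict[str, str]) -> Dict[str, Dict[str, int]]:
    """Group sequences into contigs and scaffolds by name prefix and
    compute simple summary statistics for each group."""
    groups: Dict[str, List[int]] = {"contigs": [], "scaffolds": []}
    for name, seq in sequences.items():
        key = "scaffolds" if name.startswith("scaffold_") else "contigs"
        groups[key].append(len(seq))
    return {
        key: {"count": len(lens), "total_length": sum(lens), "n50": n50(lens)}
        for key, lens in groups.items()
    }


# Hypothetical output list: contig names are made up, scaffold names
# follow the scaffold_1, scaffold_2, ... pattern.
sequences = {
    "contig_1": "A" * 100,
    "contig_2": "C" * 200,
    "contig_3": "G" * 300,
    "scaffold_1": "T" * 500,
}

summary = summarize(sequences)
print(summary["contigs"])
print(summary["scaffolds"])
```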