Specifying information for the assembly

Options in this section can alter the results of the assembly. The How it works section of the manual explains in more detail how these settings may affect an assembly.

-w <n> / --wordsize <n>

Set the word size for the de Bruijn graph to n. The default is chosen based on the size of the input, as described in the How it works section of the manual.

-b <n> / --bubblesize <n>

Set the maximum bubble size for the de Bruijn graph. The default is 50 bases.

-e <file> / --estimatedistances <file>

This option estimates the distances for paired reads from the distances observed within unscaffolded contigs. These estimates are then used in the scaffolding step. If multiple sets of paired data have been input, the distances are estimated separately for each data set. The estimated distances are saved to the file given as the argument to this option.

When this flag is used, the program will aim to identify tight distance intervals from areas containing a substantial number of the mapped reads for each dataset.
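
As a rough illustration of what a tight distance interval means here, the sketch below finds the narrowest range that covers a chosen number of observed pair distances. This is an assumed approach for illustration only; the manual does not describe the program's actual estimation method, and the function name tightest_interval, the min_pairs parameter and the sample values are all hypothetical.

```python
def tightest_interval(distances, min_pairs):
    """Return the narrowest (low, high) range covering at least `min_pairs`
    of the observed distances, or None if there are too few observations.

    Illustrative sketch only; not the assembler's actual algorithm.
    """
    if min_pairs < 1 or len(distances) < min_pairs:
        return None
    ordered = sorted(distances)
    best = None
    # Slide a window of `min_pairs` consecutive sorted values and keep
    # the window with the smallest spread.
    for start in range(len(ordered) - min_pairs + 1):
        low = ordered[start]
        high = ordered[start + min_pairs - 1]
        if best is None or high - low < best[1] - best[0]:
            best = (low, high)
    return best

# Hypothetical distances observed for one paired data set; most pairs
# cluster near 200 bases, with a few outliers.
observed = [180, 195, 200, 205, 210, 215, 220, 900, 1500, 40]

interval = tightest_interval(observed, min_pairs=7)
print(interval)
```

The outliers at 40, 900 and 1500 fall outside the chosen range, so they do not widen the interval that would be passed on to scaffolding.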

There are situations where it is not possible to estimate accurate paired distances from the data, such as:

If an accurate distance cannot be estimated from the data for a particular paired read set, the original paired distance supplied with the -p flag is used for that set. Errors and warnings associated with such situations are written to the file specified with the -e parameter.
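
The fallback behaviour described above can be sketched as follows. This is a minimal illustration under stated assumptions: distances are represented as (min, max) pairs, a missing estimate is represented as None, and the function name resolve_distances and the data set names are hypothetical. It does not reproduce the program's file format or messages.

```python
def resolve_distances(original, estimated):
    """Choose the distance range to use for each paired data set.

    original:  {dataset: (min, max)} as supplied by the user
    estimated: {dataset: (min, max) or None}; None means that no
               reliable estimate could be made for that data set
    Returns the chosen ranges and a list of warning messages.
    Illustrative sketch only; not the assembler's actual code.
    """
    chosen, warnings = {}, []
    for name, user_range in original.items():
        estimate = estimated.get(name)
        if estimate is None:
            # Fall back to the user-supplied range and record a warning.
            warnings.append(f"{name}: no reliable estimate, keeping {user_range}")
            chosen[name] = user_range
        else:
            chosen[name] = estimate
    return chosen, warnings

# Hypothetical data sets: one with a usable estimate, one without.
original = {"lib_short": (150, 300), "lib_long": (2000, 4000)}
estimated = {"lib_short": (180, 220), "lib_long": None}

chosen, warnings = resolve_distances(original, estimated)
print(chosen)
print(warnings)
```

Each data set is resolved independently, so a failed estimate for one set does not affect the estimates used for the others.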