How to run the Analyze Contigs tool

To run the Analyze Contigs tool:

        Toolbox | Genome Finishing Module | Analyze Contigs

This opens the dialog shown in figure 3.1.

Image contig_analysis_commercial_step1
Figure 3.1: Select the contigs to be analyzed.

Select the contigs and click Next. This opens the "Set parameters for contig analysis 1" step shown in figure 3.2.

Image contig_analysis_commercial_step2
Figure 3.2: Set parameters for contig analysis 1.

The parameters to be specified in this step are:

General parameters
  • Minimum length. Specifies the minimum length of annotations. Does not apply to "sudden changes in coverage" and "unaligned ends".
  • Minimum distance to contig ends. Specifies the minimum distance required between an annotation and the contig ends.
  • Ignore scaffold regions. When the box is ticked, regions between scaffolded contigs are ignored.
Coverage
  • Detect sudden changes in coverage. A sudden change in coverage in adjacent regions can imply a misassembly.
  • Detect low coverage. Regions with low coverage can indicate a misassembly. Ticking the box allows specification of a threshold value for the minimum number of required overlapping reads.
  • Detect high coverage. Regions with high coverage can indicate a misassembly. Ticking the box allows specification of a threshold value for the maximum number of accepted overlapping reads.
Unaligned read ends
  • Detect unaligned read ends. Unaligned ends of reads can imply a misassembly. Ticking the box allows specification of a threshold value for unaligned ends, which is the maximum percentage of unaligned read ends allowed at a position compared to neighboring positions.
  • Minimum coverage requirement. Specifies the minimum amount of coverage required before checking for unaligned ends.
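The interaction between the coverage thresholds and the general parameters can be illustrated with a small Python sketch. This is not the module's implementation: the function name, the use of half-open intervals, and the reading of "Minimum distance to contig ends" as a requirement on both ends of a region are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple


def flag_coverage_regions(
    coverage: List[int],
    low: int,
    high: int,
    min_length: int,
    min_end_distance: int,
) -> List[Tuple[int, int, str]]:
    """Return (start, end, label) half-open regions of low or high coverage.

    Hypothetical helper: a position is "low" when its read depth is below
    `low` and "high" when it is above `high` (assumed threshold semantics).
    """
    n = len(coverage)
    regions: List[Tuple[int, int, str]] = []
    start = 0
    label: Optional[str] = None

    # Walk one position past the end so that a trailing run is closed.
    for pos in range(n + 1):
        if pos < n:
            depth = coverage[pos]
            current = "low" if depth < low else "high" if depth > high else None
        else:
            current = None
        if current != label:
            if label is not None:
                regions.append((start, pos, label))
            start, label = pos, current

    # Apply the general parameters: minimum length and distance to contig ends.
    kept = []
    for s, e, lab in regions:
        if e - s < min_length:
            continue
        if s < min_end_distance or n - e < min_end_distance:
            continue
        kept.append((s, e, lab))
    return kept


# Toy per-position read depth for a 41 bp contig.
depth = [30] * 10 + [2] * 5 + [30] * 10 + [200] * 6 + [30] * 10
print(flag_coverage_regions(depth, low=5, high=100, min_length=5, min_end_distance=5))
```

In this toy example the short dip and the short spike are both reported because each is at least 5 positions long and lies at least 5 positions from either contig end; raising `min_length` or `min_end_distance` would suppress them.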

After adjusting the parameters, click Next (figure 3.3).

Image contig_analysis_commercial_step3
Figure 3.3: Set the parameters for contig analysis 2.

The parameters to be specified in this step are:

Single stranded coverage
When Detect single stranded regions is checked, regions with single stranded coverage are detected using the specified parameters:
  • Max single stranded percentage specifies the maximum allowed percentage difference between the coverage of the two strands. At 0%, only the same number of reads in both directions is allowed; at 100%, all reads may be in one direction. Hence, with a max single stranded percentage of 80%, single stranded regions will be detected when the difference in the number of reads in each direction exceeds 80%.
  • Minimum coverage requirement. Specifies the minimum amount of coverage required before checking for single stranded coverage.
Nonspecific coverage
When Detect nonspecific regions is checked, regions with nonspecific coverage (reads with ambiguous mapping) are detected according to the following parameters:
  • Max nonspecific coverage percentage is the maximum allowed percentage of nonspecific coverage. Only regions above this percentage are detected.
  • Minimum coverage percentage is the minimum amount of coverage required before checking for nonspecific coverage.
Broken pairs
When Detect broken pairs is checked, regions with broken pairs are detected according to the following parameters:
  • Max broken pairs percentage is the maximum allowed percentage of broken pairs.
  • Minimum coverage requirement. Only regions above this value are detected.
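To make the single stranded percentage concrete, the following Python sketch computes it from forward and reverse read counts at a position. The formula (absolute strand difference divided by total coverage) is an assumption that matches the 0% and 100% extremes described above; the function names and default values are hypothetical and are not taken from the module.

```python
def single_stranded_percentage(forward: int, reverse: int) -> float:
    """Percentage difference between strand coverages (assumed formula).

    0.0 means equal read counts on both strands; 100.0 means all reads
    are in one direction.
    """
    total = forward + reverse
    if total == 0:
        return 0.0
    return abs(forward - reverse) * 100.0 / total


def is_single_stranded(
    forward: int,
    reverse: int,
    max_percentage: float = 80.0,  # illustrative default, not the tool's
    min_coverage: int = 10,        # illustrative default, not the tool's
) -> bool:
    """Flag a position only if coverage is sufficient and the threshold is exceeded."""
    if forward + reverse < min_coverage:
        return False
    return single_stranded_percentage(forward, reverse) > max_percentage


for fwd, rev in [(10, 10), (18, 2), (19, 1), (3, 0)]:
    print(fwd, rev, single_stranded_percentage(fwd, rev), is_single_stranded(fwd, rev))
```

Under these assumptions, 19 forward and 1 reverse read give a 90% difference and are flagged, 18 and 2 give exactly 80% and are not, and 3 forward reads alone are skipped because the coverage is below the minimum coverage requirement.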

The final step, shown in figure 3.4, is to specify the Output options and the Result handling.

Image contig_analysis_commercial_output_step
Figure 3.4: Set output parameters for contig analysis.

Click Finish.