The Annotate with DIAMOND tool allows you to annotate a DNA sequence using a set of known protein reference sequences. This tool can be used on sequences without any pre-existing annotations: it is not necessary to annotate the DNA sequences with genes or coding regions. For more information about the DIAMOND aligner, see Annotate CDS with Best DIAMOND Hit.

The tool can be used for various purposes, e.g. transferring annotations from a known reference, annotating the presence of AMR or virulence markers in a genome, or filtering contigs or sequences based on the presence of a set of genes.

For annotating DNA sequences from a set of non-coding reference sequences, the Annotate with BLAST tool may be used instead. However, the Annotate with DIAMOND tool is in general the fastest option when working with coding regions.

If the input sequences already have CDS annotations, the Annotate CDS with Best BLAST Hit and Annotate CDS with Best DIAMOND Hit tools can also be used. See Annotate CDS with Best BLAST Hit for more information.

To start the analysis, go to:

        Functional Analysis | Annotate with DIAMOND

The first wizard step (figure 14.5) specifies the reference and search parameters.

Figure 14.5: Selecting references and specifying search parameters

The following sources can be used to annotate the input sequences:

As can be seen above, metadata (such as GO terms and taxonomy information) is handled differently depending on the database source:

The search parameters can be modified using the following settings:

The next step (figure 14.6) determines how multiple overlapping hits on the input query sequence are handled.

Figure 14.6: Settings for handling overlapping hits
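
To make the idea of resolving overlapping hits more concrete, the following Python sketch shows one possible strategy: hits are ranked by a score, and a hit is kept only if it does not overlap a better hit that has already been kept. This is an illustration only and not the tool's implementation. The Hit type, its fields, the score-based ranking, and the example values are all assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """A hypothetical hit of a reference protein on the query sequence."""
    reference: str  # name of the reference protein (illustrative)
    start: int      # start position on the query, inclusive
    end: int        # end position on the query, exclusive
    score: float    # assumed ranking score; higher is better


def overlaps(a: Hit, b: Hit) -> bool:
    """Return True if the two hits share at least one query position."""
    return a.start < b.end and b.start < a.end


def keep_best_non_overlapping(hits):
    """Greedily keep the highest-scoring hits that do not overlap each other."""
    kept = []
    for hit in sorted(hits, key=lambda h: h.score, reverse=True):
        if not any(overlaps(hit, other) for other in kept):
            kept.append(hit)
    # Report the surviving hits in query order.
    return sorted(kept, key=lambda h: h.start)


# Made-up example data: protB overlaps the better-scoring protA.
hits = [
    Hit("protA", 0, 300, 250.0),
    Hit("protB", 100, 400, 180.0),
    Hit("protC", 450, 900, 90.0),
]
best = keep_best_non_overlapping(hits)
print([h.reference for h in best])  # prints ['protA', 'protC']
```

In this sketch, protB is discarded because it overlaps the higher-scoring protA, while protC is kept because it covers a separate region of the query.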

The following options are available:

Best hits are determined by:

The output options step (figure 14.7) has the following options:

Figure 14.7: Specifying output options

The following sequence output options are available:

The final step controls which outputs are created. Note that reports can be aggregated using the Combine Reports tool.