Identify MLST Scheme from Genomes

This section describes how to identify the relevant MLST scheme for a genome sequence or a list of genome sequences.

This tool can be used before running the Type With MLST Scheme tool when you are working with a sample containing one or more unknown species, as in the Type among Multiple Species workflow.

To run the Identify MLST Scheme from Genomes tool choose:

        Toolbox | Microbial Genomics Module | Typing and Epidemiology | MLST Typing | Identify MLST Scheme from Genomes

The input to the tool is a sequence or a sequence list (figure 10.19).

Image mlst_identify_step1
Figure 10.19: Select relevant genome sequence or sequence list.

The next step is to select as many MLST schemes as necessary to identify the species present in the input sample (figure 10.20).

Image mlst_identify_step2
Figure 10.20: Select relevant MLST scheme(s) to search among.

To identify the best matching scheme, the tool first determines the 10 most prevalent loci, i.e. loci that occur in most or all of the sequence types. If fewer than 10 loci are available, the tool bases the identification on those. The tool therefore also works for classic 7-gene MLST schemes, provided that they are in the MLST Scheme format.
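The idea of selecting the most prevalent loci can be illustrated with a minimal Python sketch. This is not the tool's implementation: the function name `most_prevalent_loci`, the dictionary layout of the scheme, and the definition of prevalence as "number of sequence types with an allele assigned for the locus" are all assumptions made for illustration.

```python
from collections import Counter


def most_prevalent_loci(sequence_types, max_loci=10):
    """Return up to max_loci locus names, most prevalent first.

    sequence_types is a hypothetical layout: it maps a sequence type name
    to a dict of locus name -> allele id, with None where the locus is
    missing from that sequence type.
    """
    counts = Counter()
    for profile in sequence_types.values():
        for locus, allele in profile.items():
            if allele is not None:
                counts[locus] += 1
    # Sort by descending count, then by name so the result is deterministic.
    ranked = sorted(counts, key=lambda locus: (-counts[locus], locus))
    return ranked[:max_loci]


# Toy scheme with three loci (made-up data, for illustration only).
toy_scheme = {
    "ST1": {"adk": 1, "fumC": 2, "gyrB": 1},
    "ST2": {"adk": 1, "fumC": None, "gyrB": 3},
    "ST3": {"adk": 2, "fumC": 4, "gyrB": None},
}

print(most_prevalent_loci(toy_scheme, max_loci=2))
print(most_prevalent_loci(toy_scheme))
```

In this sketch, a scheme with fewer loci than `max_loci` simply returns all of its loci, which mirrors how a 7-gene scheme would be handled.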

The k-mers for all alleles for these most prevalent loci are then determined, and the provided references are checked for their presence.

The output of this tool is the MLST scheme that best matches the sequences analyzed. To add the obtained best match to a Result Metadata Table, see Extend Result Metadata Table.

The tool does not produce an output if no scheme can be uniquely identified.
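The k-mer check and the selection of a unique best match described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the tool's actual algorithm: the k-mer length, the scoring by fraction of allele k-mers found, the `min_score` threshold, and the function names are all hypothetical, and only the forward strand is considered.

```python
def kmers(sequence, k):
    """Return the set of all k-mers in a sequence (forward strand only)."""
    seq = sequence.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def scheme_score(allele_sequences, genome_kmers, k):
    """Fraction of a scheme's allele k-mers that occur in the genome."""
    allele_kmers = set()
    for allele in allele_sequences:
        allele_kmers |= kmers(allele, k)
    if not allele_kmers:
        return 0.0
    return len(allele_kmers & genome_kmers) / len(allele_kmers)


def best_scheme(schemes, genome_sequences, k=5, min_score=0.5):
    """Return the name of the uniquely best scoring scheme, or None.

    schemes maps a scheme name to the allele sequences of its most
    prevalent loci. k and min_score are assumed values for this sketch.
    """
    genome_kmers = set()
    for genome in genome_sequences:
        genome_kmers |= kmers(genome, k)
    scores = {
        name: scheme_score(alleles, genome_kmers, k)
        for name, alleles in schemes.items()
    }
    if not scores:
        return None
    top = max(scores.values())
    winners = [name for name, score in scores.items() if score == top]
    # No result if the best score is too low or shared by several schemes.
    if top < min_score or len(winners) != 1:
        return None
    return winners[0]


# Made-up toy data, for illustration only.
genome = "ACGTACGTTTGACCA"
schemes = {
    "scheme_A": ["ACGTACG", "TTGACCA"],
    "scheme_B": ["GGGGCCCC"],
}
print(best_scheme(schemes, [genome]))
```

Returning `None` on a tie or a low score is one possible reading of "uniquely identified"; the criteria the tool itself applies are not stated in this section.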