Large MLST Scheme Visualization and Management
A Large MLST Scheme contains information about:
- The loci that define the regions of interest.
- For each locus, a list of known alleles.
- A list of sequence types, where each sequence type is described by the alleles present at each locus (the profile of the sequence type).
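The three kinds of information listed above can be pictured as two nested mappings. The following is a minimal sketch only, not the workbench's internal format: the class name `LargeMlstScheme`, its fields, and the example locus, allele and sequence type names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class LargeMlstScheme:
    """Hypothetical in-memory picture of a scheme (not the real file format)."""

    # locus name -> {allele name -> nucleotide sequence}
    alleles: dict[str, dict[str, str]] = field(default_factory=dict)
    # sequence type name -> profile, i.e. {locus name -> allele name}
    profiles: dict[str, dict[str, str]] = field(default_factory=dict)

    @property
    def loci(self) -> list[str]:
        # The loci are the keys of the allele mapping.
        return list(self.alleles)


# Made-up example data: two loci, three known alleles, two sequence types.
scheme = LargeMlstScheme(
    alleles={
        "locusA": {"1": "ATGC", "2": "ATGA"},
        "locusB": {"1": "GGCC"},
    },
    profiles={
        "ST1": {"locusA": "1", "locusB": "1"},
        "ST2": {"locusA": "2"},  # no allele defined at locusB
    },
)
```

In this sketch, a locus that is missing from a profile simply has no entry, which is how a sequence type without an allele at that locus is represented.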
The Large MLST Scheme has several views. Switching between the views of the scheme is done by clicking the buttons at the lower-left corner of the view.
Figure 14.1: The Large MLST Scheme Heat Map view.
The heat map view shows an overview of the scheme (figure 14.1), with the sequence types on the vertical axis and the loci on the horizontal axis. Each cell is colored according to the frequency of the sequence type's allele at the given locus. For example, a value of 0.9 means that 90% of the sequence types have this particular allele. Missing alleles have a value of zero, and alleles not present in any sequence type are not represented in the heat map view. The heat map can optionally be clustered based on the allele frequency. The clustering settings can be specified when the scheme is created, but the clustering can also be updated later using the Recluster large MLST scheme button.
By right-clicking on the heat map, it is possible either to select sequence types or loci in other views, or to copy sequence type or locus names to the clipboard.
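As an illustration of how the cell values can be read, the sketch below computes an allele frequency per cell from a set of profiles. It assumes that the frequency is taken over all sequence types in the scheme. The function name `heat_map_values` and the example data are hypothetical, and the sketch is not the workbench's own implementation.

```python
from collections import Counter


def heat_map_values(profiles, loci):
    """Return {sequence type: [cell value per locus]} (illustrative only).

    profiles: {sequence type name -> {locus name -> allele name}}
    Assumption: frequency = share of ALL sequence types carrying the allele.
    """
    n = len(profiles)
    # Count how often each allele occurs at each locus.
    counts = {
        locus: Counter(p[locus] for p in profiles.values() if locus in p)
        for locus in loci
    }
    return {
        st: [
            counts[locus][profile[locus]] / n if locus in profile else 0.0
            for locus in loci
        ]
        for st, profile in profiles.items()
    }


# Made-up example: four sequence types, two loci.
profiles = {
    "ST1": {"locusA": "1", "locusB": "1"},
    "ST2": {"locusA": "1", "locusB": "2"},
    "ST3": {"locusA": "2"},  # missing allele at locusB
    "ST4": {"locusA": "1", "locusB": "1"},
}
values = heat_map_values(profiles, ["locusA", "locusB"])
```

Here allele 1 of locusA occurs in three of the four sequence types, so those cells get the value 0.75, while the missing allele of ST3 at locusB gets the value zero.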
Figure 14.2: The Allele table view.
The allele table view (figure 14.2) has an upper table that lists the loci in the scheme. The table contains the following columns:
- Locus: the name of the locus
- Locus category: shows any virulence or resistance-gene related annotations.
- Number of alleles: the total number of alleles for this locus. Not all alleles may be part of a sequence type.
- Percentage of sequence types: the percentage of sequence types that have an allele at the given locus. For a strict core genome scheme, all of the sequence types contain all loci, so this value is 100% for every locus.
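The two numeric columns of the upper table can be derived from the alleles and profiles of a scheme. The sketch below shows one way to do so under the assumption that a locus counts as present in a sequence type when the profile has an allele for it. The function name `locus_table` and the example data are hypothetical.

```python
def locus_table(alleles, profiles):
    """Build one row per locus (illustrative sketch, not the workbench code).

    alleles:  {locus name -> collection of known allele names}
    profiles: {sequence type name -> {locus name -> allele name}}
    """
    rows = []
    for locus, known in alleles.items():
        # Number of sequence types with an allele at this locus.
        with_allele = sum(1 for p in profiles.values() if locus in p)
        pct = 100.0 * with_allele / len(profiles) if profiles else 0.0
        rows.append(
            {
                "Locus": locus,
                "Number of alleles": len(known),
                "Percentage of sequence types": pct,
            }
        )
    return rows


# Made-up example: allele 3 of locusA is not part of any sequence type.
alleles = {"locusA": {"1", "2", "3"}, "locusB": {"1"}}
profiles = {
    "ST1": {"locusA": "1", "locusB": "1"},
    "ST2": {"locusA": "2"},
}
rows = locus_table(alleles, profiles)
```

In this example locusA has three known alleles although only two are used by a sequence type, which matches the note that not all alleles need to be part of a sequence type.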
The lower table lists the alleles for the selected loci. It has the following columns:
- Allele name: the name of the allele.
- Sequence length: length in nucleotides.
- Creation date: when the allele was added.
- Gene info: antimicrobial resistance (AMR) or virulence-related information.
- Sequence types: the sequence types that contain this allele.
It is possible to Align Selected Alleles, which creates a new multiple sequence alignment view, or to Extract Selected Alleles, which creates a sequence list with the alleles.
Figure 14.3: The Sequence Type table.
The Sequence Type table view (figure 14.3) shows the sequence types in the scheme. It always contains the following columns:
- ST: the name of the sequence type
- Number of loci: the number of loci that are defined for this sequence type. Strict core genome schemes and classic 7-gene schemes will have the same number of loci for all sequence types.
Several other columns with arbitrary metadata information may be present as well.
At the bottom of the view, two buttons make it possible to Select Sequence Types in Other Views and to Create Large Sub Scheme.
Figure 14.4: The Create Large MLST Subscheme options.
The Create Large Sub Scheme tool has the same options (figure 14.4) as the other scheme creation tools, plus some additional options for pruning the scheme:
- Locus fractional presence: specifies the fraction of the selected sequence types that must contain a given locus before it is added to the new scheme. For instance, a value of 0.95 means that the resulting scheme only contains loci present in at least 95% of the selected sequence types (a loose core genome scheme).
- Keep all alleles: if this option is deselected, only alleles that are part of at least one sequence type are retained.
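The effect of the two pruning options can be sketched as follows. This is a simplified illustration under stated assumptions, not the tool's actual implementation: the function name `create_sub_scheme`, its parameter names, and the example data are hypothetical, and the real tool has further options that are not modeled here.

```python
def create_sub_scheme(alleles, profiles, selected_sts,
                      locus_fractional_presence=1.0, keep_all_alleles=True):
    """Prune a scheme to the selected sequence types (illustrative sketch).

    alleles:  {locus name -> set of known allele names}
    profiles: {sequence type name -> {locus name -> allele name}}
    """
    selected = {st: profiles[st] for st in selected_sts}
    n = len(selected)

    # Keep a locus only if enough of the selected sequence types contain it.
    kept_loci = [
        locus for locus in alleles
        if n and sum(locus in p for p in selected.values()) / n
        >= locus_fractional_presence
    ]

    # Restrict each profile to the kept loci.
    new_profiles = {
        st: {locus: a for locus, a in p.items() if locus in kept_loci}
        for st, p in selected.items()
    }

    new_alleles = {}
    for locus in kept_loci:
        if keep_all_alleles:
            new_alleles[locus] = set(alleles[locus])
        else:
            # Retain only alleles used by at least one sequence type.
            used = {p[locus] for p in new_profiles.values() if locus in p}
            new_alleles[locus] = alleles[locus] & used
    return new_alleles, new_profiles


# Made-up example: locusB is present in only 2 of 3 sequence types.
alleles = {"locusA": {"1", "2", "3"}, "locusB": {"1", "2"}}
profiles = {
    "ST1": {"locusA": "1", "locusB": "1"},
    "ST2": {"locusA": "2"},
    "ST3": {"locusA": "1", "locusB": "2"},
}
new_alleles, new_profiles = create_sub_scheme(
    alleles, profiles, ["ST1", "ST2", "ST3"],
    locus_fractional_presence=0.95, keep_all_alleles=False,
)
```

With a threshold of 0.95, locusB is dropped because only two of the three selected sequence types contain it. Since keep all alleles is deselected in this example, allele 3 of locusA is also dropped because no sequence type uses it.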
Finally, the Large MLST Scheme also has a Minimum Spanning Tree view, which is the topic of the next section.