Join alignments
CLC Genomics Workbench can join several alignments into one. This feature can, for example, be used to construct "supergenes" for phylogenetic inference by joining alignments of several disjoint genes into one spliced alignment. Note that when alignments are joined, all their annotations are carried over to the new spliced alignment.
Alignments can be joined using the Join Alignments tool, available at:
Tools | Classical Sequence Analysis | Alignments and Trees | Join Alignments
This opens the dialog shown in figure 24.13. Select the alignments that should be joined. In this example, two alignments are selected. The horizontal arrow buttons can be used to select or deselect alignments.
Figure 24.13: Selecting two alignments to be joined.
Clicking on Next opens the dialog shown in figure 24.14.
Figure 24.14: Selecting order of concatenation when joining two alignments.
To adjust the order of concatenation, click on the name of one of the alignments, and move it up or down using the vertical arrow buttons.
The result of joining the alignments is shown in the lower part of figure 24.15.
Figure 24.15: The upper part of the figure shows the two alignments for "Gene A" and "Gene B", respectively. Each alignment consists of sequences from one gene from five different bacterial isolates. The lower part of the figure shows the result of "Join Alignments". The two genes have been joined into an artificial gene fusion, which can be useful for construction of phylogenetic trees in cases where only fractions of a genome are available. Joining the alignments results in one row for each isolate, consisting of the two fused genes. The number of fused sequences corresponds to the number of uniquely named sequences in the joined alignments.
How alignments are joined
Alignments are joined by considering the sequence names in the individual alignments. If two sequences from different alignments have identical names, they are considered to have the same origin and are thus joined. Consider the joining of the alignments shown in figure 24.15 ("Gene A alignment" and "Gene B alignment"). Sequences with the same names are found in both alignments (in this case the names of the isolates: Isolate 1, Isolate 2, Isolate 3, Isolate 4, and Isolate 5), so the joined alignment contains one joined sequence for each sequence name. In the joined alignment, the sequences from the selected alignments are fused with each other in the chosen order of concatenation (in this case the two different genes from the five isolates).
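The name-based joining described above can be illustrated with a short Python sketch. This is not the Workbench's implementation, only a minimal model of the idea: each alignment is represented as a dictionary mapping sequence names to aligned strings, and the function name `join_alignments` and the example sequences are hypothetical. The manual text does not say what happens when a name is missing from one of the alignments; the sketch assumes that such rows are padded with gap characters.

```python
def join_alignments(alignments, gap="-"):
    """Concatenate alignments row by row, matching rows on sequence name.

    `alignments` is a list of dicts (name -> aligned sequence), given in
    the desired order of concatenation.
    ASSUMPTION: a name that is absent from one alignment is padded with
    gaps for that block; the manual does not describe this case.
    """
    # Collect every unique sequence name, preserving first-seen order.
    names = []
    for alignment in alignments:
        for name in alignment:
            if name not in names:
                names.append(name)

    joined = {name: "" for name in names}
    for alignment in alignments:
        # All rows of one alignment must have the same length.
        lengths = {len(seq) for seq in alignment.values()}
        if len(lengths) > 1:
            raise ValueError("rows of an alignment must have equal length")
        width = lengths.pop() if lengths else 0
        for name in names:
            joined[name] += alignment.get(name, gap * width)
    return joined


# Hypothetical example data: two genes from the same two isolates.
gene_a = {"Isolate 1": "ACG-T", "Isolate 2": "ACGGT"}
gene_b = {"Isolate 1": "TT-A", "Isolate 2": "TTGA"}

joined = join_alignments([gene_a, gene_b])
reversed_order = join_alignments([gene_b, gene_a])
for name, sequence in joined.items():
    print(name, sequence)
```

Passing the alignments in a different order changes the order of the fused blocks in each row, which mirrors the effect of rearranging the alignments in the order-of-concatenation dialog.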
