Create Trees
For a given set of aligned sequences (see Create an alignment), it is possible to infer their evolutionary relationships. In CLC Genomics Workbench, this can be done using either a distance-based method or maximum likelihood (ML) estimation, which is a statistical approach (see Bioinformatics explained). Both approaches generate a phylogenetic tree.
Three tools are available for generating phylogenetic trees:
- K-mer Based Tree Construction is a distance-based method that can create trees from multiple individual sequences. K-mers are used to compute distance matrices for distance-based phylogenetic reconstruction tools such as neighbor joining and UPGMA (see Distance-based methods). This method is less precise than the "Create Tree" tool, but it can cope with a very large number of long sequences because it does not require a multiple alignment. The tool is especially useful for whole genome phylogenetic reconstruction where the genomes are closely related, i.e. they differ mainly by SNPs and contain few or no structural variations.
- Maximum Likelihood Phylogeny is the most advanced and time-consuming of the three methods. The maximum likelihood tree estimation is performed under the assumption of one of five substitution models, including the Jukes-Cantor, Kimura 80, HKY and GTR (also known as the REV model) models (see Maximum Likelihood Phylogeny for further information about the models). Before using the "Maximum Likelihood Phylogeny" tool to create a phylogenetic tree, it is recommended to run the Model Testing tool to identify the most suitable model for the data.
- Create Tree is a tool that uses distance estimates computed from multiple alignments to create trees. The user can select either Jukes-Cantor distance correction or Kimura distance correction (Kimura 80 for nucleotides, Kimura protein for proteins), in combination with either the neighbor joining or the UPGMA method (see Distance-based methods).
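To make the distance estimates mentioned above more concrete, the following Python sketch computes the textbook Jukes-Cantor and Kimura 80 corrected distances for a pair of aligned nucleotide sequences, and a simple alignment-free k-mer distance for unaligned sequences. It is an illustration only and does not reproduce the internal implementation of CLC Genomics Workbench. All function names are hypothetical, and the k-mer measure (Euclidean distance between k-mer frequency profiles) is an assumption chosen for simplicity, not necessarily the measure the K-mer Based Tree Construction tool uses.

```python
import math
from collections import Counter

NUCLEOTIDES = set("ACGT")
PURINES = set("AG")


def _comparable_sites(seq_a, seq_b):
    """Return aligned column pairs where both sequences have an unambiguous base."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a in NUCLEOTIDES and b in NUCLEOTIDES]
    if not pairs:
        raise ValueError("no comparable sites")
    return pairs


def jukes_cantor_distance(seq_a, seq_b):
    """Jukes-Cantor distance: d = -3/4 * ln(1 - 4p/3), p = fraction of differing sites."""
    pairs = _comparable_sites(seq_a, seq_b)
    p = sum(a != b for a, b in pairs) / len(pairs)
    if p >= 0.75:
        return math.inf  # correction is undefined at saturation
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def kimura80_distance(seq_a, seq_b):
    """Kimura 80 distance: d = -1/2 * ln(1 - 2P - Q) - 1/4 * ln(1 - 2Q).

    P is the fraction of transitions (A<->G, C<->T), Q the fraction of transversions.
    """
    pairs = _comparable_sites(seq_a, seq_b)
    n = len(pairs)
    transitions = sum(a != b and (a in PURINES) == (b in PURINES) for a, b in pairs)
    transversions = sum((a in PURINES) != (b in PURINES) for a, b in pairs)
    p, q = transitions / n, transversions / n
    term1, term2 = 1.0 - 2.0 * p - q, 1.0 - 2.0 * q
    if term1 <= 0 or term2 <= 0:
        return math.inf  # correction is undefined at saturation
    return -0.5 * math.log(term1) - 0.25 * math.log(term2)


def kmer_profile(seq, k):
    """Relative frequency of each k-mer in a single (unaligned) sequence."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence is shorter than k")
    return {kmer: c / total for kmer, c in counts.items()}


def kmer_distance(seq_a, seq_b, k=3):
    """Illustrative alignment-free distance: Euclidean distance between k-mer profiles."""
    prof_a, prof_b = kmer_profile(seq_a, k), kmer_profile(seq_b, k)
    kmers = set(prof_a) | set(prof_b)
    return math.sqrt(sum((prof_a.get(x, 0.0) - prof_b.get(x, 0.0)) ** 2 for x in kmers))


if __name__ == "__main__":
    a, b = "ACGTACGTAC", "ACGTACGTAT"
    print("Jukes-Cantor:", jukes_cantor_distance(a, b))
    print("Kimura 80:   ", kimura80_distance(a, b))
    print("k-mer (k=3): ", kmer_distance(a, b, k=3))
```

Computing such a distance for every pair of sequences gives the distance matrix that neighbor joining or UPGMA takes as input. The two corrected distances need aligned sequences, whereas the k-mer distance works on unaligned sequences, which is why a k-mer approach scales to many long sequences.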
Subsections
- K-mer Based Tree Construction
- Create tree
- Model Testing
- Maximum Likelihood Phylogeny
- Bioinformatics explained