ClustalW

ClustalW, introduced in the mid-1990s, improved on earlier progressive alignment methods [Thompson et al., 1994] (see http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=7984417). Although the method is more than ten years old, it remains highly popular and is the method of choice for many researchers. One reason for its popularity is that it is available for most computer platforms and is easy to integrate into websites.

ClustalW shares one limitation with all other standalone alignment programs: none of them can visualize annotations on the aligned sequences. Most researchers align their own sequences to reference sequences, which are often retrieved from one or more of the large public databases, e.g. GenBank. These programs require a Fasta file as input and typically export in their own alignment format. When ClustalW is run from within a CLC Workbench, the alignment is still calculated by ClustalW, but the annotations on the sequences are retained when the alignment is displayed (see figure 2.4).
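To illustrate why annotations are lost with standalone programs, the following minimal Python sketch writes sequences in Fasta format. It is not part of the CLC Workbench; the function name `to_fasta`, the line width of 60, and the example record names are assumptions made for illustration. A Fasta record holds only a name line and the sequence itself, so annotations such as those in a GenBank entry have nowhere to go.

```python
from typing import Iterable, Tuple


def to_fasta(records: Iterable[Tuple[str, str]], width: int = 60) -> str:
    """Format (name, sequence) pairs as Fasta text.

    Each record becomes a '>' header line followed by the sequence,
    wrapped at `width` characters per line. Nothing else is stored,
    which is why annotations do not survive the conversion.
    """
    lines = []
    for name, seq in records:
        lines.append(f">{name}")
        for start in range(0, len(seq), width):
            lines.append(seq[start:start + width])
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical sequences, used only to show the layout.
    print(to_fasta([("my_sequence", "ATGGCCATTGTA"),
                    ("reference_1", "ATGGCGATTGTC")]), end="")
```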

Another advantage of ClustalW is that it uses very little memory, even on rather large sequences, so you can align large sequences without a state-of-the-art computer.

ClustalW parameter settings

ClustalW has a single parameter to set, Guide Tree Algorithm, which can be Normal or Quick. Quick uses a faster but less accurate algorithm to build the alignment guide tree.

Image externalalignment_ClustalW_step2
Figure 2.3: Setting ClustalW alignment parameters
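For comparison, the same choice exists when running the standalone ClustalW 2 program from the command line, where the `-QUICKTREE` option selects the fast guide tree algorithm. The Python sketch below only builds such a command line; it does not run anything. Whether the Workbench uses this exact option internally is an assumption, and the function name `clustalw_command`, the executable name `clustalw2`, and the file names are hypothetical.

```python
from typing import List


def clustalw_command(infile: str, outfile: str, quick_tree: bool = False,
                     executable: str = "clustalw2") -> List[str]:
    """Build an argument list for a standalone ClustalW 2 alignment run.

    quick_tree=False is assumed to correspond to the Normal setting,
    quick_tree=True to the Quick setting (adds -QUICKTREE).
    """
    cmd = [executable, f"-INFILE={infile}", "-ALIGN", f"-OUTFILE={outfile}"]
    if quick_tree:
        # Fast, less accurate guide tree construction.
        cmd.append("-QUICKTREE")
    return cmd


if __name__ == "__main__":
    # Hypothetical file names; pass the list to subprocess.run() to execute.
    print(" ".join(clustalw_command("input.fasta", "output.aln", quick_tree=True)))
```

Building the command as a list rather than a single string avoids shell quoting problems if the file names contain spaces.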

The output of the ClustalW alignment can be seen in figure 2.4. In addition to the alignment output file, a phylogenetic tree file is generated.

Image clustal_output-web
Figure 2.4: Output of ClustalW alignment. Annotations from the original sequences are retained in the output and are shown when the Annotation layout is set to Show annotation.