Create Heat Map for Abundance Table

The hierarchical clustering groups features by the similarity of their genomes over the set of samples, and clusters samples by the similarity of genomes over their features. Each clustering has a tree structure that is generated as follows:

  1. The tool considers each feature or sample to be a cluster.
  2. It calculates pairwise distances between all clusters and joins the two closest clusters into one new cluster.
  3. This process is repeated until there is only one cluster left, which contains all the features or samples.
  4. The tree is then drawn so that the distances between clusters are reflected by the lengths of the branches in the tree.
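
The four steps above can be sketched in a few lines of Python. This is only an illustration of the general agglomerative procedure, not the tool's implementation: the Euclidean distance, the average linkage, and the function names (`euclidean`, `average_linkage`, `agglomerate`) are assumptions chosen for the example.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two abundance profiles (assumed measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(c1, c2, profiles):
    """Mean pairwise distance between the members of two clusters (assumed linkage)."""
    total = sum(euclidean(profiles[i], profiles[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def agglomerate(profiles):
    """Return the merges as (members_a, members_b, distance), in merge order."""
    # Step 1: every feature (or sample) starts as its own cluster.
    clusters = [(i,) for i in range(len(profiles))]
    merges = []
    # Step 3: repeat until a single cluster remains.
    while len(clusters) > 1:
        # Step 2: find the two closest clusters.
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_linkage(clusters[a], clusters[b], profiles)
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        # Step 4 uses this distance as the height of the new tree node.
        merges.append((clusters[a], clusters[b], d))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return merges


# Four toy profiles forming two obvious pairs.
profiles = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]]
merges = agglomerate(profiles)
for left, right, distance in merges:
    print(left, right, round(distance, 3))
```

In this toy input, items 0 and 1 are joined first, then items 2 and 3, and the two pairs are joined last at a much larger distance, which is what a long branch in the tree would represent.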

The Create Heat Map for Abundance Table tool uses the TMM normalization (described in http://resources.qiagenbioinformatics.com/manuals/clcgenomicsworkbench/current/index.php?manual=RNA_seq.html) to make samples comparable, then applies a z-score normalization to make features comparable.
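
The z-score step can be illustrated as follows. This is a minimal sketch that assumes the values have already been normalized across samples (the TMM step is not reproduced here) and that the sample standard deviation is used; the function name `zscore_rows` and the feature names are hypothetical.

```python
import statistics


def zscore_rows(table):
    """Scale each feature to mean 0 and standard deviation 1 across samples.

    table maps a feature name to its abundances, one value per sample.
    """
    scaled = {}
    for feature, values in table.items():
        mean = statistics.fmean(values)
        sd = statistics.stdev(values)  # sample standard deviation (assumption)
        if sd == 0:
            # A feature that is constant across samples carries no contrast.
            scaled[feature] = [0.0] * len(values)
        else:
            scaled[feature] = [(v - mean) / sd for v in values]
    return scaled


table = {
    "OTU_1": [10.0, 20.0, 30.0],
    "OTU_2": [1000.0, 2000.0, 3000.0],
    "OTU_3": [5.0, 5.0, 5.0],
}
scaled = zscore_rows(table)
print(scaled)
```

OTU_1 and OTU_2 differ a hundredfold in abundance but follow the same pattern across samples, so they receive identical z-scores. This is the sense in which the normalization makes features comparable.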

To create a heat map:

        Metagenomics | Abundance Analysis | Create Heat Map for Abundance Table

Select an abundance table with more than one sample as input (i.e., an OTU table, or a merged functional or profiling table) and specify a distance measure and a cluster linkage (figure 7.12). The distance measure specifies how the distance between two features or samples is calculated. The cluster linkage specifies how the distance between two clusters, each consisting of a number of features or samples, is calculated. Learn more about how distances and clusters are calculated at http://resources.qiagenbioinformatics.com/manuals/clcgenomicsworkbench/current/index.php?manual=Clustering_features_samples.html.
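
To make the difference between the two settings concrete, the sketch below implements two common distance measures and three common linkages. These are textbook definitions used for illustration; the options actually offered by the tool are described on the page linked above, and the function names here are hypothetical.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute differences between two profiles."""
    return sum(abs(x - y) for x, y in zip(a, b))


def linkage_distance(cluster_a, cluster_b, dist, how):
    """Distance between two clusters, derived from member-to-member distances."""
    pairwise = [dist(p, q) for p in cluster_a for q in cluster_b]
    if how == "single":      # closest pair of members
        return min(pairwise)
    if how == "complete":    # farthest pair of members
        return max(pairwise)
    if how == "average":     # mean over all pairs of members
        return sum(pairwise) / len(pairwise)
    raise ValueError(f"unknown linkage: {how}")


# The distance measure compares two individual profiles.
print(euclidean([0.0, 0.0], [3.0, 4.0]))   # 5.0
print(manhattan([0.0, 0.0], [3.0, 4.0]))   # 7.0

# The linkage compares two clusters of profiles.
cluster_a = [[0.0, 0.0], [1.0, 0.0]]
cluster_b = [[4.0, 0.0], [6.0, 0.0]]
for how in ("single", "complete", "average"):
    print(how, linkage_distance(cluster_a, cluster_b, euclidean, how))
```

For the two toy clusters, the single, complete, and average linkages give 3.0, 6.0, and 4.5 respectively, so the choice of linkage can change which clusters are joined first.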

Figure 7.12: Select an abundance table.

After having selected the distance measure, set up the feature filtering options (figure 7.13).

Figure 7.13: Set filtering options.

Genomes usually contain too many features to allow for a meaningful visualization, and clustering hundreds of thousands of features is very time consuming. It is therefore recommended to reduce the number of features before clustering and visualization. Several filter settings are available for this purpose (figure 7.13).
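
The individual filter settings are not reproduced in this section. As an illustration of the general idea only, the sketch below drops features that never reach a minimum count and then keeps a fixed number of the most variable ones. Both criteria, the parameter names `min_count` and `keep`, and the function name `filter_features` are assumptions made for the example, not the tool's documented behavior.

```python
import statistics


def filter_features(table, min_count, keep):
    """Reduce a feature table before clustering (hypothetical criteria).

    table maps a feature name to its counts, one value per sample.
    """
    # Discard features that never reach min_count in any sample.
    candidates = {f: v for f, v in table.items() if max(v) >= min_count}
    # Rank the remaining features by variance across samples, highest first.
    ranked = sorted(
        candidates,
        key=lambda f: statistics.variance(candidates[f]),
        reverse=True,
    )
    return {f: candidates[f] for f in ranked[:keep]}


table = {
    "OTU_1": [100, 5, 60],
    "OTU_2": [50, 52, 49],
    "OTU_3": [1, 0, 2],
    "OTU_4": [0, 80, 10],
}
filtered = filter_features(table, min_count=10, keep=2)
print(list(filtered))
```

Here OTU_3 is removed for being too rare, and OTU_2 is removed because it barely varies between samples, leaving the two features that contribute most to the contrast in a heat map.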

The tool generates a heat map showing the abundance of each feature in each sample and showing the sample clustering and/or feature clustering as a binary tree over the samples and features, respectively (figure 7.14).

Figure 7.14: Heat map.

To create a heat map at a specific taxonomic level, use the option "Aggregate feature" in the right-hand side panel of the abundance table. When an abundance table is aggregated, for example by class, a new column called "Class (Aggregated)" containing the class names is created. This name is then used when creating a heat map, which avoids very long feature names in abundance tables and downstream analysis tools.
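
Conceptually, aggregation collapses all features that share a name at the chosen level into a single row. The sketch below assumes that abundances are summed per sample and that features lacking the level are grouped under a placeholder name; both points, as well as the function name `aggregate_by_level` and the example data, are assumptions made for illustration.

```python
def aggregate_by_level(abundances, taxonomy, level):
    """Sum abundances of features that share the same name at a taxonomic level.

    abundances maps a feature name to its counts, one value per sample.
    taxonomy maps a feature name to a dict of level name -> taxon name.
    """
    aggregated = {}
    for feature, counts in abundances.items():
        # "Unclassified" is a hypothetical placeholder for missing levels.
        name = taxonomy[feature].get(level, "Unclassified")
        if name not in aggregated:
            aggregated[name] = [0] * len(counts)
        aggregated[name] = [a + c for a, c in zip(aggregated[name], counts)]
    return aggregated


abundances = {
    "OTU_1": [10, 0],
    "OTU_2": [5, 3],
    "OTU_3": [1, 7],
}
taxonomy = {
    "OTU_1": {"Class": "Bacilli"},
    "OTU_2": {"Class": "Bacilli"},
    "OTU_3": {"Class": "Clostridia"},
}
by_class = aggregate_by_level(abundances, taxonomy, "Class")
print(by_class)
```

The resulting table has one short row name per class, which is the kind of label that keeps a heat map readable.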