Create Methylation Level Heat Map

The Create Methylation Level Heat Map tool generates a two-dimensional heat map of methylation levels. Each column corresponds to a sample, and each row corresponds to a feature (a single CpG site or a larger target region containing multiple CpG sites, e.g. a promoter region). A hierarchical clustering of the samples is performed. For up to 5000 features, a hierarchical clustering of features is also performed.

Methylation levels are calculated across all CpG sites in a given target. A CpG site whose coverage is lower than the specified threshold is treated as uninformative, and its methylation level is set to zero. For targets containing multiple CpG sites, only informative sites are considered, and the methylation level is averaged across those sites. For targets containing a single CpG site, the methylation level of that site is used.
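The calculation described above can be sketched in a few lines of Python. This is an illustration only, not the tool's implementation. The function name, the input layout (pairs of methylated read count and total coverage), the use of the methylated fraction as the per-site level, and the choice to average per-site levels rather than pooled counts are all assumptions made for the example. The default threshold of 30 matches the default minimum coverage of the tool.

```python
def target_methylation_level(sites, min_coverage=30):
    """Mean methylation level over the informative CpG sites of one target.

    `sites` is a list of (methylated_reads, total_coverage) pairs.
    This layout is hypothetical and chosen only for this sketch.
    """
    # A site is informative only if its coverage reaches the threshold.
    # The extra `c > 0` check avoids dividing by zero if min_coverage is 0.
    informative = [(m, c) for m, c in sites if c >= min_coverage and c > 0]
    if not informative:
        # No informative site: report zero, mirroring how low coverage
        # sites are treated as zero methylated.
        return 0.0
    # Assumed per-site level: fraction of reads that are methylated.
    levels = [m / c for m, c in informative]
    # The target level is the mean over the informative sites only.
    return sum(levels) / len(levels)


# A target with three CpG sites; the third falls below the threshold.
print(target_methylation_level([(30, 60), (45, 60), (5, 10)]))
```

In this example the third site is ignored, so the target level is the mean of 0.5 and 0.75.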

Clustering of features and samples

Features are clustered according to the similarity of their methylation level profiles over the set of samples. Samples are clustered according to the similarity of their methylation level patterns over the set of features.

The clustering has a tree structure that is generated by:

  1. Letting each feature or sample be a cluster.
  2. Calculating pairwise distances between all clusters.
  3. Joining the two closest clusters into one new cluster.
  4. Repeating steps 2 and 3 until a single cluster, containing all the features or samples, remains.

The tree is drawn such that the distances between clusters are reflected by the lengths of the branches in the tree.
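The procedure above is standard agglomerative clustering. The following minimal sketch mirrors it in plain Python. Euclidean distance and average linkage are used here only as assumptions for illustration, and the function names and the merge-history return format are hypothetical. The recorded merge distances are the kind of values that the branch lengths in the tree reflect.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Distance between two methylation level profiles (assumed measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(c1, c2, profiles):
    """Mean pairwise distance between the members of two clusters."""
    pairs = [(i, j) for i in c1 for j in c2]
    return sum(euclidean(profiles[i], profiles[j]) for i, j in pairs) / len(pairs)


def cluster(profiles):
    """Return the merge history as (members_a, members_b, distance) tuples."""
    # Each feature or sample starts as its own cluster.
    clusters = [(i,) for i in range(len(profiles))]
    merges = []
    # Repeat until a single cluster containing everything remains.
    while len(clusters) > 1:
        # Compare all pairs of current clusters and find the closest pair.
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: average_linkage(pair[0], pair[1], profiles),
        )
        distance = average_linkage(a, b, profiles)
        # Join the two closest clusters into one new cluster.
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        merges.append((a, b, distance))
    return merges


# Three samples described by their methylation levels at two features.
profiles = [[0.1, 0.2], [0.15, 0.25], [0.9, 0.8]]
for members_a, members_b, distance in cluster(profiles):
    print(members_a, members_b, round(distance, 3))
```

This sketch recomputes all pairwise distances in every iteration, which is simple to read but slow for large inputs.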

Running the tool

Go to:

        Toolbox | Epigenomics Analysis | Bisulfite Sequencing | Create Methylation Level Heat Map

The tool takes as input methylation level tracks generated using the Call Methylation Levels tool with the "Report unmethylated cytosines" option selected, as shown in figure 11.23. This option is enabled by default when running the Detect QIAseq Methylation template workflow.

For valid comparisons to be made across samples, the inputs must have been generated using the same reference information, i.e. the same reference genome, target regions, etc.

Image methylation_call_settings
Figure 11.23: The "Report unmethylated cytosines" option in Call Methylation Levels should be enabled when generating methylation level tracks for use with Create Methylation Level Heat Map.

In the wizard step shown in figure 11.24, select the target region track containing the CpG sites. These may be single CpGs or larger targets (e.g. promoter regions). If no target region track is selected, single CpG sites from the methylation level tracks are used as features in the heat map.

At the bottom of this step, specify the minimum CpG site coverage. CpG sites with coverage below this value are excluded from the analysis. The default value is 30. When only single CpG sites are analyzed, the methylation level of low coverage sites is set to 0.

A distance measure and a cluster linkage method for the hierarchical clustering are also specified here. The distance measure specifies how the distance between two features or two samples is calculated. The cluster linkage method specifies how the distance between two clusters, each consisting of a number of features or samples, is calculated.
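The difference between the two settings can be illustrated with a small sketch. Manhattan distance and the single and complete linkage rules are used here only as familiar examples. They are assumptions of this sketch and are not presented as a listing of the options offered by the tool. All function names are hypothetical.

```python
def manhattan(a, b):
    """Distance measure: compares two individual profiles."""
    return sum(abs(x - y) for x, y in zip(a, b))


def single_linkage(cluster_1, cluster_2, dist):
    """Linkage: cluster distance is the smallest member-to-member distance."""
    return min(dist(p, q) for p in cluster_1 for q in cluster_2)


def complete_linkage(cluster_1, cluster_2, dist):
    """Linkage: cluster distance is the largest member-to-member distance."""
    return max(dist(p, q) for p in cluster_1 for q in cluster_2)


# Two clusters, each holding two methylation level profiles.
cluster_a = [[0.0, 0.0], [0.0, 0.5]]
cluster_b = [[1.0, 0.0], [1.0, 1.0]]

print(single_linkage(cluster_a, cluster_b, manhattan))
print(complete_linkage(cluster_a, cluster_b, manhattan))
```

The same distance measure gives different cluster distances under the two linkage rules, which is why the two settings are chosen independently.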

Image methyl_heatmap_parameter
Figure 11.24: The core options for Create Methylation Level Heat Map.

There are three kinds of Distance measures:

The possible cluster linkages are:

Filtering options are specified in the next step, as shown in figure 11.25.

Image methylation_heatmap_filtering
Figure 11.25: The features to include in results can be customized using filtering options.

The Filter settings options are described below. Some require additional information be provided in the sections underneath.

Create Methylation Level Heat Map generates two outputs: a heat map and a methylation expression track.

The methylation level heat map

Each row in the heat map corresponds to a feature (target region or single CpG site). Each column corresponds to a sample. The color in the $i$'th row and $j$'th column reflects the methylation level of feature $i$ in sample $j$. The color scale can be set in the side panel settings. Heat map settings are described further at:

Image methyl_heatmap_output
Figure 11.26: A methylation level heat map.

The methylation expression track

The methylation expression track includes information from all the samples provided as input. It can be viewed as a graphical track or as a table, as shown in figure 11.27.

The following information is available for each feature:

The following four columns are provided for each sample, with the relevant sample name appended to the column name.

Viewing selected features

The heat map and methylation expression track created by Create Methylation Level Heat Map are linked. Selected elements in one of these outputs can be highlighted in the other. Open both outputs, preferably in a split view, and then:

The selections made in one of the outputs will now be selected in the other.

Image methyl_table_options
Figure 11.27: Methylation level results shown in a table view. After selecting rows in the table, the buttons highlighted can be used to work with the selection in various ways.

Viewing results in context using a track list

Methylation expression tracks can be included in a track list with other relevant tracks, such as read mapping and annotation tracks, as shown in figure 11.28.

Further details about working with track lists can be found at:

Image methyl_heatmap_tracklist
Figure 11.28: Methylation results in the context of a track list, with the table view of the methylation expression track open in the bottom of the split view.