Create Expression Plot

Create Expression Plot generates visualizations of gene expression for a small number of genes. It uses expression values from an input Expression Matrix and groupings provided by Cell Clusters or Cell Annotations.

The tool can output a Heat Map, a Dot Plot and a Violin Plot.

It is often most natural to run the tool from a Dimensionality Reduction Plot, by right-clicking on the plot. However, it can also be found in the Toolbox here:

        Expression Analysis | Create Expression Plot

The first set of options controls how cells are grouped. The groupings are shown at the top of the Heat Map, form the columns of the Dot Plot and define the groups in the Violin Plot. These options are:

The genes in the output Heat Map or Dot Plot are clustered such that genes with similar expression patterns are found on adjacent rows. The clustering has a tree structure that is generated by:

  1. Letting each gene be its own cluster.
  2. Calculating pairwise distances between all clusters.
  3. Joining the two closest clusters into one new cluster.
  4. Repeating steps 2 and 3 until there is only one cluster left (which contains all the genes).

In the Heat Map, the clustering is drawn as a tree where distances between clusters are reflected by the lengths of the branches in the tree.

The above algorithm requires a distance measure and a 'linkage' that describes how to apply the distance measure to clusters.
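The four steps can be sketched in a few lines of Python. This is only an illustration, not the tool's implementation: it assumes Euclidean distance and average linkage, and the names `average_linkage`, `agglomerate` and `toy_profiles` are hypothetical.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def average_linkage(a, b, profiles):
    """Mean Euclidean distance over all pairs of genes drawn from clusters a and b."""
    return sum(dist(profiles[i], profiles[j]) for i in a for j in b) / (len(a) * len(b))


def agglomerate(profiles):
    """Return the merges as (cluster_a, cluster_b, distance) tuples, in merge order."""
    # Step 1: every gene starts as its own cluster.
    clusters = [(name,) for name in profiles]
    merges = []
    # Step 4: repeat until a single cluster holds all genes.
    while len(clusters) > 1:
        # Step 2: evaluate the distance between every pair of clusters
        # and pick the closest pair.
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: average_linkage(pair[0], pair[1], profiles),
        )
        d = average_linkage(a, b, profiles)
        # Step 3: join the two closest clusters into one new cluster.
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        merges.append((a, b, d))
    return merges


# Hypothetical toy data: expression of four genes across three cell groups.
toy_profiles = {
    "geneA": (1.0, 0.0, 0.0),
    "geneB": (1.1, 0.0, 0.1),
    "geneC": (0.0, 5.0, 5.0),
    "geneD": (0.0, 5.2, 4.9),
}
merges = agglomerate(toy_profiles)
for a, b, d in merges:
    print(a, "+", b, "at distance", round(d, 3))
```

The distance recorded for each merge corresponds to what a tree drawing would show as branch length: genes that merge early, at small distances, end up on adjacent rows.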

There are three kinds of Distance measures:

The possible cluster linkages are:

There are usually too many cells for all of them to be viewed in a Heat Map on a standard computer display. Max cells in heat map constructs the Heat Map by sampling the given number of cells from the full Expression Matrix. This option has no effect on the Dot Plot. The same fixed percentage of cells is sampled from each grouping. For example, if there are 10 000 cells in the input, and 'Max cells in heat map = 1 000', then sampling will aim to recover 1 000 / 10 000 = 10% of the cells for each grouping. In this example, a group with fewer than 5 cells would be omitted, because 10% of fewer than 5 cells is less than 0.5, which is rounded down to 0.
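The arithmetic of the example can be checked with a small Python sketch. It is not the tool's code: the function `sample_per_group`, the group names and the round-half-up rule are assumptions chosen to match the example above, in which groups with fewer than 5 cells receive 0 sampled cells.

```python
import random


def sample_per_group(groups, max_cells, seed=0):
    """Sample the same fraction (max_cells / total) of cells from every group."""
    total = sum(len(cells) for cells in groups.values())
    if total <= max_cells:
        # Nothing to reduce: keep every cell.
        return {name: list(cells) for name, cells in groups.items()}
    rng = random.Random(seed)
    sampled = {}
    for name, cells in groups.items():
        # Assumed rule: round len(cells) * max_cells / total to the nearest
        # integer (half rounds up), using integer arithmetic to avoid float error.
        n = (2 * len(cells) * max_cells + total) // (2 * total)
        if n > 0:  # groups that round to 0 cells are omitted
            sampled[name] = rng.sample(cells, n)
    return sampled


# Hypothetical input: 10 000 cells in three groups of very different sizes.
groups = {
    "large": [f"large_{i}" for i in range(9946)],
    "medium": [f"medium_{i}" for i in range(50)],
    "tiny": [f"tiny_{i}" for i in range(4)],
}
sampled = sample_per_group(groups, max_cells=1000)
print({name: len(cells) for name, cells in sampled.items()})
```

With these inputs the sampling fraction is 10%, so the 4-cell group rounds to 0 cells and does not appear in the result.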

There are also usually too many genes to allow a meaningful visualization of all of them. Therefore, several options can be used to select the most informative genes to visualize: