Cluster Single Cell Data

Cluster Single Cell Data uses graph-based clustering to group cells automatically. Typically, the aim is to recover clusters that correspond to cells of different types or with different behavior.

The tool takes an Expression Matrix, a Peak Count Matrix, or both types of matrix as input, and produces a Cell Clusters result. Note that when both types of matrix are provided, only cells that are present in both matrices are used.
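The restriction to shared cells can be pictured as a simple intersection of cell identifiers. The following is a minimal sketch, not the tool's implementation: the function name common_cells and the barcodes are hypothetical and serve only to illustrate the idea.

```python
def common_cells(expression_cells, peak_cells):
    """Return the cell identifiers present in both matrices, in expression-matrix order."""
    peak_set = set(peak_cells)  # a set gives constant-time membership checks
    return [cell for cell in expression_cells if cell in peak_set]


# Hypothetical cell barcodes, for illustration only.
expression_cells = ["AAAC", "AAAG", "AACT", "AAGT"]
peak_cells = ["AAAG", "AAGT", "ACCA"]

shared = common_cells(expression_cells, peak_cells)
print(shared)  # ['AAAG', 'AAGT']
```

Cells that appear in only one of the two matrices ("AAAC", "AACT" and "ACCA" here) would not contribute to the clustering.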

It can be found in the Toolbox here:

        Cell Annotation | Cluster Single Cell Data

The tool offers options to run dimensionality reduction or feature selection prior to clustering. For details on these options, please see Feature selection and dimensionality reduction. The following additional options are available:

The result of clustering is a Cell Clusters element containing clusters at different resolutions. It is easiest to view these in a Dimensionality Reduction Plot.

Generally speaking, a good clustering has a distinct cluster for each large clump of cells that appears, by eye, to form a cluster in the Dimensionality Reduction Plot. If this is not the case, the resolution may be too low (as in figure 12.1, compared with figure 12.2). Unfortunately, it can be hard to tell when the resolution is too high, but generally one or more of the clusterings at the default resolutions will be suitable for downstream analysis.

Image restoolow
Figure 12.1: Clustering with too low resolution. Clusters that are distinct by eye are given the same color. Examples include the three dark blue clusters at the top-right corner of the plot, and the two turquoise clusters at x=-20. Data is from [MacParland et al., 2018].

Image resbetter
Figure 12.2: A higher resolution clustering of the same data as in figure 12.1. Each cluster that seems distinct by eye is now given its own color. The resolution is no longer too low. It can be difficult to determine whether the resolution is too high.
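This page does not state which objective the tool optimizes. As an illustration only, the sketch below assumes a modularity-style score with a resolution parameter, which is a common choice in graph-based clustering. It shows why a low resolution tends to merge groups that look distinct. The function modularity, the toy graph and the two partitions are all hypothetical.

```python
from collections import defaultdict


def modularity(edges, labels, resolution=1.0):
    """Modularity-style score of a partition of an unweighted, undirected graph.

    Each cluster contributes (internal edges / m) - resolution * (cluster degree / 2m)^2,
    where m is the total number of edges.
    """
    m = len(edges)
    internal = defaultdict(int)  # edges with both ends in the same cluster
    degree = defaultdict(int)    # sum of node degrees per cluster
    for u, v in edges:
        degree[labels[u]] += 1
        degree[labels[v]] += 1
        if labels[u] == labels[v]:
            internal[labels[u]] += 1
    return sum(
        internal[c] / m - resolution * (degree[c] / (2 * m)) ** 2
        for c in degree
    )


# Toy graph: two triangles of cells joined by a single edge (2-3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
merged = {node: 0 for node in range(6)}          # one cluster for everything
split = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}     # one cluster per triangle

for resolution in (0.1, 1.0):
    print(
        f"resolution={resolution}: "
        f"merged={modularity(edges, merged, resolution):.3f}, "
        f"split={modularity(edges, split, resolution):.3f}"
    )
```

Under this assumed score, the merged partition scores higher at resolution 0.1, while the split partition scores higher at resolution 1.0. This mirrors the difference between figure 12.1 and figure 12.2.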

As the aim of clustering is usually to have clusters that correspond to different cell types, it is possible, from the Dimensionality Reduction Plot, to redraw the boundaries between clusters, to add new clusters, and to rename clusters. These changes might be based on insights from other sources of information such as:


