The Cluster Single Cell Data algorithm

Cluster Single Cell Data is a graph-based clustering method. It proceeds in three phases:

  1. Construction of a k-nearest neighbor (kNN) graph, where each cell is a node with edges to its k nearest neighbors.
  2. Construction of a Shared Nearest Neighbor (SNN) graph from the kNN graph using the method of [Xu and Su, 2015]. In the SNN graph each cell is again a node, but two cells are connected by an edge only if they share at least one nearest neighbor in the kNN graph. The neighbors of each cell in the kNN graph are ranked from 1 (the cell itself, because each cell is its own closest neighbor) to k (the most distant neighbor). For each neighbor shared by two cells, its ranks in the two cells' neighbor lists are averaged, and the edge weight is determined by the best (lowest) of these average ranks. As a result, edges connecting cells that share close nearest neighbors are weighted higher than edges connecting cells that share only distant nearest neighbors.
  3. Application of Leiden community detection to the weighted SNN graph [Traag et al., 2019].
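
The first two phases can be illustrated with a minimal Python sketch. It is not the product's implementation: the function names knn_ranks and snn_edges are hypothetical, Euclidean distance and a brute-force neighbor search are assumed, and the weight formula (k minus the best average rank) is one plausible reading of the description above, which only states that closer shared neighbors give higher weights. The Leiden step is left out because it would normally be delegated to a dedicated library.

```python
import math


def knn_ranks(points, k):
    """For each cell, map neighbor index -> rank (1 = the cell itself, k = most distant)."""
    ranks = []
    for i, p in enumerate(points):
        # Sort all cells by distance to cell i. The (j != i) term forces
        # cell i to the front so that it always receives rank 1.
        order = sorted(
            range(len(points)),
            key=lambda j: (j != i, math.dist(p, points[j]), j),
        )
        ranks.append({j: r for r, j in enumerate(order[:k], start=1)})
    return ranks


def snn_edges(ranks, k):
    """Build weighted SNN edges from the per-cell neighbor ranks."""
    edges = {}
    n = len(ranks)
    for i in range(n):
        for j in range(i + 1, n):
            shared = ranks[i].keys() & ranks[j].keys()
            if not shared:
                continue  # no shared neighbor, so no edge
            # Average each shared neighbor's rank over the two lists,
            # then keep the best (lowest) average.
            best = min((ranks[i][v] + ranks[j][v]) / 2 for v in shared)
            # Assumed weighting: a lower best average rank gives a higher weight.
            edges[(i, j)] = k - best
    return edges


# Toy data: two well-separated groups of three cells each.
points = [(0.0,), (0.1,), (0.25,), (10.0,), (10.1,), (10.25,)]
edges = snn_edges(knn_ranks(points, k=3), k=3)
```

On this toy data every edge stays within one of the two groups, and the pair of cells that are each other's closest neighbors gets a higher weight than the pair that are not. The resulting weighted edge list is what a Leiden implementation would take as input in phase 3.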

The Leiden community detection algorithm has two hyperparameters. These are set as follows: