Interpreting the output of Differential Expression for Single Cell

Differential Expression for Single Cell produces one or more Statistical Comparison Tables (Image sc_stat_comp_16_n_p).

For each gene, the table has several columns whose interpretation depends on whether the tests performed are `All group pairs' or `Identify marker genes'. The difference in interpretation arises because the output of `Identify marker genes' is a summary of several pairwise comparisons of the kind produced by `All group pairs'.

For example, with three groups, `Platelet', `B cell', and `T cell', `All group pairs' will perform tests such as `Platelet vs B cell', whereas `Identify marker genes' will perform tests such as `Platelet vs rest'. `Platelet vs rest' will be a summary of the pairwise comparisons `Platelet vs B cell' and `Platelet vs T cell'.
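
The relationship between the two test types can be sketched in a few lines of Python. This is only an illustration of which comparisons exist and which pairwise comparisons each `vs rest' test draws on. The function names are hypothetical, the ordering of groups within a pair is an assumption, and the sketch does not reproduce how the tool combines the pairwise results into a summary.

```python
from itertools import combinations

groups = ["Platelet", "B cell", "T cell"]


def all_group_pairs(groups):
    """List one comparison per unordered pair of groups (assumed ordering)."""
    return [f"{a} vs {b}" for a, b in combinations(groups, 2)]


def marker_gene_tests(groups):
    """Map each 'group vs rest' test to the pairwise comparisons behind it."""
    return {
        f"{g} vs rest": [f"{g} vs {other}" for other in groups if other != g]
        for g in groups
    }


pairs = all_group_pairs(groups)
markers = marker_gene_tests(groups)

print(pairs)
for test, parts in markers.items():
    print(test, "<-", parts)
```

With three groups there are three pairwise comparisons and three `vs rest' tests, each of the latter drawing on two pairwise comparisons.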

Differentially expressed genes (DEGs) and clustering. Groups are often defined based on clusters found using a clustering algorithm. Because clustering and differential expression analysis are then performed on the same data, the two analyses are not independent. Even for simulated data in which all cells are generated from the same distribution, random differences in the expression of some genes may drive the formation of clusters, and these same genes will then be found to be DEGs between the clusters. One remedy is to perform clustering on half the data and differential expression on the other half. However, it is more common to simply be cautious about over-interpreting results.
A similar warning applies to groups defined by cell types predicted by Predict Cell Types. The tool works by learning the expression patterns of different genes in different cell types. It is therefore likely that many DEGs between cell types assigned by Predict Cell Types have been implicitly learned by the tool and may not be specific to the dataset being analyzed.
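
The clustering effect described above can be reproduced with a small simulation. The following is a minimal sketch under stated assumptions, not the procedure used by the tool: all cells are drawn from one Gaussian distribution, a toy 2-means routine stands in for the clustering algorithm, and Welch's t statistic stands in for the differential expression test. All names and parameter values are hypothetical.

```python
import math
import random
import statistics


def two_means(cells, rng, iterations=20):
    """Toy 2-means (Lloyd's algorithm); returns a 0/1 label per cell."""
    centers = rng.sample(cells, 2)
    labels = [0] * len(cells)
    for _ in range(iterations):
        # Assign each cell to its nearest center.
        labels = [
            min((0, 1), key=lambda k: math.dist(cell, centers[k]))
            for cell in cells
        ]
        # Move each center to the mean of its members.
        for k in (0, 1):
            members = [c for c, lab in zip(cells, labels) if lab == k]
            if members:  # keep the old center if a cluster becomes empty
                centers[k] = [statistics.fmean(col) for col in zip(*members)]
    return labels


def welch_t(x, y):
    """Welch's t statistic for two samples."""
    se = math.sqrt(
        statistics.variance(x) / len(x) + statistics.variance(y) / len(y)
    )
    return (statistics.fmean(x) - statistics.fmean(y)) / se


def max_abs_t(cells, labels):
    """Largest |t| over all genes when comparing label 0 with label 1."""
    result = 0.0
    for g in range(len(cells[0])):
        x = [c[g] for c, lab in zip(cells, labels) if lab == 0]
        y = [c[g] for c, lab in zip(cells, labels) if lab == 1]
        result = max(result, abs(welch_t(x, y)))
    return result


rng = random.Random(0)
n_cells, n_genes = 200, 5

# Every cell comes from the same distribution: there are no true groups.
cells = [[rng.gauss(0.0, 1.0) for _ in range(n_genes)] for _ in range(n_cells)]

cluster_labels = two_means(cells, rng)
arbitrary_labels = [i % 2 for i in range(n_cells)]  # ignores expression

t_clustered = max_abs_t(cells, cluster_labels)
t_arbitrary = max_abs_t(cells, arbitrary_labels)

print(f"max |t| between clusters:         {t_clustered:.1f}")
print(f"max |t| between arbitrary groups: {t_arbitrary:.1f}")
```

Because the clusters are formed from the same expression values that are later tested, the statistic between clusters is expected to be much larger than between groups chosen without looking at expression, even though no real group structure exists.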

The Statistical Comparison Table also offers a volcano plot view, showing the relationship between the p-values and the log2 fold changes. See https://resources.qiagenbioinformatics.com/manuals/clcgenomicsworkbench/current/index.php?manual=Volcano_plots.html for details.
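
As a rough illustration of what a volcano plot displays, the sketch below converts table rows into plot coordinates. It assumes the conventional layout, with the log2 fold change on the x-axis and -log10 of the p-value on the y-axis; the linked manual page describes the actual view. The gene names and values are made up for illustration only.

```python
import math

# Made-up (gene, log2 fold change, p-value) rows, for illustration only.
rows = [
    ("geneA", 2.5, 1e-8),
    ("geneB", -1.8, 1e-4),
    ("geneC", 0.1, 0.6),
]


def volcano_point(log2_fc, p_value):
    """Return (x, y) = (log2 fold change, -log10 p-value)."""
    return log2_fc, -math.log10(p_value)


points = {gene: volcano_point(fc, p) for gene, fc, p in rows}

for gene, (x, y) in points.items():
    print(f"{gene}: x={x:+.2f}, y={y:.2f}")
```

Genes with large absolute fold changes and small p-values end up in the upper left and upper right of such a plot, while genes with little evidence of differential expression stay near the bottom center.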

Statistical Comparison Tables can be used in several tools from Toolbox | RNA-Seq and Small RNA Analysis (Image rna_seq_group_closed_16_n_p). The most useful of these in a single-cell context are:

It is also possible to automatically upload Statistical Comparison Tables to an existing Ingenuity Pathway Analysis account using the Ingenuity Pathway Analysis plugin (https://digitalinsights.qiagen.com/plugins/ingenuity-pathway-analysis/).

Note that many of these tools have an option to filter features by Max group mean, with a default threshold based on the RPKM measure of expression. This default will often need adjusting for single-cell data, where RPKM is rarely appropriate.