Normalization

Because sequencing depth can differ between samples, a per-sample library size normalization must be performed before samples can be compared. Unlike the classic Transcriptomics Analysis tools, the tools in the Advanced RNA-Seq plugin apply this normalization automatically.
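
As a rough illustration of why library size matters, the sketch below scales raw counts to counts per million (CPM) using the plain library size of each sample. It is an assumption-laden toy example, not the plugin's implementation: the function name and the sample counts are hypothetical.

```python
def counts_per_million(counts):
    """Scale one sample's raw counts so that they sum to one million."""
    library_size = sum(counts)
    if library_size == 0:
        raise ValueError("sample contains no reads")
    return [c * 1_000_000 / library_size for c in counts]


# Two hypothetical samples with identical composition,
# where sample_b was sequenced at twice the depth of sample_a.
sample_a = [100, 300, 600]
sample_b = [200, 600, 1200]

cpm_a = counts_per_million(sample_a)
cpm_b = counts_per_million(sample_b)
```

After scaling, the two samples have the same values for every gene, so the difference in sequencing depth no longer looks like a difference in expression.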

All of the tools in the Advanced RNA-Seq plugin use the TMM (trimmed mean of M values) normalization method [Robinson and Oshlack, 2010] to calculate effective library sizes, which are then used as part of the per-sample normalization. TMM is also the normalization method used in edgeR [Robinson et al., 2010].
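
The idea behind TMM can be sketched as follows. This is a deliberately simplified version and not the plugin's or edgeR's implementation: it trims only on the log ratios (M values), and it omits the trimming on average abundance, the precision weights, and the selection of a reference sample that the published method uses. The function name and the example counts are hypothetical.

```python
import math


def simplified_tmm_factor(sample, reference, trim=0.3):
    """Scaling factor of `sample` relative to `reference`.

    Simplified sketch: an unweighted mean of the M values that remain
    after trimming the lowest and highest `trim` fraction.
    """
    n_sample, n_reference = sum(sample), sum(reference)
    # M value: log2 ratio of relative abundances, for genes seen in both samples.
    m_values = sorted(
        math.log2((s / n_sample) / (r / n_reference))
        for s, r in zip(sample, reference)
        if s > 0 and r > 0
    )
    if not m_values:
        raise ValueError("no genes with counts in both samples")
    k = int(len(m_values) * trim)
    kept = m_values[k:len(m_values) - k] or m_values
    return 2 ** (sum(kept) / len(kept))


# Hypothetical data: ten genes, nine unchanged and one strongly up-regulated.
reference = [100] * 10
sample = [100] * 9 + [1100]

factor = simplified_tmm_factor(sample, reference)
effective_library_size = sum(sample) * factor
```

In this example the raw library size of the sample is 2000 reads, but one highly expressed gene accounts for more than half of them. The trimmed mean ignores that gene, so the effective library size comes out close to 1000, and the nine unchanged genes are then scaled to the same values as in the reference.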

TMM normalization adjusts library sizes based on the assumption that most genes are not differentially expressed. Therefore, it is important not to make subsets of the count data before doing statistical analysis or visualization. If a subset consists largely of differentially expressed genes, the assumption no longer holds, and genuine expression differences can be normalized away.

For the expression visualization tools (Create Heat Map and PCA for RNA-Seq), additional filtering and normalization are performed: