Create experiment analysis workflow
The differential expression analysis report summarizes the analysis results from the following tools. Specific parameters for the individual sample kits are listed in Appendix C.
- Differential Expression for RNA-Seq. Performs a differential expression test using multi-factorial statistics based on a negative binomial GLM. The tool supports paired designs and can control for batch effects.
- Create Heat Map for RNA-Seq. Samples and features are hierarchically clustered to create a two dimensional heat map of expression values. Each column contains data for a particular sample, and each row contains data for a particular feature.
The following filtering and normalization steps are carried out when producing the heat map:
- Log CPM (Counts per Million) values are calculated for each feature. The CPM calculation uses the effective library sizes as calculated by the TMM normalization.
- A Z-score normalization is performed across samples for each gene: the counts for each gene are mean centered, and scaled to unit variance.
- Genes or miRNAs with zero expression across all samples or invalid values (NaN or +/-Infinity) are removed.
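The heat map preprocessing above can be sketched in Python with NumPy. This is a minimal illustration, not the Workbench implementation: the function name `heatmap_matrix`, the log base 2, the pseudocount of 1, and the use of the sample standard deviation are all assumptions, and the TMM normalization factors are taken as given input rather than computed.

```python
import numpy as np

def heatmap_matrix(counts, norm_factors, pseudocount=1.0):
    """Return (z_scores, keep_mask) for a features x samples count matrix.

    norm_factors: per-sample TMM normalization factors, assumed to be
    computed elsewhere (hypothetical input of this sketch).
    """
    counts = np.asarray(counts, dtype=float)
    # Effective library size = raw library size * TMM normalization factor.
    eff_lib = counts.sum(axis=0) * np.asarray(norm_factors, dtype=float)
    # Log CPM; base 2 and the pseudocount are assumptions of this sketch.
    log_cpm = np.log2(counts / eff_lib * 1e6 + pseudocount)
    # Z-score per feature across samples: mean centered, unit variance.
    mean = log_cpm.mean(axis=1, keepdims=True)
    std = log_cpm.std(axis=1, ddof=1, keepdims=True)
    with np.errstate(divide="ignore", invalid="ignore"):
        z = (log_cpm - mean) / std
    # Drop features with zero expression everywhere or invalid values.
    keep = (counts.sum(axis=1) > 0) & np.isfinite(z).all(axis=1)
    return z[keep], keep

z, keep = heatmap_matrix([[10, 20, 30], [0, 0, 0], [5, 5, 50]], [1.0, 1.0, 1.0])
print(keep)
```

In this sketch a feature with no variation across samples produces NaN in the Z-score step and is therefore dropped together with the all-zero features.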
The tools are described in more detail in the QIAGEN CLC Genomics Workbench manual:
- https://resources.qiagenbioinformatics.com/manuals/clcgenomicsworkbench/2103/index.php?manual=Differential_Expression_RNA_Seq.html
- https://resources.qiagenbioinformatics.com/manuals/clcgenomicsworkbench/2103/index.php?manual=Create_Heat_Map_RNA_Seq.html
IPA submission and analysis. Prior to being passed on to IPA, the list of features is marked to indicate which to consider for the IPA analysis. The approach is as follows:
- Select features with an FDR p-value <0.05. If this leaves more than 1000 features, then:
  - Select the 1000 features with the largest absolute fold change, keeping the ratio of up- and down-regulated features.
The uploaded features are used as input for an IPA Core Analysis.
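The selection rule above might look as follows in Python. This is a sketch under stated assumptions: the dictionary keys `fold_change` and `fdr` and the function name `select_for_ipa` are hypothetical, fold changes are assumed to be signed (negative for down-regulated features), and "keeping the ratio" is interpreted as allocating the 1000 slots in proportion to the number of significant up- and down-regulated features.

```python
def select_for_ipa(features, fdr_cutoff=0.05, max_features=1000):
    """Return the features to mark for IPA (hypothetical helper)."""
    significant = [f for f in features if f["fdr"] < fdr_cutoff]
    if len(significant) <= max_features:
        return significant

    def by_abs_fc(f):
        return abs(f["fold_change"])

    # Rank each direction separately by absolute fold change, largest first.
    up = sorted((f for f in significant if f["fold_change"] > 0),
                key=by_abs_fc, reverse=True)
    down = sorted((f for f in significant if f["fold_change"] <= 0),
                  key=by_abs_fc, reverse=True)
    # Allocate slots proportionally to preserve the up/down ratio (assumption).
    n_up = round(max_features * len(up) / len(significant))
    return up[:n_up] + down[:max_features - n_up]

example = [
    {"name": "a", "fold_change": 5.0, "fdr": 0.01},
    {"name": "b", "fold_change": 2.0, "fdr": 0.01},
    {"name": "c", "fold_change": 3.0, "fdr": 0.01},
    {"name": "d", "fold_change": 1.5, "fdr": 0.01},
    {"name": "e", "fold_change": -4.0, "fdr": 0.01},
    {"name": "f", "fold_change": -2.0, "fdr": 0.01},
    {"name": "g", "fold_change": 10.0, "fdr": 0.5},
]
# A small cap is used here only to make the proportional step visible.
print([f["name"] for f in select_for_ipa(example, max_features=3)])
```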
Identification of qPCR normalization genes. The following filtering and selection steps are applied to identify stably expressed genes that can be used for normalization in, for example, qPCR assays.
- Filter features:
  - Absolute fold change <1.03
  - FDR p-value >0.05
  - Minimum CPM (Counts per Million, TMM-adjusted) for each group >25
- Select the 20 features with smallest absolute fold change
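The filter and selection above can be expressed as a short Python sketch. The function name `select_normalization_genes` and the keys `fold_change`, `fdr`, and `min_cpm_per_group` are hypothetical; the sketch assumes the TMM-adjusted minimum CPM has already been computed for each group and requires every group to exceed the threshold.

```python
def select_normalization_genes(features, max_abs_fc=1.03, min_fdr=0.05,
                               min_cpm=25.0, n_select=20):
    """Return up to n_select stably expressed candidates (hypothetical helper)."""
    stable = [
        f for f in features
        if abs(f["fold_change"]) < max_abs_fc          # nearly unchanged
        and f["fdr"] > min_fdr                         # not significantly different
        and all(c > min_cpm for c in f["min_cpm_per_group"])  # expressed in every group
    ]
    # Smallest absolute fold change first.
    return sorted(stable, key=lambda f: abs(f["fold_change"]))[:n_select]

candidates = [
    {"name": "g1", "fold_change": 1.01, "fdr": 0.9, "min_cpm_per_group": [30, 40]},
    {"name": "g2", "fold_change": -1.02, "fdr": 0.5, "min_cpm_per_group": [100, 26]},
    {"name": "g3", "fold_change": 1.5, "fdr": 0.9, "min_cpm_per_group": [100, 100]},
    {"name": "g4", "fold_change": 1.0, "fdr": 0.01, "min_cpm_per_group": [100, 100]},
    {"name": "g5", "fold_change": 1.0, "fdr": 0.8, "min_cpm_per_group": [10, 100]},
]
print([f["name"] for f in select_normalization_genes(candidates)])
```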