RNA-Seq and Differential Gene Expression Analysis workflow

The RNA-Seq and Differential Gene Expression Analysis workflow includes the steps necessary to carry out RNA-Seq analyses on trimmed reads, followed by a differential expression analysis of the RNA-Seq data. This workflow also generates reports and visualizations.

The workflow calculates an expression profile for each sample individually and then conducts a differential expression analysis of all the samples, grouped based on information provided when launching the workflow. Running some parts of the workflow per sample and other parts on groups of samples is made possible by the Iterate and Collect and Distribute elements in the workflow design (see Workflow control flow elements).

If possible, samples with known results should be used to test and optimize the workflow settings to fit your specific application.

Some common adjustments of this workflow are provided at the end of this section.

Requirements for this workflow

To run this workflow, you will need:

In this section, we assume sequence lists containing your trimmed reads are in a single folder.

Launching the workflow

The RNA-Seq and Differential Gene Expression Analysis workflow is at:

        Toolbox | Template Workflows | Basic Workflow Designs | RNA-Seq and Differential Gene Expression Analysis

Launch the workflow and step through the wizard.

  1. Select the sequence lists containing your trimmed reads and click on Next.
  2. Select your reference data set or select "Use the default reference data" to configure the reference data elements individually in subsequent wizard steps.
  3. Choose the "Use metadata" option for defining the batch units, and then select the Excel or CSV format file containing information about your samples. The metadata column to specify for defining the batch units is one with a unique entry per sample, where each entry at least partially matches the names of the corresponding input sequence elements. This would commonly be an ID for the samples (figure 12.79).
  4. In the next step, you can review the batch units resulting from your selections above.
  5. If you did not select a reference data set in the earlier step, then in the following steps, you will be prompted to specify the reference data elements to use.
  6. The differential expression settings are then specified (figure 12.80).
  7. Finally, select a location to save the results to and click on Finish.
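
The way a metadata column defines batch units in step 3 can be illustrated with a small sketch. It is not the Workbench's implementation: the metadata values, the sequence list names, and the function assign_batch_units are all hypothetical, and plain substring matching is assumed here as a stand-in for "partial matching". The actual matching rules used by the software may differ.

```python
import csv
import io

# Hypothetical metadata file contents; column names and values are
# illustrative only and are not taken from the manual.
METADATA_CSV = """ID,Group,Sex
S01,Tumor,F
S02,Normal,F
S03,Tumor,M
"""

# Hypothetical names of the trimmed sequence lists selected in step 1.
SEQUENCE_LIST_NAMES = [
    "S01_R1 (trimmed)",
    "S02_R1 (trimmed)",
    "S03_R1 (trimmed)",
    "S03_R2 (trimmed)",
]


def assign_batch_units(metadata_csv, id_column, element_names):
    """Group element names by the metadata ID they contain.

    Assumption: an element belongs to a sample when the sample ID is a
    substring of the element name. Names matching zero or several IDs
    are reported as unmatched instead of being guessed.
    """
    rows = list(csv.DictReader(io.StringIO(metadata_csv)))
    units = {row[id_column]: [] for row in rows}
    unmatched = []
    for name in element_names:
        hits = [sample_id for sample_id in units if sample_id in name]
        if len(hits) == 1:
            units[hits[0]].append(name)
        else:
            unmatched.append(name)
    return units, unmatched


units, unmatched = assign_batch_units(METADATA_CSV, "ID", SEQUENCE_LIST_NAMES)
for sample_id, names in units.items():
    print(sample_id, names)
print("unmatched:", unmatched)
```

Running a check like this on your own sample IDs before launching the workflow can reveal IDs that are ambiguous (for example, one ID being a prefix of another) or inputs that match no metadata row. The batch units overview in step 4 remains the authoritative view of what the workflow will do.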

Image metadata_table_workflow
Figure 12.78: Metadata describing 10 samples from a tumor normal comparison experiment.

Tools in the workflow and outputs generated

The tools and outputs provided by this workflow are:

Most of the tools used by this workflow are located in the RNA-Seq and Small RNA Analysis toolbox described in RNA-Seq and Small RNA Analysis.

The RNA-Seq expression analysis is conducted at the gene level, and differential expression is hence reported at the gene level. The workflow can easily be modified to conduct transcript level expression analysis. To modify the workflow, right-click on it and select Open Copy of Workflow.

Image metadata_wizard
Figure 12.79: After selecting the metadata file, select the sample identifier column in the drop-down menu.

Image wizard-for-selecting-de
Figure 12.80: Specify, based on the metadata, how the differential expression analysis should be conducted. In this case, we chose to test for differential expression due to Group (Tumor/Normal) while controlling for Sex (F/M).
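
To make "testing for Group while controlling for Sex" more concrete, the sketch below builds the kind of design matrix commonly used when such a model is fitted: an intercept, an indicator column for the factor being tested, and an indicator column for each factor being controlled for. This is an illustration only. The sample table and the function design_matrix are hypothetical, and the sketch makes no claim about how the workflow's differential expression tool encodes its model internally.

```python
# Hypothetical sample metadata; values are illustrative only.
SAMPLES = [
    {"ID": "S01", "Group": "Tumor", "Sex": "F"},
    {"ID": "S02", "Group": "Normal", "Sex": "F"},
    {"ID": "S03", "Group": "Tumor", "Sex": "M"},
    {"ID": "S04", "Group": "Normal", "Sex": "M"},
]


def design_matrix(samples, test_factor, control_factors):
    """Return (column_names, rows) for an indicator-coded design matrix.

    Assumption: the alphabetically first level of each factor is used
    as the baseline, so it gets no column of its own.
    """
    factors = [test_factor] + list(control_factors)
    levels = {f: sorted({s[f] for s in samples}) for f in factors}

    columns = ["Intercept"]
    for f in factors:
        columns += [f"{f}={level}" for level in levels[f][1:]]

    rows = []
    for s in samples:
        row = [1]  # intercept term
        for f in factors:
            row += [1 if s[f] == level else 0 for level in levels[f][1:]]
        rows.append(row)
    return columns, rows


columns, rows = design_matrix(SAMPLES, "Group", ["Sex"])
print(columns)
for sample, row in zip(SAMPLES, rows):
    print(sample["ID"], row)
```

In a model of this form, the coefficient on the Group column describes the Tumor versus Normal difference after the Sex column has absorbed expression differences attributable to sex. This is why a factor to control for should be recorded in the metadata for every sample.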

Workflow outputs resulting from analyses of all the samples shown in figure 12.78, such as the PCA plot and the Venn diagram, are saved directly at the top level of the results folder. Sample-specific outputs are organized in relevant subfolders (figure 12.81). The exception is expression tracks: these are stored together for all samples in the folder Gene Expression Tracks.

Image output-folder-structure
Figure 12.81: Overview of the outputs produced and how the folders are structured.

Customizing the RNA-Seq and Differential Gene Expression Analysis workflow design

Template workflows can be easily edited to add or remove analysis steps, change parameter settings, and so on. See Template workflows for information about how to open a template workflow for editing.