Identify and Annotate Differentially Expressed Genes and Pathways

The Identify and Annotate Differentially Expressed Genes and Pathways workflow compares gene expression between groups of samples and performs a Gene Ontology (GO) enrichment analysis on the differentially expressed genes to identify affected pathways. The workflow takes as input Gene Expression (GE) or Transcript Expression (TE) tracks generated by the RNA-Seq analysis tool. The samples must be associated with a metadata table.

The workflow can be found in the Toolbox at:

        Toolbox | Template Workflows | Biomedical Workflows (Image biomedical_twf_folder_open_16_n_p) | Whole Transcriptome Sequencing (Image rna_seq_group_closed_16_n_p) | Identify and Annotate Differentially Expressed Genes and Pathways (Image identify_differentially_expressed_genes_wts_16_n_p)

  1. If you are connected to a server, you will first be asked where you would like to run the analysis.

  2. Next, you will be asked to select the samples to analyze (figure 22.32). You can select several GE tracks or TE tracks generated by the RNA-Seq analysis tool, but not a combination of both.

    Image rnaseq_identify_differentially_expressed_genes_step2
    Figure 22.32: Select the GE or TE tracks to analyze.

  3. Then select the reference data set that should be used for annotation (figure 22.33).

    Image rnaseq_identify_differentially_expressed_genes_step1
    Figure 22.33: Choose the relevant reference data set to annotate.

  4. In the Differential Expression for RNA-Seq dialog, you can set up the experimental design associated with the data (figure 22.34):

    Image rnaseq_identify_differentially_expressed_genes_step3
    Figure 22.34: Specify the experimental design desired for running the workflow.

  5. In the last wizard step, you can review the selected settings by clicking the button labeled Preview All Parameters. The Preview All Parameters wizard only displays the settings. To make changes, use the Previous button to return to the relevant window and edit the parameters there.

  6. Choose to Save your results and click on the button labeled Finish.

The following outputs are generated:

  1. PCA for RNA-Seq plot (Image pca_plot_16_n_p) Projects a high-dimensional dataset (where the number of dimensions equals the number of genes or transcripts) onto two or three dimensions.
  2. Statistical Comparison (Image stats_track_16_n_p) The information can be accessed in two different ways:
    • Open as a track, hold down the Shift key, and hover over a feature. A tooltip will appear with the gene name, the results of the statistical tests, and the expression values.
    • Open the track in table format by clicking on the table icon in the lower left side of the View Area.
  3. Track List Differentially Expressed Genes and Pathways (Image trackset_16_n_p) A collection of tracks presented together. Shows the human reference sequence, annotation tracks for genes, coding regions (CDS), and mRNA, and the statistical comparison tracks (see figure 22.35).
  4. Heat Map for RNA-Seq (Image heatmap_16_n_p) A two-dimensional heat map of expression values. Each column corresponds to one sample, and each row corresponds to a feature (a gene or a transcript). The samples and features are both hierarchically clustered.
  5. Venn Diagram (Image venn_16_n_p) Compares the overlap of differentially expressed genes or transcripts between two or more statistical comparison tracks.
  6. Expression Browser (Image expression_experiment_16_h_p) Used to inspect gene and transcript expression level counts and statistics for many samples at the same time.
  7. GO Enrichment Analysis (Image table) A table showing the results of the GO enrichment analysis. The table includes GO terms, a description of the affected function/pathway, the number of genes in each function/pathway, the number of affected genes within the function/pathway, and p-values.
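
The p-values in the GO Enrichment Analysis table can be illustrated with a small sketch. This section does not state which statistical test the workflow applies, so the example below assumes a one-sided hypergeometric test, a common choice for over-representation analysis. The function name, parameter names, and numbers are hypothetical and are not part of the Workbench.

```python
from math import comb


def hypergeom_enrichment_p(total_genes, term_genes, de_genes, de_in_term):
    """Probability of seeing at least `de_in_term` term-annotated genes
    when `de_genes` genes are drawn without replacement from
    `total_genes`, of which `term_genes` carry the GO term.

    Assumption: a one-sided hypergeometric test; the actual test used
    by the workflow is not specified in this section.
    """
    upper = min(term_genes, de_genes)
    # Number of ways to pick any de_genes genes from the background.
    denom = comb(total_genes, de_genes)
    # Sum the upper tail: k annotated genes and (de_genes - k) others.
    tail = sum(
        comb(term_genes, k) * comb(total_genes - term_genes, de_genes - k)
        for k in range(de_in_term, upper + 1)
    )
    return tail / denom


if __name__ == "__main__":
    # Hypothetical numbers: 20000 background genes, 150 in the GO term,
    # 400 differentially expressed genes, 12 of them in the term.
    p = hypergeom_enrichment_p(20000, 150, 400, 12)
    print(f"enrichment p-value: {p:.3g}")
```

The four arguments correspond loosely to the table columns described above: the number of genes in the function/pathway (`term_genes`) and the number of affected genes within it (`de_in_term`), together with the background and differentially expressed gene counts.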

Image rnaseq_identify_differentially_expressed_genes_genomebrowserview
Figure 22.35: The Track List allows the statistical comparison tracks to be compared with the reference sequence and the different annotation tracks.
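
The overlap reported by the Venn Diagram output can be thought of as set operations on the lists of differentially expressed genes from two statistical comparisons. The following is a minimal sketch of that idea only. The function name and the gene lists are made up for illustration and are not exported by the Workbench.

```python
def venn_regions(genes_a, genes_b):
    """Split two lists of differentially expressed gene names into the
    three regions of a two-set Venn diagram (hypothetical helper)."""
    a, b = set(genes_a), set(genes_b)
    return {
        "only_a": a - b,  # significant only in comparison A
        "both": a & b,    # significant in both comparisons
        "only_b": b - a,  # significant only in comparison B
    }


if __name__ == "__main__":
    # Hypothetical gene lists from two statistical comparisons.
    comparison_a = ["TP53", "MYC", "BRCA1"]
    comparison_b = ["MYC", "EGFR"]
    for region, genes in venn_regions(comparison_a, comparison_b).items():
        print(region, sorted(genes))
```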

Please refer to the relevant sections of the RNA-Seq Analysis chapter of the manual (http://resources.qiagenbioinformatics.com/manuals/clcgenomicsworkbench/current/index.php?manual=RNA_Seq_Analysis.html) for additional information on the outputs mentioned above.
