RNA-Seq Analysis
The following describes the overall process of the RNA-Seq analysis when using an annotated eukaryotic genome (see Specifying reads, reference genome and mapping settings for more information on other types of reference data).
The RNA-Seq analysis is done in several steps. First, all annotated transcripts are extracted (using an mRNA track). If a gene has several annotated splice variants, they are all extracted.
An example is shown in figure 31.2.
Figure 31.2: A simple gene with three exons and two splice variants.
The transcripts for this gene are extracted as shown in figure 31.3.
Figure 31.3: All the exon-exon junctions are joined in the extracted transcript.
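As a rough illustration of this extraction step, the sketch below joins exon sequences into one transcript per splice variant. It is not the Workbench's implementation. The function name, the toy genome, the exon coordinates and the choice of which exons each variant contains are all assumptions made for the example, and only forward-strand genes are handled.

```python
# Minimal sketch of transcript extraction (hypothetical names and data).
# Exon coordinates are 0-based and end-exclusive, forward strand only.

def extract_transcript(genome: str, exons: list[tuple[int, int]]) -> str:
    """Concatenate exon sequences in genomic order, joining exon-exon junctions."""
    return "".join(genome[start:end] for start, end in sorted(exons))

# Toy genome: three 4-base exons separated by 4-base introns.
genome = "AAAACCCCGGGGTTTTACGT"
exon1, exon2, exon3 = (0, 4), (8, 12), (16, 20)

# Assumed splice variants: one uses all three exons, one skips the middle exon.
variants = {
    "variant_1": [exon1, exon2, exon3],
    "variant_2": [exon1, exon3],
}

# Every annotated variant is extracted, not just the longest one.
transcripts = {
    name: extract_transcript(genome, exons) for name, exons in variants.items()
}

for name, sequence in transcripts.items():
    print(name, sequence)
```

Because the introns are removed, a read spanning an exon-exon junction can align to the extracted transcript without a gap.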
Next, the reads are mapped against all the extracted transcripts and against the whole genome. For more information about the read mapper, see Map Reads to Reference.
From this mapping, the reads are categorized and assigned to the transcripts using the EM estimation algorithm, and expression values for each gene are obtained by summing the transcript counts belonging to the gene.
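To make the idea of EM-based assignment concrete, here is a generic textbook-style sketch. It is not the algorithm as implemented in the Workbench, which is described in the subsection on the EM estimation algorithm. The function `em_assign`, the transcript and gene names and the read counts are hypothetical, and the sketch ignores transcript length and mapping quality for brevity.

```python
# Generic EM sketch for distributing ambiguous reads (hypothetical names and data).
# Each read is represented by the set of transcripts it maps to equally well.

def em_assign(read_hits, transcripts, iterations=50):
    """Return expected read counts per transcript after a fixed number of EM rounds."""
    # Start from uniform relative abundances.
    abundance = {t: 1.0 / len(transcripts) for t in transcripts}
    counts = {t: 0.0 for t in transcripts}
    for _ in range(iterations):
        # E-step: split each read among its candidate transcripts
        # in proportion to the current abundance estimates.
        counts = {t: 0.0 for t in transcripts}
        for hits in read_hits:
            total = sum(abundance[t] for t in hits)
            if total == 0:
                continue
            for t in hits:
                counts[t] += abundance[t] / total
        # M-step: re-estimate abundances from the expected counts.
        assigned = sum(counts.values())
        if assigned == 0:
            break
        abundance = {t: c / assigned for t, c in counts.items()}
    return counts

# Toy data: 6 reads unique to T1, 2 unique to T2, 4 compatible with both.
read_hits = [{"T1"}] * 6 + [{"T2"}] * 2 + [{"T1", "T2"}] * 4
transcript_counts = em_assign(read_hits, ["T1", "T2"])

# Gene-level expression is the sum of the counts of the gene's transcripts.
gene_of = {"T1": "GeneA", "T2": "GeneA"}
gene_counts = {}
for transcript, count in transcript_counts.items():
    gene = gene_of[transcript]
    gene_counts[gene] = gene_counts.get(gene, 0.0) + count

print(transcript_counts, gene_counts)
```

In this toy case the ambiguous reads are split roughly 3:1 in favour of T1, mirroring the ratio of uniquely mapped reads, and the gene total equals the number of reads that mapped to either transcript.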
Subsections
- Reads and reference settings
- Mapping settings
- The EM estimation algorithm
- Expression settings
- Output settings
- RNA-Seq result handling
- Expression tracks
- RNA-Seq reads track
- RNA-Seq report
- Gene fusion reporting