Expression settings

Expression settings are defined in the dialog shown in figure 33.51.

Figure 33.51: Set strand setting and define how expression values should be calculated.

These parameters determine the way expression values are counted.

Definition of RPKM

RPKM (Reads Per Kilobase of exon model per Million mapped reads) is defined as follows [Mortazavi et al., 2008]:

$\displaystyle \emph{RPKM} = \frac{\emph{total exon reads}}{\emph{mapped reads (millions)} \times \emph{exon length (kb)}}.$

For prokaryotic genes and other regions that are not exon-based, the calculation is performed as follows:

$\displaystyle \emph{RPKM} = \frac{\emph{total gene reads}}{\emph{mapped reads (millions)} \times \emph{gene length (kb)}}.$

Total exon reads.
This value can be found in the column with the header Total exon reads in the expression track. It is the number of reads that have been mapped to exons (either within an exon or at an exon junction). When the reference genome is annotated with gene and transcript annotations, the mRNA track defines the exons, and the total exon reads are the reads mapped to all transcripts for that gene. When only genes are used, each gene in the gene track is considered an exon. When an un-annotated sequence list is used, each sequence is considered an exon.
Exon length.
This is the number in the column with the header Exon length in the expression track, divided by 1000. It is calculated as the sum of the lengths of all exons (see the definition of exon above). Each exon is included only once in this sum, even if it is present in several annotated transcripts for the gene. Partly overlapping exons each count with their full length, even though they share part of the same region.
Mapped reads.
The sum of all mapped reads as listed in the RNA-Seq analysis report. In the formula, this value is expressed in millions, that is, divided by 1,000,000. For more information on how expression is calculated in this case, see above.
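
The three definitions above can be combined in a small worked example. The following is a minimal Python sketch, not the tool's own implementation. The function names, the representation of exons as (start, end) pairs with 1-based inclusive coordinates, and the assumption that two exons are "the same" when their coordinates are identical are all choices made for this illustration. The gene, its transcripts, and the read counts are made up.

```python
def exon_length(transcripts):
    """Sum exon lengths over all transcripts of one gene.

    transcripts: iterable of transcripts, each a list of (start, end) exon
    intervals. Coordinates are assumed to be 1-based and inclusive.
    An exon shared by several transcripts is counted once; exons that only
    partly overlap are distinct and each contributes its full length.
    """
    unique_exons = {exon for transcript in transcripts for exon in transcript}
    return sum(end - start + 1 for start, end in unique_exons)


def rpkm(total_exon_reads, mapped_reads, exon_length_bp):
    """RPKM = total exon reads / (mapped reads in millions * exon length in kb)."""
    if mapped_reads <= 0 or exon_length_bp <= 0:
        raise ValueError("mapped_reads and exon_length_bp must be positive")
    return total_exon_reads / ((mapped_reads / 1e6) * (exon_length_bp / 1e3))


# Hypothetical gene with two annotated transcripts.
transcripts = [
    [(100, 199), (300, 399)],  # two exons of 100 bp each
    [(100, 199), (350, 449)],  # first exon shared; second partly overlaps (300, 399)
]

length_bp = exon_length(transcripts)  # 100 + 100 + 100 = 300
value = rpkm(total_exon_reads=600, mapped_reads=2_000_000, exon_length_bp=length_bp)
print(length_bp, value)
```

Here the shared exon (100, 199) is counted once, while the partly overlapping exons (300, 399) and (350, 449) each contribute their full 100 bp, giving an exon length of 300 bp, or 0.3 kb. With 600 exon reads and 2 million mapped reads, the RPKM comes out at about 1000. For a region that is not exon-based, the same `rpkm` function applies with the gene read count and gene length passed in instead.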