Detect and Refine Fusion Genes

Detect and Refine Fusion Genes identifies candidate fusion genes using information from unaligned ends of reads in a mapping generated by RNA-Seq Analysis. Candidate fusion genes are then evaluated by mapping a full set of reads back to a reference set containing the original gene sequences and the candidate fusion gene sequences. Statistical scores, reports and other outputs are generated, allowing detailed consideration of the evidence for fusion genes in the data set.

Identifying fusion genes is a two-step process.

  1. The detect step: Potential fusions are identified by re-mapping the unaligned ends of reads in the mapping to the reference. A read is consistent with a fusion event when its unaligned end lies close to an exon boundary and can be remapped close to another exon boundary. Reads with unaligned ends that map far from an exon boundary can also be considered by enabling the option "Detect fusions with novel exon boundaries".
  2. The refine step: The evidence for each detected fusion is evaluated. Sequences representing potential fusion genes are created (figure 31.54), and all reads are mapped in an RNA-Seq mapping against the original reference sequences plus the potential fusion gene sequences. The number of reads supporting each fusion gene is counted, as is the number of reads supporting the genes from the original RNA-Seq Analysis. Z-scores and p-values for the fusion genes are then calculated using a binomial test.
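
The scoring in the refine step can be illustrated with a small sketch. This is not the tool's actual implementation: the function names, the null proportion `p0`, and the choice of a one-sided exact test with a normal-approximation Z-score are all assumptions made for illustration. It only shows how a binomial test could turn the two read counts (reads supporting the fusion and reads supporting the original genes) into a Z-score and a p-value.

```python
import math


def binomial_upper_tail(k, n, p0):
    """Exact P(X >= k) for X ~ Binomial(n, p0)."""
    return sum(
        math.comb(n, i) * p0**i * (1 - p0) ** (n - i)
        for i in range(k, n + 1)
    )


def fusion_evidence(fusion_reads, wildtype_reads, p0=0.01):
    """Return (z_score, p_value) for one candidate fusion.

    fusion_reads:   reads supporting the fusion gene sequence
    wildtype_reads: reads supporting the original gene sequences
    p0:             ASSUMED proportion of fusion-supporting reads expected
                    when there is no real fusion (hypothetical parameter,
                    not a documented setting of the tool); must be in (0, 1)
    """
    n = fusion_reads + wildtype_reads
    if n == 0:
        # No reads at the junction: no evidence either way.
        return 0.0, 1.0
    expected = n * p0
    std_dev = math.sqrt(n * p0 * (1 - p0))
    # Z-score from the normal approximation to the binomial distribution.
    z_score = (fusion_reads - expected) / std_dev
    # One-sided p-value: chance of at least this many fusion reads under p0.
    p_value = binomial_upper_tail(fusion_reads, n, p0)
    return z_score, p_value


if __name__ == "__main__":
    z, p = fusion_evidence(fusion_reads=20, wildtype_reads=80)
    print(f"Z-score: {z:.2f}, p-value: {p:.3g}")
```

Under these assumptions, a candidate with many fusion-supporting reads relative to the reads supporting the original genes gets a large Z-score and a small p-value, while a candidate with no fusion-supporting reads gets a p-value of 1.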

See RNA-Seq Analysis for information about RNA-Seq mappings.

Figure 31.54: An artificial chromosome is created consisting of the vicinity of both ends of the fusion.

The Detect and Refine Fusion Genes tool can be found in the Toolbox at:

        Toolbox | RNA-Seq and Small RNA Analysis | RNA-Seq Tools | Detect and Refine Fusion Genes

The tool takes as input one or more sequence lists. The following references are required (figure 31.55):

Optionally, CDS and Primer tracks can be provided to obtain information about CDS and primers for the identified fusion genes. See Output from the Detect and Refine Fusion Genes tool.

Figure 31.55: Reference tracks for Detect and Refine Fusion Genes.

The following options can be adjusted (figures 31.56 and 31.57):

Figure 31.56: Default options for detecting fusion genes.

Figure 31.57: Default options for refining fusion genes.