Detect and Refine Fusion Genes

Detect and Refine Fusion Genes finds fusion genes in a two-step process. The detect step identifies potential fusions, and the refine step accumulates and evaluates the evidence for each fusion. Briefly, the detect step re-maps the unaligned ends of reads and determines whether they are consistent with a fusion. A read supports a fusion when it has an unaligned end close to an exon boundary and that unaligned end can be remapped close to another exon boundary. If the option Detect fusions with novel exon boundaries is enabled, the tool makes a second pass that also considers reads that are far from an exon boundary and/or whose unaligned ends map far from an exon boundary.
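
The exon boundary criterion of the detect step can be illustrated with a minimal Python sketch. This is not the tool's implementation: the function names, the max_distance threshold and the coordinate representation are all assumptions made for illustration, and strand, mapping quality and read counts are ignored.

```python
from bisect import bisect_left


def distance_to_nearest_boundary(position, sorted_boundaries):
    """Distance from a position to the closest exon boundary.

    sorted_boundaries must be a sorted, non-empty list of coordinates.
    """
    if not sorted_boundaries:
        raise ValueError("at least one exon boundary is required")
    i = bisect_left(sorted_boundaries, position)
    # Only the neighbours on either side of the insertion point can be closest.
    neighbours = sorted_boundaries[max(i - 1, 0):i + 1]
    return min(abs(position - b) for b in neighbours)


def is_candidate_fusion_read(aligned_break, remapped_break,
                             boundaries_a, boundaries_b,
                             max_distance=10, allow_novel_boundaries=False):
    """Decide whether a read with a remapped unaligned end suggests a fusion.

    aligned_break:  where the aligned part of the read ends (gene A).
    remapped_break: where the unaligned end was remapped (gene B).
    max_distance:   hypothetical threshold for "close to an exon boundary".
    """
    near_a = distance_to_nearest_boundary(aligned_break, boundaries_a) <= max_distance
    near_b = distance_to_nearest_boundary(remapped_break, boundaries_b) <= max_distance
    if near_a and near_b:
        return True  # first pass: both ends lie close to exon boundaries
    # Second pass (assumed behaviour): accept reads far from exon boundaries
    # only when novel exon boundaries are allowed.
    return allow_novel_boundaries
```

For example, with exon boundaries [100, 500, 900] in one gene and [2000, 2400] in another, a read whose aligned part ends at 503 and whose unaligned end remaps at 2002 passes the first pass, whereas a read ending at 650 is only considered when novel exon boundaries are allowed.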

The refine step takes the fusions identified in the detect step and re-counts the number of fusion crossing reads as well as the wildtype supporting reads, using an RNA-Seq mapping against the wildtype and fusion references. The fusion reference is an artificial reference sequence that "assumes" the detected fusions: in addition to the original chromosomes, it contains a new chromosome for each fusion (figure 31.54).

Image refinefusion
Figure 31.54: An artificial chromosome is created consisting of the vicinity of both ends of the fusion.
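
The idea of an artificial chromosome can be sketched as follows. This is a simplified illustration only: the function name, the flank parameter and its default are hypothetical, and the sketch assumes both genes lie on the forward strand with the 5' partner on the left.

```python
def build_fusion_reference(seq_a, break_a, seq_b, break_b, flank=1000):
    """Join the vicinity of both fusion breakpoints into one sequence.

    seq_a, seq_b:     chromosome sequences of the 5' and 3' partner.
    break_a, break_b: 0-based breakpoint positions (assumed convention:
                      seq_a is kept up to break_a, seq_b from break_b on).
    flank:            hypothetical number of bases kept on each side.

    Returns the artificial sequence and the offset of the junction in it.
    """
    left = seq_a[max(break_a - flank, 0):break_a]
    right = seq_b[break_b:break_b + flank]
    return left + right, len(left)


# Toy example with very short sequences and a flank of 3 bases.
fusion_seq, junction = build_fusion_reference("AAAACCCC", 4, "GGGGTTTT", 4, flank=3)
print(fusion_seq, junction)
```

A read spanning the junction offset in such a sequence would count as a fusion crossing read in this simplified picture.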

All reads are remapped to the artificial reference, with the expectation that reads that were used to detect the fusion will now map to the fusion transcript as spliced reads. In addition, some reads that did not originally map at all will now map to the artificial reference sequence, increasing the evidence for the fusion event. The tool then calculates the Z-score and p-value using a binomial test.
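
The exact null hypothesis used by the tool is not described here. The following sketch shows one way a binomial test on the re-counted reads could look, assuming that under the null hypothesis each read covering the breakpoint is a fusion crossing read with a small probability null_rate. The function name, the null_rate value and the normal-approximation Z-score are assumptions for illustration, not the tool's documented method.

```python
from math import comb, sqrt


def binomial_fusion_test(fusion_reads, wildtype_reads, null_rate=0.01):
    """One-sided binomial test for an excess of fusion crossing reads.

    null_rate is a hypothetical probability that a read crosses the fusion
    by chance (for example through mapping errors); it must lie in (0, 1).
    Returns (z_score, p_value).
    """
    n = fusion_reads + wildtype_reads
    if n == 0:
        raise ValueError("no reads to test")
    if not 0.0 < null_rate < 1.0:
        raise ValueError("null_rate must be between 0 and 1")
    # Exact upper tail: P(X >= fusion_reads) for X ~ Binomial(n, null_rate).
    p_value = sum(
        comb(n, k) * null_rate ** k * (1.0 - null_rate) ** (n - k)
        for k in range(fusion_reads, n + 1)
    )
    p_value = min(p_value, 1.0)  # guard against floating point round-off
    # Z-score from the normal approximation to the binomial distribution.
    z_score = (fusion_reads - n * null_rate) / sqrt(n * null_rate * (1.0 - null_rate))
    return z_score, p_value
```

In this sketch, a fusion with many fusion crossing reads relative to wildtype supporting reads yields a large Z-score and a small p-value, while a fusion with no fusion crossing reads yields a p-value of 1.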

The Detect and Refine Fusion Genes tool can be found in the Toolbox at:

        Toolbox | RNA-Seq and Small RNA Analysis (Image rna_seq_group_closed_16_n_p) | RNA-Seq Tools (Image rna_expert_folder_closed_16_n_p) | Detect and Refine Fusion Genes (Image find_fusions_16_n_p)

The Detect and Refine Fusion Genes tool takes a sequence list (Image seq_list_nucleotide) as input (figure 31.55).

Image detect_and_refine_1
Figure 31.55: Select sequences.

In the next dialog (figure 31.56), specify the RNA-Seq reads track, as well as the reference sequence, gene and mRNA tracks from the CLC_References folder of the Navigation Area. A CDS track or primer track can optionally be added to the analysis.

Image detect_and_refine_2
Figure 31.56: Specify reads track and references.

In the next dialog (figure 31.57), configure parameters for detecting fusion genes:

Image detect_and_refine_parameters_2
Figure 31.57: Default parameters for detecting fusion genes.

In the next dialog (figure 31.58), configure parameters for refining the fusion genes:

Image detect_and_refine_3
Figure 31.58: Default parameters for refining fusion genes.

The remaining parameters apply to the RNA-Seq read mapping to the artificial references (see Mapping settings for details).
