Interpretation of fusion results
The Detect and Refine Fusion Genes tool produces multiple outputs (see 31.4.2). Of these, the Report, Fusion Genes (fusions), and Reads (fusions) provide the easiest way to review the results. Using a Track List containing Fusion Genes (fusions) and Reads (fusions), we can inspect how the reads map near the fusion breakpoints, as shown in figures 31.60 and 31.61. Key aspects to verify when inspecting fusion calls are:
- Reads map uniquely across the fusion breakpoints
In figure 31.60, many reads map to the artificial fusion chromosome. Reads that do not map uniquely (yellow) are not counted as fusion crossing reads in the fusion statistics, but a large fraction of yellow reads may still indicate that the fusion call is less reliable. For example, it may reflect a high degree of homology between genes targeted by the panel.
Figure 31.60: An example of reads supporting a possible false positive fusion.
- There is no sign of incomplete poly-A trimming
Another cause of false positives is incomplete poly-A trimming. In these cases, one side of the fusion has normal complexity, but the other side is A-rich. Figure 31.61 shows a clear example of a false positive caused by incomplete poly-A trimming.
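The A-richness check described above can also be approximated outside the viewer. The following is a minimal sketch, not the tool's own logic: it assumes the sequence on each side of the breakpoint has been exported as a plain string, and the function names and the 0.8 threshold are hypothetical choices for illustration.

```python
def a_fraction(seq: str) -> float:
    """Return the fraction of bases in seq that are A (case-insensitive)."""
    seq = seq.upper()
    return seq.count("A") / len(seq) if seq else 0.0


def looks_like_poly_a_artifact(left: str, right: str, threshold: float = 0.8) -> bool:
    """Return True if exactly one side of the breakpoint is A-rich.

    threshold is a hypothetical cutoff chosen for this sketch; it is not a
    parameter of the Detect and Refine Fusion Genes tool.
    """
    left_rich = a_fraction(left) >= threshold
    right_rich = a_fraction(right) >= threshold
    return left_rich != right_rich


# Made-up flanking sequences: one of normal complexity, one A-rich.
normal = "GCTAGGCTTACCGATGCAGT"
a_rich = "AAAAAAAAGAAAAAAAAAAA"

print(looks_like_poly_a_artifact(normal, a_rich))  # True
print(looks_like_poly_a_artifact(normal, normal))  # False
```

A check like this only looks at A content on the given strand, so it is a rough screen for candidates to inspect visually rather than a replacement for reviewing the reads.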
Figure 31.61: A false positive fusion CCND2-SLC13A4 caused by incomplete poly-A trimming.
- The fusion is not a known false positive fusion
Fusions where one of the gene partners is a mitochondrial or HLA gene can, for the most part, be disregarded, as they are found regularly in normal RNA-Seq data.
The following fusions can additionally be disregarded as common read-through mRNAs or false fusions due to gene homology.
- HACL1-COLQ, common read-through
- TPM3-TPM4 (TPM4-TPM3), homologous genes
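When reviewing many fusion calls, the rules above can be applied as a simple screen. The sketch below is illustrative only: it assumes gene names follow HGNC symbols (mitochondrial genes prefixed with "MT-", HLA genes with "HLA-"), uses the HGNC symbols HACL1 and TPM3/TPM4 for the pairs listed above, and the function and variable names are hypothetical.

```python
# Ordered (5' partner, 3' partner) pairs taken from the list above.
KNOWN_FALSE_POSITIVE_PAIRS = {
    ("HACL1", "COLQ"),  # common read-through
    ("TPM3", "TPM4"),   # homologous genes
    ("TPM4", "TPM3"),   # homologous genes, reverse orientation
}

# Assumption: HGNC-style symbols, e.g. MT-CO1 or HLA-A.
DISREGARD_PREFIXES = ("MT-", "HLA-")


def is_likely_known_false_positive(gene_5p: str, gene_3p: str) -> bool:
    """Return True if the fusion matches one of the known false positive patterns."""
    partners = (gene_5p.upper(), gene_3p.upper())
    if any(gene.startswith(DISREGARD_PREFIXES) for gene in partners):
        return True
    return partners in KNOWN_FALSE_POSITIVE_PAIRS


print(is_likely_known_false_positive("HACL1", "COLQ"))   # True
print(is_likely_known_false_positive("MT-CO1", "EGFR"))  # True
print(is_likely_known_false_positive("PML", "RARA"))     # False
```

A match only marks a call as one that can usually be disregarded; it does not remove the need to inspect unusual cases.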
- Fusions involving insertion of intronic sequence are well supported
Fusions that include the insertion of intronic sequence can be detected. In the fusion plot, the intronic sequence appears as a "novel exon" indicated with a gray box. If one of the fusion partners is a novel exon, and there is otherwise no support for the fusion, the fusion should be treated with caution. Such fusions are not filtered away by default, as missing support may be a consequence of the primer design.
For example, figure 31.62 shows a fusion where the novel exon has no support except for the fusion crossing reads. Additionally, although there are 431 such reads, they do not fuse into an annotated exon boundary, but instead into the middle of an exon. This fusion is likely to be a false positive.
Figure 31.62: Fusion plot for a likely false positive fusion where the insertion of intronic sequence is only supported by fusion crossing reads that are not at an exon boundary.
Figure 31.63 shows a fusion where the novel exon is supported both by 48 fusion crossing reads spliced at an annotated exon boundary, and by 3 reads that independently show splicing from the novel exon into an annotated exon. This is a true positive fusion.
Figure 31.63: Fusion plot for a true positive fusion of PML-RARa that includes the insertion of intronic sequence. The fusion is supported by 48 fusion crossing reads at an exon boundary and 3 reads from the intronic sequence into an annotated exon.
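The reasoning applied to the two novel exon examples above can be summarized as a small rule. This is a hedged sketch of that reasoning only, not something the tool computes: the class, field, and function names are hypothetical, and the values would have to be read off the fusion plot and read tracks by hand.

```python
from dataclasses import dataclass


@dataclass
class NovelExonFusion:
    """Evidence for a fusion where one partner is a novel exon (hypothetical record)."""
    fusion_crossing_reads: int
    at_annotated_exon_boundary: bool
    # Reads splicing from the novel exon into an annotated exon,
    # independent of the fusion crossing reads.
    independent_splice_reads: int


def needs_caution(fusion: NovelExonFusion) -> bool:
    """Return True unless the novel exon has support beyond the fusion crossing reads.

    A high number of fusion crossing reads alone is deliberately not
    treated as sufficient, mirroring the first example above.
    """
    well_supported = (
        fusion.at_annotated_exon_boundary and fusion.independent_splice_reads > 0
    )
    return not well_supported


likely_false_positive = NovelExonFusion(431, False, 0)
true_positive = NovelExonFusion(48, True, 3)

print(needs_caution(likely_false_positive))  # True
print(needs_caution(true_positive))          # False
```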
Note about false negative fusions
The parameters for statistical significance are deliberately conservative. The default Assumed error rate is 0.001, meaning that at least 1 in 1000 reads covering the breakpoints should support the fusion. The default Maximum p-value is 0.005. To detect very low frequency fusions, these parameters should be adjusted: to make the filtering less conservative, either increase the Maximum p-value or decrease the Assumed error rate.
Note that while fusions that do not meet the statistical significance threshold will not be shown in the report, they can still be found in the Fusion Genes (fusions) track, where they will have the filter annotation "High p-value".
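To see why both adjustments make the filtering less conservative, consider the sketch below. It assumes, purely for illustration, that the p-value is a one-sided binomial tail probability of seeing at least the observed number of fusion crossing reads by chance; the tool's exact calculation is not described in this section. The function names and the "PASS" label are hypothetical, while the defaults and the "High p-value" annotation are taken from the text above.

```python
import math


def binomial_tail(k: int, n: int, error_rate: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, error_rate), with 0 < error_rate < 1.

    Terms are computed in log space so large read counts do not overflow.
    """
    if k <= 0:
        return 1.0
    log_p = math.log(error_rate)
    log_q = math.log1p(-error_rate)
    total = 0.0
    for i in range(k, n + 1):
        log_pmf = (
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * log_p + (n - i) * log_q
        )
        total += math.exp(log_pmf)
    return min(total, 1.0)


def filter_annotation(fusion_reads: int, total_reads: int,
                      error_rate: float = 0.001,
                      max_p_value: float = 0.005) -> str:
    """Return "High p-value" if the call fails the threshold, else "PASS" (hypothetical label)."""
    p_value = binomial_tail(fusion_reads, total_reads, error_rate)
    return "High p-value" if p_value > max_p_value else "PASS"


# A low frequency fusion: 3 supporting reads out of 1000 covering the breakpoints.
print(filter_annotation(3, 1000))                     # High p-value
print(filter_annotation(3, 1000, error_rate=0.0001))  # PASS
print(filter_annotation(3, 1000, max_p_value=0.1))    # PASS
```

Under this assumed model, the same evidence fails with the default settings but passes once either the assumed error rate is lowered or the maximum p-value is raised.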