Output from Detect Fusion Genes from DNA
The tool produces the following outputs:
- Fusion genes. An annotation track with the fusion breakpoints. See Fusion tracks below for more details.
- Unaligned ends. A reads track with the unaligned ends. This can be useful for choosing optimal values for 'Minimum number of supporting reads' and for visualizing the detected fusions.
- Fusion genes report. A report with a summary of the fusions. See Detect Fusion Genes from DNA report below for more details.
Fusion tracks
The fusion annotation track outputs contain the fusion breakpoints.
The tool performs limited filtering of the detected fusions. See Filtering detected fusions for suggestions on how to perform additional filtering.
Fusions are named in the format 5' gene-3' gene. The table view shows one breakpoint per row. In addition to standard information, it contains the following:
- IPA gene view. A link to QIAGEN Ingenuity Pathway Analysis with additional information about the fusion, if available.
- Fusion number. A unique identifier for the breakpoints associated with the same gene pair. Fusions with the strongest statistical support (low p-value, high z-score) are assigned the lowest fusion number.
- Fusion pair. For each fusion number, a unique identifier for the corresponding breakpoint pairs.
- 5'/3' gene. The name of the 5'/3' gene.
- Breakpoint type. The position of the breakpoint: 5' if located in the 5' gene, or 3' if located in the 3' gene.
- Filter. If the breakpoint pair is assigned a PASS status, it is marked as such; otherwise, the reasons for not passing are provided. See Filtered fusions below for more details.
- P-value. The probability that the breakpoint pair occurred by chance, in the absence of an actual fusion.
- Calculated using a binomial test from the number of reads supporting the breakpoint pair and the total 5' and 3' read coverage.
- Both fusion crossing reads and fusion spanning reads are included.
- Z-score. Calculated from the p-value using the inverse distribution function for a standard Gaussian distribution.
- Fusion crossing reads and Fusion spanning reads. The number of reads supporting the breakpoint pair. Specifically:
- Fusion crossing reads are reads that map across the breakpoint.
- Fusion spanning reads are paired reads that map to both sides of the breakpoint but without crossing it. If paired reads span multiple breakpoints, the counts are distributed proportionally to the number of fusion crossing reads.
- Fusion supporting reads are the total number of fusion crossing and fusion spanning reads.
- 5'/3' read coverage and 5'/3' spanning read coverage. The number of reads mapping in the vicinity of the breakpoint pair. Specifically:
- Read coverage consists of reads that map on the reference genome across the breakpoint, in addition to the fusion crossing reads, as described above.
- Spanning read coverage consists of paired reads that map on the reference genome on both sides of the breakpoint but without crossing it, in addition to the fusion spanning reads, as described above. If paired reads span multiple breakpoints, the counts are distributed proportionally to the read coverage.
- Frequency. The percentage of reads supporting the breakpoint pair.
- Calculated from the number of reads supporting the breakpoint pair and the total 5' and 3' read coverage.
- Both fusion crossing reads and fusion spanning reads are included.
- 5'/3' frequency. The percentage of reads supporting the 5'/3' breakpoint.
- Calculated from the number of reads supporting the 5'/3' breakpoint and the 5'/3' read coverage.
- Both fusion crossing reads and fusion spanning reads are included.
- Translocation name. The fusion description in COSMIC format. The transcript with the highest priority (or the first in the list) is used.
This is present only when an mRNA track was provided as input.
- Found in-frame CDS. Indicates whether the breakpoints maintain the coding frame of the gene's transcript:
- No CDS. None of the 5' or 3' breakpoints overlap the transcript's CDS.
- In-frame. At least one breakpoint overlaps the transcript's CDS, and the resulting fusion CDS is in-frame.
- Out-of-frame. At least one breakpoint overlaps the transcript's CDS, but the resulting fusion CDS is out-of-frame.
This is present only when mRNA and CDS tracks are provided as input.
- Reversal. Indicates if the fusion reads change strand orientation, i.e., map to the same strand as the 5' gene but the opposite strand of the 3' gene, or vice versa (figure 31.58).
- Breakpoint position. The location of the breakpoint relative to the gene:
- Intergenic: outside the gene.
- Intragenic: within the gene. If an mRNA track was provided as input:
- Exonic: within an exon of the gene's transcript.
- Intronic: within an intron of the gene's transcript.
If multiple transcripts of the gene overlap the breakpoint, the transcript with the highest priority (or the first in the list) is used.
- Gene distance. The distance from the breakpoint to the nearest boundary of the gene. If the breakpoint is intragenic, the value is zero.
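The relationship between the P-value and Z-score columns described above can be illustrated with a minimal Python sketch. It is not the tool's implementation. The manual states only that a binomial test is applied to the supporting reads and the total 5' and 3' read coverage, so the one-sided form of the test, the null probability, the use of whole-number read counts, and the sign convention of the z-score (chosen so that a low p-value gives a high z-score) are all assumptions. The function and parameter names are hypothetical.

```python
from math import comb, inf
from statistics import NormalDist


def binomial_tail_p_value(supporting_reads: int, total_coverage: int,
                          null_probability: float) -> float:
    """One-sided binomial test: P(X >= supporting_reads) for
    X ~ Binomial(total_coverage, null_probability).

    ASSUMPTION: the null probability and the one-sided form are
    illustrative; the manual does not specify them.
    """
    if not 0 <= supporting_reads <= total_coverage:
        raise ValueError("supporting_reads must lie between 0 and total_coverage")
    if not 0.0 <= null_probability <= 1.0:
        raise ValueError("null_probability must lie between 0 and 1")
    tail = sum(
        comb(total_coverage, k)
        * null_probability ** k
        * (1.0 - null_probability) ** (total_coverage - k)
        for k in range(supporting_reads, total_coverage + 1)
    )
    # Guard against floating-point sums that land marginally above 1.
    return min(1.0, tail)


def z_score_from_p_value(p_value: float) -> float:
    """Convert a p-value to a z-score with the inverse CDF of a
    standard Gaussian.

    ASSUMPTION: the sign is chosen so that smaller p-values give
    larger z-scores, i.e. z = -inv_cdf(p).
    """
    if p_value <= 0.0:
        return inf
    if p_value >= 1.0:
        return -inf
    return -NormalDist().inv_cdf(p_value)


# Hypothetical numbers: 12 supporting reads, a combined coverage of 200
# reads, and an assumed 1% chance of a supporting read under the null.
p = binomial_tail_p_value(12, 200, 0.01)
z = z_score_from_p_value(p)
print(f"p-value = {p:.3g}, z-score = {z:.2f}")
```

Under these assumptions, breakpoint pairs with stronger support receive smaller p-values and larger z-scores, which matches the ordering used when assigning fusion numbers.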
Filtered fusions
If 'Include all fusions in the track output' was checked, the fusion track contains all detected fusions. The 'Filter' column contains the reasons why breakpoint pairs were not assigned a PASS status:
- Excluded by fusion filter (<table name>), Excluded by fusion filter (names), or Not included by fusion filter. The fusion was:
- Excluded based on a table named '<table name>' provided in 'Fusions for filtering (tables)'.
- Excluded based on 'Fusions for filtering (names)'.
- Not included based on either a table provided in 'Fusions for filtering (tables)' or 'Fusions for filtering (names)'.
- No fusion crossing reads. Due to the absence of fusion crossing reads, the precise breakpoint location could not be identified. However, the fusion was supported by paired reads that mapped as broken pairs. For such fusions, the breakpoint region and 5'/3' gene are ill-defined:
- The breakpoint 'Region' encompasses the entire region of the 5'/3' gene.
- The 5' gene is the one that the first read in the pair maps to, if the read is mapped in the direction of the gene. Otherwise, the 5' gene is the one that the second read maps to.
- Too few supporting reads. The fusion did not meet the 'Minimum number of supporting reads' threshold.
- Both breakpoints overlap gene or Both breakpoints overlap both genes. The 5' and 3' genes overlap and:
- One breakpoint was located within the overlapping region, or
- Both breakpoints were located within the overlapping region.
These events are most likely false-positive fusions.
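When all fusions are included in the track output, the 'Filter' column can be used to separate passing breakpoint pairs from the rest in downstream scripts. The following Python sketch assumes the table has been exported and loaded as a list of dictionaries. The rows, the gene names, and the exact spelling of the values are hypothetical and serve only as an illustration.

```python
from collections import Counter

# Hypothetical rows, as if exported from the fusion track table view.
# Column names follow the table described above; values are made up.
rows = [
    {"Fusion number": 1, "5' gene": "GENE_A", "3' gene": "GENE_B",
     "Filter": "PASS"},
    {"Fusion number": 2, "5' gene": "GENE_C", "3' gene": "GENE_D",
     "Filter": "Too few supporting reads"},
    {"Fusion number": 3, "5' gene": "GENE_E", "3' gene": "GENE_F",
     "Filter": "No fusion crossing reads"},
    {"Fusion number": 4, "5' gene": "GENE_G", "3' gene": "GENE_H",
     "Filter": "PASS"},
]


def split_by_filter(table_rows):
    """Return the rows marked PASS and a count of each non-PASS reason.

    ASSUMPTION: a passing row carries exactly the text "PASS" in the
    'Filter' column; any other text is treated as a single reason.
    """
    passed = []
    reasons = Counter()
    for row in table_rows:
        status = row["Filter"].strip()
        if status == "PASS":
            passed.append(row)
        else:
            reasons[status] += 1
    return passed, reasons


passed, reasons = split_by_filter(rows)
print("PASS fusions:", [row["Fusion number"] for row in passed])
for reason, count in reasons.most_common():
    print(f"{reason}: {count}")
```

Counting the reasons gives a quick view of which filter removes the most candidates, for example when deciding whether to adjust 'Minimum number of supporting reads'.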
Detect Fusion Genes from DNA report
The report has the following sections:
- Summary. Summarizes the number of detected fusions.
- Unaligned ends. Summarizes the number of unaligned ends used to detect fusions:
- Unaligned ends. Number of identified unaligned ends.
- Mapped unaligned ends. Number of unaligned ends that could be mapped.
- Unmapped unaligned ends. Number of unaligned ends that could not be mapped.
- Breakpoint positions. Summarizes the Breakpoint position relative to genes for fusions with identified fusion crossing reads.
- CDS. Summarizes the Found in-frame CDS status for fusions with identified fusion crossing reads.
This is present only when mRNA and CDS tracks are provided as input.
