Coverage summary report
An example of a coverage report is shown in figure 26.4.
Figure 26.4: The report with overviews of mapped reads.
This figure shows only the top of the report. The full content is explained below:
- Target regions
- Summary. This table shows overall coverage information.
- Number of target regions. The number of targeted regions.
- Total length of target regions. Sum of the size of all the targeted regions (this means it is calculated from the annotations alone and is not influenced by the reads).
- Average coverage. For each position in each target region the coverage is calculated and stored (you can see the individual coverages in the Coverage table output, figure 26.9). The average coverage is the mean of the calculated coverages over all positions in all target regions. Note that if the user has chosen the Read filters options "Ignore non-specific matches" or "Ignore broken pairs", these reads will not contribute to the coverage. Note also that bases in overlapping paired reads will only be counted as 1.
- Median coverage. The median of the calculated coverages over all positions in all target regions. As specified above, if the user has chosen the Read filters options "Ignore non-specific matches" or "Ignore broken pairs", these reads will not contribute to the coverage. Note also that bases in overlapping paired reads will only be counted as 1.
- Number of target regions with coverage below x. Number of target regions which have positions with a coverage that is below the user-specified "Minimum coverage" threshold.
- Total length of target regions containing positions with coverage below x.
- Total length of target regions with a coverage below x.
- Fractions of targets with coverage at least... A table and a histogram show how many target regions have a certain percentage of the region above the user-specified "Minimum coverage" threshold.
- Coverage of target regions positions. The first plot shows the coverage level on the x axis and the number of positions in the target regions with that coverage level on the y axis. Below it is a version of the same histogram, zoomed in to the values that lie within ±3 SDs of the median.
- Minimum coverage of target regions. This shows the percentage of the targeted regions that are covered by at least this many bases. The coverage levels can be specified in the dialog when running the analysis; the defaults are 1, 5, 10, 20, 40, 80 and 100 times. In figure 26.5, for example, 26.58 % of the positions on the target are covered by at least 40 bases.
Figure 26.5: Minimum coverage of target regions of the report.
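The summary values described above can be illustrated with a minimal Python sketch. It is not the tool's implementation: the function name `coverage_summary`, the input layout (a dict mapping each region name to its list of per-position coverages) and the output keys are assumptions made for illustration only. Read filtering and the handling of overlapping paired reads are assumed to have been applied before the coverages were computed.

```python
from statistics import mean, median

def coverage_summary(region_coverages, min_coverage, levels=(1, 5, 10, 20, 40, 80, 100)):
    """Summarise per-position coverage for a set of target regions.

    region_coverages: hypothetical dict mapping a region name to the list
    of coverage values at each position of that region.
    """
    # Pool the coverage of every position in every target region.
    all_positions = [c for covs in region_coverages.values() for c in covs]
    # Regions with at least one position below the minimum coverage threshold.
    low = [name for name, covs in region_coverages.items()
           if any(c < min_coverage for c in covs)]
    return {
        "number_of_target_regions": len(region_coverages),
        "total_length": len(all_positions),
        "average_coverage": mean(all_positions),
        "median_coverage": median(all_positions),
        "regions_below_min": len(low),
        "length_of_regions_below_min": sum(len(region_coverages[n]) for n in low),
        # Percentage of target positions covered at least `lvl` times.
        "pct_positions_at_least": {
            lvl: 100.0 * sum(c >= lvl for c in all_positions) / len(all_positions)
            for lvl in levels
        },
    }

example = {"region_1": [10, 20, 30, 40], "region_2": [0, 5, 50, 45]}
summary = coverage_summary(example, min_coverage=10)
print(summary["average_coverage"], summary["median_coverage"])
print(summary["pct_positions_at_least"][40])
```

In this toy input, only `region_2` contains positions below the threshold of 10, so it alone contributes to the "below x" counts.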
- Targeted regions overview
This section contains two tables: one that summarizes, for each reference sequence, information relating to the reads mapped, and one that summarizes, for each reference, information relating to the bases mapped (figure 26.6).
Figure 26.6: Targeted regions overview of the report: mapped bases.
Note that, for the table that is concerned with reads, reads in overlapping pairs are counted individually. Also note that, for the table that is concerned with bases, bases in overlapping paired reads are counted only as one (examples are given in figures 26.7 and 26.8).
- Reference. The name of the reference sequence.
- Total mapped reads/bases. The total number of mapped reads/bases on the reference, including reads mapped outside the target regions.
- Mapped reads/bases in targeted region. Total number of reads/bases in the targeted regions. Note that if there are overlapping regions, reads covered by two regions will be counted twice and bases included in two regions will be counted twice. If a read is only partially inside a targeted region, it will still count as a full read.
- Specificity. The percentage of the total mapped reads/bases that are in the targeted regions.
- Total mapped reads/bases excl ignored. The total number of mapped reads/bases on the reference, including reads/bases mapped outside the target regions, excluding the non-specific matches or broken pairs (or the bases in non-specific matches or broken pairs), if the user has enabled the option to ignore those.
- Mapped reads/bases in targeted region excl ignored. Total number of reads/bases in the targeted regions, excluding the non-specific matches or broken pairs (or the bases in non-specific matches or broken pairs), if the user has enabled the option to ignore those.
- Specificity excl ignored. The percentage of the total mapped reads/bases, excluding ignored reads/bases, that are in the targeted regions.
In addition, two plots called Distribution of target region length display the lengths of the target regions. The first shows all regions; the second shows only the target regions whose lengths lie within ±3 SDs of the median target length.
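The read-counting rules in the overview tables can be illustrated with a short Python sketch. It is an illustration under stated assumptions, not the tool's implementation: reads and target regions are represented as hypothetical half-open `(start, end)` intervals on one reference, and the function names `reads_in_targets` and `specificity` are invented for this example. The sketch counts a read once per target region it overlaps, so a read covered by two overlapping regions is counted twice, and a read that is only partially inside a region still counts as a full read.

```python
def reads_in_targets(reads, targets):
    """Count mapped reads in target regions, once per region overlapped."""
    count = 0
    for t_start, t_end in targets:
        for r_start, r_end in reads:
            # Any overlap with the region counts the whole read.
            if r_start < t_end and t_start < r_end:
                count += 1
    return count

def specificity(in_target, total_mapped):
    """Percentage of the total mapped reads that are in the target regions."""
    return 100.0 * in_target / total_mapped if total_mapped else 0.0

# Two overlapping target regions and six mapped reads (made-up coordinates).
targets = [(8, 12), (11, 20)]
reads = [(0, 10), (5, 15), (18, 30), (100, 110), (200, 210), (300, 310)]

in_target = reads_in_targets(reads, targets)
print(in_target, specificity(in_target, len(reads)))
```

Here the read `(5, 15)` overlaps both regions and is therefore counted twice, which is why the in-target count is 4 although only three distinct reads touch a target.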
- Base coverage relative to mean coverage
- Base coverage. The percentage of base positions in the target regions that are covered by 0.1, 0.2, 0.3, 0.4, 0.5 and 1.0 times the mean coverage, respectively, where the mean coverage is the average coverage given in table 1.1. Because this is based on mean coverage, the numbers can be used for cross-sample comparison of the quality of the experiment.
- Base coverage plot. A plot showing the relationship between fold mean coverage and the number of positions. This is a graphical representation of the Base coverage table above.
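As a rough illustration of the Base coverage table, the following Python sketch computes, for each fold value, the percentage of target positions whose coverage reaches that multiple of the mean coverage. The function name and the flat list of per-position coverages are assumptions for this example, and "covered by" is interpreted here as "covered by at least", which the text above does not state explicitly.

```python
def base_coverage_relative_to_mean(coverages, folds=(0.1, 0.2, 0.3, 0.4, 0.5, 1.0)):
    """Percentage of positions with coverage >= fold * mean coverage.

    coverages: hypothetical flat list of per-position coverage values
    over all target regions.
    """
    mean_cov = sum(coverages) / len(coverages)
    return {
        # Assumption: a position counts if its coverage is at least fold * mean.
        fold: 100.0 * sum(c >= fold * mean_cov for c in coverages) / len(coverages)
        for fold in folds
    }

table = base_coverage_relative_to_mean([0, 7, 8, 10, 15, 20])
for fold, pct in table.items():
    print(f"{fold:.1f} x mean coverage: {pct:.1f} % of positions")
```

Because every value is expressed relative to the sample's own mean coverage, two samples sequenced to different depths can be compared on the same scale.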
- Mean coverage per target
Three plots listing the mean coverage for each position of the targeted regions. The first plot shows coverage across the whole target, using a percentage of the target length on the x axis (to make it possible to have targets with different lengths in the same plot). This is reported for reverse and forward reads as well. In addition, there are two plots showing the same but with base positions on the x axis counting from the start and end of the target regions, respectively. These plots can be used to evaluate whether there is a general tendency towards lower coverage at the end of the targeted region, and whether there is a bias in terms of forward and reverse reads coverage.
- Read count per %GC
The plot shows the GC content of the reference sequence on the x axis and the number of mapped reads on the y axis. This plot will show if there is a bias caused by higher GC content in the sequence.