QIAseq Stranded RNA Library Kits (FastSelect RNA Library Kit, Stranded mRNA/Stranded Total RNA Library Kits), QIAseq UPXome Kit, and all third-party kits
Quality control summary
This first summary table is a combination of the most important data points from the Quality control report. All the data can be seen in the context of related QC data below.
- Sample name: In all tables, the first column is the sample name, as each row holds the data for one sample.
- Reads: The number of input reads in the sample data
- Low numbers in any samples indicate failed library prep.
- Trimmed reads: Quality trimming is performed on the input reads.
- Mapped/Mapped in pairs: This section changes name depending on whether the input data consists of paired or single reads.
- Percentages should be high, as this indicates how well the trimmed reads fit the reference. A low percentage here can indicate that an incorrect reference was selected in the Align and count step, poor data quality, or potentially contamination.
- Mapped to total rRNA: Percentage of reads mapped to ribosomal RNA (rRNA) or mitochondrial ribosomal RNA (MtrRNA).
- If rRNA was not depleted, percentages can vary widely from 10%-30%.
- If rRNA was depleted, percentages should be low, about or less than 1%.
- Samples not depleted of rRNA or samples with higher percentages can still be used for differential expression, but expression values such as TPM and RPKM may not be comparable to those of other samples. To troubleshoot the issues in future experiments, check for rRNA depletion prior to library preparation. Also, if an rRNA depletion kit was used, check that the kit matches the species being studied.
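The rRNA guideline ranges above can be sketched as a small check. This is only an illustration: the function name `rrna_flag`, the hard 1% cutoff for depleted samples, and the returned labels are assumptions made for the example, not part of the report or the analysis pipeline.

```python
def rrna_flag(pct_rrna: float, depleted: bool) -> str:
    """Compare an rRNA percentage with the guideline ranges quoted above.

    Assumptions: "about or less than 1%" is treated as a hard 1.0 cutoff,
    and the returned labels are purely illustrative.
    """
    if depleted:
        # Depleted samples: percentages should be low, about or less than 1%.
        return "as expected" if pct_rrna <= 1.0 else "higher than expected"
    # Non-depleted samples: percentages can vary widely from 10% to 30%.
    if 10.0 <= pct_rrna <= 30.0:
        return "typical for non-depleted"
    return "outside typical range"


print(rrna_flag(0.4, depleted=True))
print(rrna_flag(22.0, depleted=False))
```

A sample flagged as "higher than expected" can still be used for differential expression, as noted above; the flag only signals that TPM and RPKM values may not be comparable across samples.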
Trimming
These numbers summarize the quality trimming step: how many reads were used as input, how many reads remained after trimming, and the average read length before and after trimming. A large drop in the number of reads and in average read length is an indication of poor-quality reads.
- Reads before trim: This is also shown in the 'Reads' column in the summary table.
- Avg length before trim
- Reads after trim: This is also shown in the 'Trimmed reads' column in the summary table.
- Avg length after trim
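The drop described above can be quantified from the four columns. The following is a minimal sketch; the function name `trimming_loss` and the example numbers are hypothetical, and the report itself does not define what counts as a "large" drop.

```python
def trimming_loss(reads_before, len_before, reads_after, len_after):
    """Return (percent of reads lost, percent of average length lost)."""
    reads_lost_pct = 100.0 * (reads_before - reads_after) / reads_before
    length_lost_pct = 100.0 * (len_before - len_after) / len_before
    return reads_lost_pct, length_lost_pct


# Hypothetical sample: 1,000 reads of average length 100 before trimming,
# 900 reads of average length 95 after trimming.
reads_lost, length_lost = trimming_loss(1000, 100.0, 900, 95.0)
print(f"reads lost: {reads_lost:.1f}%, length lost: {length_lost:.1f}%")
```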
Spike-ins quality control
This section appears when the Spike-ins option was checked in the Align and Count dialog used to start the sample analysis.
- Number of spike-ins detected: The number of spike-ins detected relative to the spike-ins used.
- R2: Correlation of expected and sequenced spike-ins using the Pearson Correlation coefficient
- When samples have a poor correlation (R2 < 0.8) between known and measured spike-in concentrations, it indicates problems with the spike-in protocol or a more serious problem with the sample.
- Reads mapped to spike-ins: The number of reads that mapped to the detected spike-ins.
- If fewer than 10,000 reads mapped to spike-ins, consider using more spike-in mix in future experiments.
- Lower limit of detection (attomoles/ul): Spike-ins concentration measurement. The lower limit of detection is the lowest concentration spike-in to which at least 3 reads map. This provides a rough estimate of the minimal concentration of mRNA that can be detected in this sample.
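The spike-in metrics above can be illustrated with a short sketch. It assumes that R2 is the square of the Pearson correlation coefficient computed directly on the supplied values; whether the pipeline transforms the values first (for example, a log transform) is not stated here. The function names and example data are hypothetical.

```python
import math


def pearson_r2(expected, measured):
    """Squared Pearson correlation between expected and measured values."""
    n = len(expected)
    mean_x = sum(expected) / n
    mean_y = sum(measured) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(expected, measured))
    sxx = sum((x - mean_x) ** 2 for x in expected)
    syy = sum((y - mean_y) ** 2 for y in measured)
    if sxx == 0 or syy == 0:
        return None  # correlation is undefined when either series is constant
    r = sxy / math.sqrt(sxx * syy)
    return r * r


def lower_limit_of_detection(spike_ins, min_reads=3):
    """Lowest concentration among spike-ins with at least `min_reads` reads.

    `spike_ins` is a list of (concentration, mapped_reads) pairs.
    """
    detected = [conc for conc, reads in spike_ins if reads >= min_reads]
    return min(detected) if detected else None


# Hypothetical spike-in data: (concentration in attomoles/ul, mapped reads).
spike_ins = [(0.1, 1), (0.5, 3), (2.0, 40), (8.0, 170)]
r2 = pearson_r2([c for c, _ in spike_ins], [n for _, n in spike_ins])
print("poor correlation" if r2 is not None and r2 < 0.8 else "correlation ok")
print("lower limit of detection:", lower_limit_of_detection(spike_ins))
print("few spike-in reads" if sum(n for _, n in spike_ins) < 10_000 else "ok")
```

The thresholds used (R2 < 0.8, at least 3 reads, fewer than 10,000 mapped reads) are the ones quoted in the list above.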
Mapping statistics
This describes how the reads were used in the mapping step.
- Reads: The number of reads in each sample, after the trimming step mentioned above.
- Paired (yes/no): Indicates whether the input reads are paired. This should fit the applied protocol.
- Reads mapped (in pairs): The percentage of the reads that were mapped to the selected reference. Excludes both reads that were ignored due to wrong strand and reads that could not be mapped to the reference.
- Strand-specific setting: Read direction of reads. This should fit the applied protocol.
- Forward % of reads mapped: Percent of reads mapped in the forward direction.
- Reverse % of reads mapped: Percent of reads mapped in the reverse direction.
- Ignored reads % (wrong strand): The percentage of reads that were ignored because they did not match the strand setting specified for the protocol.
- If percentages are greater than 20-25%, then the wrong strand protocol may have been used. The strand setting is locked in the analysis workflow and cannot be changed, so any problem in this section should be corrected in the sample prep process.
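The wrong-strand guideline above amounts to a simple threshold check. In this sketch the function name `strand_warning` is hypothetical, and the default threshold of 20% is an assumption that takes the lower end of the quoted 20-25% range.

```python
def strand_warning(ignored_pct: float, threshold: float = 20.0) -> bool:
    """True if the ignored-reads percentage suggests a wrong strand protocol.

    The default threshold is the lower end of the 20-25% range quoted above;
    pass 25.0 for a less conservative check.
    """
    return ignored_pct > threshold


for sample, ignored in [("sample_A", 3.2), ("sample_B", 48.7)]:
    status = "check strand protocol" if strand_warning(ignored) else "ok"
    print(sample, status)
```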
Mapped by type
This section describes the relative mapping of the UMI reads in terms of the type of target.
- Mapped to gene: Percentage of reads that map to genes.
- Mapped to gene, intron: Percentage of reads that mapped partly or entirely within an intron.
- Mapped to gene, exon: Percentage of reads that mapped entirely within an exon or in an exon-exon junction.
- Mapped to intergenic region: Percentage of reads that mapped partly or entirely between genes.
Biotype Distribution
Details of the various biotype detection levels in each sample. The content of the plot and table depends on the result of the analysis and may vary between pipelines and sample batches. The point of both the plot and the table is to show which biotypes are found in the samples and at which relative abundance in each sample. The names and classification of the biotypes are based on the Ensembl definitions found here: http://www.ensembl.org/info/genome/genebuild/biotypes.html
Taxonomic profile of unmapped reads
Taxonomic profiling is performed for samples with a high level of unmapped reads, as this can indicate sample contamination. If all samples have low levels of unmapped reads, this section will be empty. The plot and table show the relative abundance at phylum level. Reads that map equally well to two or more phyla are assigned to the common ancestor (kingdom level).
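The common-ancestor rule for ambiguous reads can be sketched as follows. This is a simplified illustration and not the profiler's actual implementation: the function name `assign_taxon`, the input layout, and the handling of reads whose best hits span more than one kingdom (returned as "unresolved") are assumptions.

```python
def assign_taxon(best_hits):
    """Assign a read to a phylum, or to the shared kingdom if phyla are tied.

    `best_hits` is a list of (kingdom, phylum) pairs that the read maps to
    equally well.
    """
    phyla = {phylum for _, phylum in best_hits}
    if len(phyla) == 1:
        return ("phylum", phyla.pop())
    kingdoms = {kingdom for kingdom, _ in best_hits}
    if len(kingdoms) == 1:
        # Tied between phyla of one kingdom: move up to the common ancestor.
        return ("kingdom", kingdoms.pop())
    # Assumption: behavior for ties across kingdoms is not described above.
    return ("unresolved", None)


print(assign_taxon([("Bacteria", "Firmicutes")]))
print(assign_taxon([("Bacteria", "Firmicutes"), ("Bacteria", "Proteobacteria")]))
```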
Taxonomic profiling summary
Information about which taxonomic levels were found in the data sample and how many different taxa were found on each level.
- Kingdom
- Phylum
- Total reads: The number of reads that were not mapped to the reference
- Classified reads: The number of reads that were able to map to the taxonomic profiling database
- Unclassified reads: The number of reads that could not be mapped to either the reference or the taxonomic profiling database. These are reads of unknown origin. If this number constitutes a significant portion of the input reads, it is likely due to the selection of the wrong reference in the Align and count setup for the sample creation and analysis. A new Align and count analysis will have to be initiated.
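The relationship between the three read counts can be written out as simple arithmetic. The sketch below assumes that unclassified reads equal total unmapped reads minus classified reads, which follows from the definitions above; the function name and example numbers are hypothetical, and no cutoff for a "significant portion" is implied.

```python
def unclassified_summary(input_reads, total_unmapped, classified):
    """Return (unclassified read count, unclassified fraction of input reads)."""
    unclassified = total_unmapped - classified
    return unclassified, unclassified / input_reads


# Hypothetical sample: 1,000,000 input reads, 200,000 not mapped to the
# reference, of which 150,000 were classified by the taxonomic profiler.
count, fraction = unclassified_summary(1_000_000, 200_000, 150_000)
print(f"{count} unclassified reads ({fraction:.1%} of input)")
```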