QIAseq UPX 3' Transcriptome Kits
Quality control summary
The summary table presents key data points from the Quality control tab.
- Sample name: The name of the sample.
- Reads: The number of reads in the sample data.
- Low numbers indicate library prep issues.
- Spillover into unused wells causes those wells to return small numbers of reads.
- Trimmed reads: The number of trimmed input reads.
- UMI Reads: Reads that share the same Unique Molecular Index (UMI) are merged into UMI reads. This allows for better quantification of the RNAs by removing library amplification and sequencing bias.
- Avg Q score, UMI reads: The average quality score of the UMI reads.
- Numbers less than 30 indicate poor quality library prep or instrument runs.
- Trimmed UMI reads: A second quality trimming is performed on the created UMI reads before mapping.
- Mapped: Percentage of UMI reads that mapped to the selected reference.
- If the percentage is low, check that you selected the correct reference species.
- Mapped to total rRNA: Percentage of reads mapped to ribosomal RNA.
- If rRNA was not depleted, percentages can vary from 10%-30%.
- If rRNA was depleted, percentages should be low, about or less than 1%.
- Samples not depleted of rRNA or samples with high rRNA percentage can still be used for differential expression, but expression values such as TPM and RPKM may not be comparable to those of other samples. To troubleshoot the issues in future experiments, check for rRNA depletion prior to library preparation. Also, if an rRNA depletion kit was used, check that the kit matches the species being studied.
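The thresholds given in the list above can be combined into a simple per-sample check. The following is a minimal sketch and not part of the analysis software: the function name `flag_summary` and its parameters are hypothetical, and the cut-offs are simply the values quoted above (Q score below 30, rRNA above about 1% for depleted samples).

```python
def flag_summary(avg_q_umi, rrna_pct, rrna_depleted,
                 q_min=30.0, rrna_max_depleted=1.0):
    """Return a list of warnings for one sample's summary metrics.

    All names are illustrative; the defaults mirror the thresholds
    quoted in the summary description.
    """
    warnings = []
    if avg_q_umi < q_min:
        warnings.append("low average Q score: possible poor library prep or instrument run")
    # The rRNA threshold only applies when rRNA depletion was performed.
    if rrna_depleted and rrna_pct > rrna_max_depleted:
        warnings.append("high rRNA percentage for a depleted sample")
    return warnings


# Example usage with made-up values.
print(flag_summary(avg_q_umi=28.0, rrna_pct=12.0, rrna_depleted=True))
```

No warning is raised for non-depleted samples, because the text only states that their rRNA percentage can vary from 10% to 30%.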
Trimming, raw reads
Metrics for trimming of raw input reads. If the number of reads and the average read length decrease substantially after trimming, it indicates poor read quality.
- Reads before trim: The number of raw reads prior to trimming. This value is also shown in the 'Reads' column in the summary table.
- Avg length before trim: The average read length before trimming.
- Reads after trim: The number of raw reads after trimming. This value is also shown in the 'Trimmed reads' column in the summary table.
- Avg length after trim: The average read length after trimming.
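To judge whether the decrease after trimming is substantial, the four values above can be reduced to two retention ratios. This is a minimal sketch: the function name `trim_retention` is hypothetical, and the text does not define what counts as a substantial decrease, so no threshold is applied here.

```python
def trim_retention(reads_before, avg_len_before, reads_after, avg_len_after):
    """Return (fraction of reads kept, fraction of average length kept)."""
    if reads_before <= 0 or avg_len_before <= 0:
        raise ValueError("values before trimming must be positive")
    read_fraction = reads_after / reads_before
    length_fraction = avg_len_after / avg_len_before
    return read_fraction, length_fraction


# Example usage with made-up values: 90% of reads and 95% of length retained.
reads_kept, length_kept = trim_retention(1000, 100.0, 900, 95.0)
print(f"reads kept: {reads_kept:.0%}, length kept: {length_kept:.0%}")
```

The same calculation applies to the before and after values reported for UMI read trimming.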
Creation of UMI reads
Metrics from the process of creating UMI reads from reads that have the same Unique Molecular Index.
- Read pairs and single reads annotated with UMIs: The percentage of reads that were annotated with UMIs.
- Input Reads: The number of raw reads used as input for the creation of UMI reads.
- Avg Q score, input reads: The average quality score of the input raw reads.
- Numbers less than 30 indicate poor quality library preps or instrument runs.
- Detected barcode length: The sequence length of the UMI barcode is defined in the UMI protocol. This value can be used as a check to see if the analysis detected the expected length.
- UMIs: The number of created UMI reads.
- Avg reads per UMI: The average number of raw reads in each UMI read.
- Should be greater than one. For most applications, the ideal UMI group size will be around 2-4.
- UMIs with more than 10 reads (Pct of UMIs) (Pct of reads): This column highlights the extreme end of the UMI grouping distribution and should be seen as an indication of potential problems in sequencing or library prep. For most applications, the ideal merged UMI group size will be around 2-4 reads. Larger UMI groups consume sequencing capacity without providing additional benefits.
- A low percentage is preferable. If the percentage is higher than 5, or there is large variation between samples, a re-evaluation of the sample prep may be needed.
- Max reads per UMI: The size of the largest UMI group. This shows the extreme end of the UMI grouping distribution and can highlight potential problems in sequencing.
- Avg Q Score, UMI reads: The average quality score of the UMI reads. This should be higher than the average quality score for the input reads.
- Numbers less than 30 indicate poor quality library preps or instrument runs.
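The UMI grouping metrics above can be reproduced from a list of UMI group sizes (the number of raw reads merged into each UMI read). This is a minimal sketch under that assumption; the function name `umi_group_stats` and the dictionary keys are hypothetical and do not correspond to an API of the analysis software.

```python
def umi_group_stats(group_sizes, large_threshold=10):
    """Summarize UMI group sizes.

    group_sizes: one integer per UMI read, giving the number of raw reads
    that were merged into it.
    """
    if not group_sizes:
        raise ValueError("group_sizes must not be empty")
    n_umis = len(group_sizes)
    n_reads = sum(group_sizes)
    # Groups with MORE than `large_threshold` reads, as in the column name.
    large = [size for size in group_sizes if size > large_threshold]
    return {
        "umis": n_umis,
        "avg_reads_per_umi": n_reads / n_umis,
        "max_reads_per_umi": max(group_sizes),
        "pct_umis_large": 100.0 * len(large) / n_umis,
        "pct_reads_large": 100.0 * sum(large) / n_reads,
    }


# Example usage with made-up group sizes: 10 UMIs built from 40 reads.
stats = umi_group_stats([1, 2, 3, 4, 2, 3, 15, 2, 3, 5])
print(stats["avg_reads_per_umi"])  # 4.0
print(stats["pct_umis_large"])     # 10.0
print(stats["pct_reads_large"])    # 37.5
```

In this example a single large group accounts for 10% of the UMIs but 37.5% of the reads, which illustrates how large groups consume sequencing capacity.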
Trimming, UMI reads
Metrics for trimming of merged UMI reads. If the number of UMI reads and the average read length decrease substantially after trimming, it indicates poor quality UMI reads. Trimming is performed on the UMI reads even though they are merged from reads that have already been quality trimmed. Very few reads should be trimmed at this step. If a considerable number are trimmed, it could indicate a problem in the UMI process or further upstream in the pipeline.
- UMI reads before trim: The number of UMI reads before trimming.
- Avg length before trim: The average UMI read length before trimming.
- UMI reads after trim: The number of UMI reads after trimming.
- Avg length after trim: The average UMI read length after trimming.
Spike-in quality control
This section is available if the Spike-ins option was selected in the Align and Count dialog.
- Number of spike-ins detected: The number of spike-ins detected relative to the spike-ins used.
- R2: Correlation between expected and sequenced spike-ins, calculated using the Pearson correlation coefficient.
- When samples have a poor correlation (R2 < 0.8) between known and measured spike-in concentrations, it indicates problems with the spike-in protocol or a more serious problem with the sample.
- Reads mapped to spike-ins: The number of reads that mapped to the detected spike-ins.
- If fewer than 10,000 reads mapped to spike-ins, consider using more spike-in mix in future experiments.
- Lower limit of detection (attomoles/ul): Spike-ins concentration measurement. The lower limit of detection is the lowest concentration spike-in to which at least 3 reads map. This provides a rough estimate of the minimal concentration of mRNA that can be detected in this sample.
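The two calculated spike-in metrics above can be illustrated as follows. This is a minimal sketch, not the software's implementation: the function names are hypothetical, and it is an assumption that R2 is the square of the Pearson coefficient computed directly on the supplied values (the text does not say whether values are log-transformed first).

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def lower_limit_of_detection(concentrations, read_counts, min_reads=3):
    """Lowest spike-in concentration with at least `min_reads` mapped reads.

    Returns None if no spike-in reaches the read threshold.
    """
    detected = [c for c, n in zip(concentrations, read_counts) if n >= min_reads]
    return min(detected) if detected else None


# Example usage with made-up values (concentrations in attomoles/ul).
known = [0.1, 1.0, 10.0, 100.0]
reads = [2, 3, 50, 480]
r_squared = pearson_r(known, reads) ** 2
print(r_squared < 0.8)                         # flag poor correlation
print(lower_limit_of_detection(known, reads))  # 1.0
```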
Mapping statistics
Metrics for the step of mapping UMI reads to the reference genome.
- UMI reads: The number of UMI reads used as input for the mapping step.
- Paired (yes/no): Indication of whether the input reads are paired reads. This should fit with the applied protocol.
- Reads mapped: The percentage of the UMI reads that were mapped to the reference. This excludes both reads that were ignored due to wrong strand and reads that could not be mapped to the reference.
- Strand-specific setting: Read direction of UMI reads. This should fit the applied protocol.
- Forward % of reads mapped: Percent of UMI reads mapped in the forward direction.
- Reverse % of reads mapped: Percent of UMI reads mapped in the reverse direction.
- Ignored reads % (wrong strand): The percentage of UMI reads that were ignored because they did not map to the strand defined by the strand-specific setting.
- A percentage greater than 20-25% indicates that the wrong strand protocol may have been used in library prep.
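The wrong-strand guideline above can be expressed as a small check. This is a minimal sketch: the function name `strand_check` is hypothetical, and treating the 20-25% range as a borderline band is an interpretation of the guideline rather than documented behavior.

```python
def strand_check(ignored_pct, lower=20.0, upper=25.0):
    """Classify the 'Ignored reads % (wrong strand)' value for one sample."""
    if ignored_pct > upper:
        return "check strand protocol"
    if ignored_pct > lower:
        return "borderline"
    return "ok"


# Example usage with made-up values.
for pct in (3.0, 22.0, 60.0):
    print(pct, strand_check(pct))
```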
Mapped by Type
This section describes the relative mapping of the UMI reads in terms of the type of target.
- Mapped to gene: Percentage of UMI reads that map to genes.
- Mapped to gene, intron: Percentage of UMI reads that mapped partly or entirely within an intron.
- Mapped to gene, exon: Percentage of UMI reads that mapped entirely within an exon or to an exon-exon junction.
- Mapped to intergenic region: Percentage of UMI reads that mapped partly or entirely between genes.
Biotype Distribution
Metrics covering the biotype distribution. The plot and the table show which biotypes are found in the samples and at which relative abundance. The names and classification of the biotypes are based on the Ensembl definitions found here:
http://www.ensembl.org/info/genome/genebuild/biotypes.html.
Taxonomic profile of unmapped reads
Taxonomic profiling is performed for samples with a high level of unmapped reads as this can indicate sample contamination. If all samples have low levels of unmapped reads, this section will be empty.
The plot and table show the relative abundance at phylum level. Reads that map equally well to two or more phyla are assigned to the common ancestor (kingdom level).
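The common-ancestor rule can be sketched as follows. This is an illustration only: the function name `assign_taxon` and the input format are hypothetical, and the handling of reads whose best hits span several kingdoms is an assumption, since that case is not described here.

```python
def assign_taxon(best_hits):
    """Assign one read to a taxon from its equally good hits.

    best_hits: list of (kingdom, phylum) tuples, one per best-scoring hit.
    Returns a (level, name) tuple.
    """
    if not best_hits:
        raise ValueError("best_hits must not be empty")
    unique_hits = set(best_hits)
    if len(unique_hits) == 1:
        # All best hits agree on the phylum.
        return ("phylum", best_hits[0][1])
    kingdoms = {kingdom for kingdom, _ in unique_hits}
    if len(kingdoms) == 1:
        # Several phyla, one kingdom: fall back to the common ancestor.
        return ("kingdom", kingdoms.pop())
    # Assumption: hits from different kingdoms are left unresolved.
    return ("unresolved", None)


# Example usage with made-up hits.
print(assign_taxon([("Bacteria", "Firmicutes"), ("Bacteria", "Proteobacteria")]))
```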
Taxonomic profiling summary
Information about which taxonomic levels were found in the data sample and how many different taxa were found on each level.
- Kingdom
- Phylum
- Total reads: The number of reads used as input for this step, i.e. reads that did not map to the reference genome.
- Classified reads: The number of reads that mapped to the taxonomic profiling database.
- Unclassified reads: The number of reads that mapped to neither the reference nor the taxonomic profiling database. These are reads of unknown origin. If this number constitutes a significant portion of the input reads, it may be due to the selection of the wrong reference.
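To see whether unclassified reads make up a significant portion of the input, the counts above can be turned into a fraction. This is a minimal sketch: the function name `unclassified_fraction` is hypothetical, it assumes that total reads equal classified plus unclassified reads (as the definitions above imply), and it applies no cut-off because the text does not define one.

```python
def unclassified_fraction(classified_reads, unclassified_reads):
    """Fraction of the profiling input reads that remained unclassified."""
    total = classified_reads + unclassified_reads
    if total == 0:
        return 0.0  # no unmapped reads were profiled
    return unclassified_reads / total


# Example usage with made-up counts.
print(unclassified_fraction(750, 250))  # 0.25
```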