Count-based and extra-chromosomal filters
Low quality barcodes can arise for various reasons, such as damaged cells or library preparation problems. Potential low quality barcodes can be identified using the distributions of the following metrics:
- Total number of reads. Barcodes with few reads can result from loss of RNA during library preparation.
- Total number of expressed features. Barcodes with few expressed features indicate that the diverse transcript population has not been successfully captured.
- Percentage of reads mapped to spike-in control regions. When spike-in controls are used, barcodes with proportionally many reads mapped to the spike-in controls are symptomatic of loss of endogenous RNA, as the same amount of spike-in RNA should have been added to each cell.
- Percentage of reads mapped to features indicative of low quality. Barcodes with proportionally many reads mapped to certain features are likely to be low quality cells. For example, loss of cytoplasmic RNA from perforated cells can lead to a high proportion of reads from mitochondrial genes in eukaryotes [Islam et al., 2014; Ilicic et al., 2016].
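As a rough illustration of how these four metrics relate to a count table, the following Python sketch computes them for a single barcode. It is not the implementation used by QC for Single Cell: the function name `barcode_metrics`, the dictionary layout, and the use of all reads for the barcode as the percentage denominator are assumptions made for the example.

```python
from typing import Dict, Set


def barcode_metrics(
    counts: Dict[str, int],
    spike_in_features: Set[str],
    low_quality_features: Set[str],
) -> Dict[str, float]:
    """Summarize one barcode from a mapping of feature name -> read count.

    Hypothetical helper: percentages use all reads of the barcode as the
    denominator, which is an assumption of this sketch.
    """
    total_reads = sum(counts.values())
    # A feature counts as expressed when it has at least one read.
    expressed = sum(1 for n in counts.values() if n > 0)
    spike_reads = sum(n for f, n in counts.items() if f in spike_in_features)
    low_quality_reads = sum(
        n for f, n in counts.items() if f in low_quality_features
    )

    def percent(part: int) -> float:
        # Avoid division by zero for barcodes without any reads.
        return 100.0 * part / total_reads if total_reads else 0.0

    return {
        "total_reads": total_reads,
        "expressed_features": expressed,
        "pct_spike_in": percent(spike_reads),
        "pct_low_quality": percent(low_quality_reads),
    }


# Made-up counts for one barcode, for illustration only.
example = barcode_metrics(
    {"GAPDH": 6, "MT-CO1": 3, "ERCC-00002": 1, "ACTB": 0},
    spike_in_features={"ERCC-00002"},
    low_quality_features={"MT-CO1"},
)
```

In this made-up example the barcode has 10 reads over 3 expressed features, with 10% of reads on the spike-in and 30% on the mitochondrial gene.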
Count-based filters
In this dialog of QC for Single Cell, the filters using the total number of reads and expressed features can be enabled and customized.
Figure 7.6: The default settings in the Count-based filters dialog.
The dialog first allows a list of barcodes to be retained as cells to be specified manually in Barcodes to retain (figure 7.6). These are typically barcodes that would otherwise be removed by one of the applied filters. See Choosing barcodes to retain for details. The barcodes listed in Barcodes to retain must meet the following criteria:
- Either all barcodes are prefixed with the sample name and a "-", or no barcodes contain the sample name.
- Barcodes can be separated by any white-space characters, ",", and ";". Consequently, no barcode may contain any of these separators.
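The two criteria can be checked before pasting a list into the dialog. The sketch below is a minimal Python illustration, not the tool's own parser: the function name `parse_barcodes_to_retain` and the choice to raise an error on a mixed list are assumptions.

```python
import re
from typing import List


def parse_barcodes_to_retain(text: str, sample_names: List[str]) -> List[str]:
    """Split a barcode list and check that sample-name prefixes are consistent.

    Hypothetical helper for illustration only.
    """
    # Split on any run of white-space characters, "," or ";".
    barcodes = [token for token in re.split(r"[\s,;]+", text) if token]

    # A barcode is "prefixed" when it starts with a sample name and a "-".
    prefixed = [
        any(barcode.startswith(name + "-") for name in sample_names)
        for barcode in barcodes
    ]
    # Either all barcodes carry the prefix, or none do.
    if any(prefixed) and not all(prefixed):
        raise ValueError("Mix of prefixed and unprefixed barcodes")
    return barcodes


barcodes = parse_barcodes_to_retain(
    "S1-AAAC, S1-GGTT;S1-CCAA\nS1-TTGG", sample_names=["S1"]
)
```

Here `S1` and the four short barcodes are made-up values; real barcodes are longer, but the splitting and prefix check work the same way.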
The following options can be adjusted for the Count-based filters (figure 7.6):
- Remove cells with few reads. When checked, barcodes with fewer reads than a minimum threshold are removed.
- Remove cells with few expressed features. When checked, barcodes with fewer expressed features than a minimum threshold are removed.
- The minimum threshold can be:
  - Calculated automatically from the distribution of number of reads/expressed features by using Calculate minimum from data. See Automatic thresholds for details.
  - Specified manually by using Specify minimum.
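To make the effect of manually specified minimums concrete, the following Python sketch applies them to per-barcode summaries. It is an illustration under stated assumptions, not the tool's implementation: `count_based_filter` and the `"reads"`/`"features"` keys are hypothetical names, passing `None` stands in for an unchecked option, and barcodes listed for retention are assumed to bypass both filters. Automatically calculated minimums are not sketched here.

```python
from typing import Dict, Iterable, Optional, Set


def count_based_filter(
    metrics: Dict[str, Dict[str, int]],
    min_reads: Optional[int] = None,
    min_features: Optional[int] = None,
    retain: Iterable[str] = (),
) -> Set[str]:
    """Return the barcodes kept as cells.

    Hypothetical helper: `metrics` maps barcode -> {"reads": ..., "features": ...}.
    A threshold of None means the corresponding filter is disabled.
    """
    retain_set = set(retain)
    kept = set()
    for barcode, values in metrics.items():
        if barcode in retain_set:
            # Assumed behavior: retained barcodes are kept regardless of filters.
            kept.add(barcode)
            continue
        # Remove barcodes with fewer reads than the minimum.
        if min_reads is not None and values["reads"] < min_reads:
            continue
        # Remove barcodes with fewer expressed features than the minimum.
        if min_features is not None and values["features"] < min_features:
            continue
        kept.add(barcode)
    return kept


# Made-up summaries for three barcodes.
metrics = {
    "A": {"reads": 500, "features": 200},
    "B": {"reads": 50, "features": 40},
    "C": {"reads": 1000, "features": 10},
}
cells = count_based_filter(metrics, min_reads=100, min_features=50)
cells_with_retained = count_based_filter(
    metrics, min_reads=100, min_features=50, retain=["B"]
)
```

With these made-up numbers only barcode `A` passes both filters, and listing `B` for retention keeps it as well.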
Extra-chromosomal filters
In this dialog of QC for Single Cell, the filters using the percentage of reads mapped to spike-in controls and features indicative of low quality can be enabled and customized.
Figure 7.7: The default settings in the Extra-chromosomal filters dialog.
The following options can be adjusted in the Extra-chromosomal filters dialog (figure 7.7):
- Remove cells with many spike-in reads (%). When checked, barcodes with a percentage of reads mapped to spike-in controls that is greater than a maximum threshold are removed.
- Remove cells with many reads mapping to features indicative of low quality (%). When checked, barcodes with a percentage of reads mapped to features indicative of low quality that is greater than a maximum threshold are removed. Features indicative of low quality are defined using at least one of the following options:
  - Chromosomes. The name of the mitochondrial chromosome and/or other chromosomes containing only features indicative of low quality cells. Can be left empty, or multiple chromosomes can be chosen.
  - Feature tracks. Feature tracks containing only features indicative of low quality cells. Can be left empty, or multiple tracks can be chosen.
  - Feature names. Names or IDs of features indicative of low quality cells. Any white-space characters, ",", and ";" are accepted as separators.
- The maximum threshold can be:
  - Calculated automatically from the distribution of percentage of reads mapped to spike-in controls/features indicative of low quality by using Calculate maximum from data. See Automatic thresholds for details.
  - Specified manually by using Specify maximum.
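The following Python sketch illustrates how a set of features indicative of low quality might be assembled from chromosomes and feature names, and how a manually specified maximum would then be applied. It is only an illustration: `collect_flagged_features` and `exceeds_maximum` are hypothetical names, combining the options as a union is an assumption, feature tracks are left out, and the percentage is taken over all reads of the barcode.

```python
import re
from typing import Dict, Set


def collect_flagged_features(
    feature_chromosomes: Dict[str, str],
    chromosomes: Set[str],
    feature_names_text: str,
) -> Set[str]:
    """Combine features on the chosen chromosomes with explicitly named features.

    Hypothetical helper: `feature_chromosomes` maps feature name -> chromosome.
    """
    # Feature names may be separated by white-space characters, "," or ";".
    named = {t for t in re.split(r"[\s,;]+", feature_names_text) if t}
    on_chromosomes = {
        feature
        for feature, chromosome in feature_chromosomes.items()
        if chromosome in chromosomes
    }
    # Assumption: the options are combined as a union.
    return on_chromosomes | named


def exceeds_maximum(
    counts: Dict[str, int], flagged: Set[str], max_percent: float
) -> bool:
    """True when the percentage of reads on flagged features is above the maximum."""
    total = sum(counts.values())
    if total == 0:
        return False
    flagged_reads = sum(n for f, n in counts.items() if f in flagged)
    # Only percentages strictly greater than the maximum lead to removal.
    return 100.0 * flagged_reads / total > max_percent


# Made-up annotation and counts.
flagged = collect_flagged_features(
    {"MT-CO1": "MT", "GAPDH": "12", "RPL3": "22"},
    chromosomes={"MT"},
    feature_names_text="RPL3; RPS2",
)
remove = exceeds_maximum({"MT-CO1": 3, "GAPDH": 7}, flagged, max_percent=20.0)
```

The same comparison applies to the spike-in filter if `flagged` is replaced by the set of spike-in features.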