Chromatin Accessibility and Expression Analysis from Reads
The Chromatin Accessibility and Expression Analysis from Reads workflow takes 10x Multiome ATAC and gene expression (GEX) reads as input. It starts by annotating the reads with cell barcodes and UMIs, followed by trimming.
During annotation, the barcodes from the ATAC reads are translated to barcodes that match the cell barcodes of the GEX reads. The ATAC reads are then mapped and, if there are multiple samples, combined before a single Peak Count Matrix is produced.
The GEX reads are mapped to create one or more Expression Matrices. These are then subject to quality control, normalization, clustering, cell type prediction and velocity analysis.
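The barcode translation step described above can be pictured as a one-to-one lookup from ATAC barcodes to GEX barcodes. The following is a minimal sketch of that idea only, not the workflow's actual implementation. It assumes that the two barcode lists are paired by position; the function names and the toy barcodes are hypothetical.

```python
from typing import Dict, List, Optional


def build_translation(atac_barcodes: List[str], gex_barcodes: List[str]) -> Dict[str, str]:
    """Pair barcodes by position: entry i of the ATAC list maps to entry i of the GEX list.

    Assumption for this sketch: both lists have the same length and order.
    """
    if len(atac_barcodes) != len(gex_barcodes):
        raise ValueError("barcode lists must have the same length")
    return dict(zip(atac_barcodes, gex_barcodes))


def translate_barcode(atac_barcode: str, table: Dict[str, str]) -> Optional[str]:
    """Return the matching GEX barcode, or None if the ATAC barcode is unknown."""
    return table.get(atac_barcode)


# Toy, made-up barcodes used purely for illustration.
toy_atac = ["ACGTACGTACGTACGT", "TTTTCCCCGGGGAAAA"]
toy_gex = ["GGGGAAAACCCCTTTT", "CATGCATGCATGCATG"]

table = build_translation(toy_atac, toy_gex)
translated = translate_barcode("TTTTCCCCGGGGAAAA", table)
unknown = translate_barcode("AAAAAAAAAAAAAAAA", table)
print(translated, unknown)
```

After translation, an ATAC read and a GEX read from the same cell carry the same barcode, which is what allows the two data types to be analyzed together per cell.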
The workflow allows for a combined analysis of multiple samples to produce:
- a single, multi-sample Peak Count Matrix;
- a single, multi-sample, normalized Expression Matrix;
- a Dimensionality Reduction Plot associated with the automated clusters, predicted cell types and additional cell annotations;
- a Heat Map, a Dot Plot, and a Violin Plot with the predicted cell types as cell groups;
- a Cell Abundance Heat Map with the automated clusters and predicted cell types as cell groups;
- a Phase Portrait Plot with per-gene information on the velocity dynamics;
- Velocity Genes Scores, allowing identification of the velocity genes driving the dynamics.
The workflow can be found in the Template Workflows section here:
Single Cell Workflows | From Reads | Chromatin Accessibility and Expression Analysis from Reads
If you are connected to a CLC Server via your Workbench, you will be asked where you would like to run the analysis. We recommend that you run the analysis on a CLC Server when possible.
As input, you can either select one or more Sequence lists, or choose Select files for import and select the FASTQ files to import.
The workflow offers a number of options described below. Note that not all parameters can be configured. Open parameters indicate places where customization may be necessary for different samples, but default settings are suitable in most cases.
The workflow can be run using Single Cell hg38 (Ensembl) or Single Cell Mouse (Ensembl) reference data sets (see The Reference Data Manager).
Note: Reference data elements cannot be configured during workflow execution. If elements other than those provided in the default reference data sets are needed, a custom reference data set can be used, see http://resources.qiagenbioinformatics.com/manuals/clcgenomicsworkbench/current/index.php?manual=Custom_Sets.html. When creating custom reference data sets, the chosen gene track needs to match the gene annotations used for training the provided Cell Type Classifier (see Features used for training and prediction).
The workflow allows the analysis of multiple samples. Metadata must always be specified to configure which inputs belong to which sample. In addition to grouping the inputs, the metadata is converted to cell annotations and can be used for coloring the cells in the Dimensionality Reduction Plot.
For more details on configuring workflow execution with metadata, see http://resources.qiagenbioinformatics.com/manuals/clcgenomicsworkbench/current/index.php?manual=Running_workflows_in_batch_mode.html. Make sure to inspect the batch overview to check that the analysis will be performed correctly.
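To illustrate how metadata can both group inputs into samples and supply per-sample annotations, here is a small sketch outside the Workbench. The column names (file, sample, library, treatment), the file names, and the helper functions are hypothetical and are not the format the Workbench requires. The sketch only shows the kind of consistency check that inspecting the batch overview serves.

```python
import csv
import io
from collections import defaultdict

# Hypothetical metadata table: one row per input file.
METADATA_CSV = """file,sample,library,treatment
s1_atac.fastq.gz,S1,ATAC,control
s1_gex.fastq.gz,S1,GEX,control
s2_atac.fastq.gz,S2,ATAC,treated
s2_gex.fastq.gz,S2,GEX,treated
"""


def group_by_sample(rows, key="sample"):
    """Collect the input files belonging to each sample (one batch unit per sample)."""
    units = defaultdict(list)
    for row in rows:
        units[row[key]].append(row["file"])
    return dict(units)


def sample_annotations(rows, key="sample", skip=("file", "library")):
    """Derive per-sample annotations and fail if rows of one sample disagree."""
    annotations = {}
    for row in rows:
        values = {k: v for k, v in row.items() if k != key and k not in skip}
        previous = annotations.setdefault(row[key], values)
        if previous != values:
            raise ValueError(f"inconsistent metadata for sample {row[key]}")
    return annotations


rows = list(csv.DictReader(io.StringIO(METADATA_CSV.strip())))
units = group_by_sample(rows)
annotations = sample_annotations(rows)

for sample, files in units.items():
    print(sample, files, annotations[sample])
```

In this sketch each sample ends up with one ATAC and one GEX input, and the treatment column becomes an annotation that every cell of that sample would inherit.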
Subsections
- Configuring the batch units for Chromatin Accessibility and Expression Analysis from Reads
- Output from Chromatin Accessibility and Expression Analysis from Reads
- Importing reads