Cell format in importers
All importers in the CLC Single Cell Analysis Module import information about cells. Cells are identified by a combination of their barcode, e.g. "AAGCT", and their sample name.
Importers share the following common options:
- Cell format. Specify how to extract the barcode and, optionally, the sample name from the name of the cell. A combination of placeholders and text can be used (figures 4.9 and 4.10) to extract the part(s) of the name corresponding to the barcode/sample. Hover the mouse cursor over the field to see a tooltip with several examples. Simultaneously pressing Shift and F1 displays all available placeholders.
By default, the entire name of the cell is used as barcode and the sample name is based on the name of the imported file.
Figure 4.9: Extracting the barcode and sample from the name of the cell. 'Preview cells' shows how the cell names are parsed into sample and barcode.
Figure 4.10: The top panel shows the results of importing a matrix file with Cell format = {barcode}. After import, the sample name is the name of the file that was imported, and the barcode is the entire name of the cell. In the bottom panel, Cell format = SRX41800{sample}_filter.{barcode}. Here, the sample name and the barcode are extracted from the name of the cell, and other parts of the name are discarded.
- Sample (Optional). This can be used for specifying a custom sample name. It should only be used when the file contains just one sample. It overrides the default sample name.
This is relevant, for example, when jointly analyzing an imported Expression Matrix and Peak Count Matrix, where the cells in both must have the same sample name.
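The way a Cell format splits a cell name can be illustrated with a minimal Python sketch. This is not the module's implementation. It assumes only the two placeholders named above, {barcode} and {sample}, treats all other text in the format as literal, and uses the file name without its extension as the default sample name. The function names, the `file_name` parameter, and the precedence between a {sample} placeholder and the Sample option are all assumptions made for illustration.

```python
import re
from pathlib import Path

# Only the two placeholders mentioned in the text are handled here.
PLACEHOLDER = re.compile(r"\{(barcode|sample)\}")


def compile_cell_format(cell_format):
    """Translate a format such as 'SRX41800{sample}_filter.{barcode}'
    into a regular expression with one named group per placeholder."""
    parts = []
    pos = 0
    for m in PLACEHOLDER.finditer(cell_format):
        parts.append(re.escape(cell_format[pos:m.start()]))  # literal text
        parts.append(f"(?P<{m.group(1)}>.+?)")                # placeholder
        pos = m.end()
    parts.append(re.escape(cell_format[pos:]))
    return re.compile("".join(parts))


def parse_cell(name, cell_format="{barcode}", file_name="matrix.csv", sample=None):
    """Return (sample, barcode) for one cell name.

    Assumed precedence for the sample name (hypothetical):
    {sample} placeholder, then the Sample option, then the file name.
    """
    match = compile_cell_format(cell_format).fullmatch(name)
    if match is None or "barcode" not in match.groupdict():
        raise ValueError(f"{name!r} does not match cell format {cell_format!r}")
    groups = match.groupdict()
    sample_name = groups.get("sample") or sample or Path(file_name).stem
    return sample_name, groups["barcode"]


# Bottom panel of figure 4.10: sample and barcode both come from the cell name.
print(parse_cell("SRX4180012_filter.AAGCT", "SRX41800{sample}_filter.{barcode}"))
# Default format: whole name is the barcode, sample comes from the file name.
print(parse_cell("AAGCT", file_name="pbmc.mtx"))
```

In this sketch, the text surrounding the placeholders must match the cell name exactly, which is why a format that does not fit the input fails instead of returning a partial result.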
Importers contain a Preview cells section displaying how the cell names are parsed into sample and barcode (figure 4.9). The preview helps ensure that the configured Cell format matches the input (figure 4.11). If the configured format is invalid, the preview may fail to determine the sample and/or barcode (figure 4.12).
Figure 4.11: The preview helps identify that the sample and barcode have been swapped.
Figure 4.12: The preview helps identify that the configured cell format is not valid. The tooltip contains a detailed error message.
The preview can be disabled if not needed. This is useful for large input files, where generating the preview may take some time.
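The role of the preview can be sketched in Python as a table built from the first few cell names. This is an illustration only: `preview_cells` and its `limit` parameter are hypothetical, the regular expression stands in for a configured Cell format, and how many cells the actual preview inspects is not stated here.

```python
import re
from itertools import islice


def preview_cells(names, pattern, limit=5):
    """Return (name, sample, barcode) rows for the first `limit` names.

    Names that do not match the pattern get None for sample and barcode,
    mirroring a preview that cannot determine them.
    """
    rows = []
    for name in islice(names, limit):
        m = pattern.fullmatch(name)
        if m is None:
            rows.append((name, None, None))
        else:
            rows.append((name, m.group("sample"), m.group("barcode")))
    return rows


names = ["S1_AAGCT", "S1_CCGTA", "no-separator"]

# Sample and barcode are deliberately swapped, as in figure 4.11:
# the preview rows show a barcode-like value in the sample column.
swapped = re.compile(r"(?P<barcode>.+?)_(?P<sample>.+)")
for row in preview_cells(names, swapped):
    print(row)
```

Inspecting a handful of rows like this is enough to spot a swapped or non-matching format before the full import is run.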