Importing data on the fly
There are two ways that raw data, i.e. data not already imported into the CLC software, can be imported as part of a workflow run:
- Include an Input element in the workflow design, and when launching the workflow, choose the option "Select files for import". This is referred to as "on-the-fly" import.
- Include a dedicated Import element in the workflow design.
Examples of these 2 design types are shown in figure 11.34. How these translate when launching the workflow is shown in figure 11.35. The relative merits of each option are outlined in table 11.1. For most uses, on-the-fly import will be the most versatile option.
Figure 11.34: Raw data can be imported as part of a workflow run in 2 ways. Left: Include an Input element and use on-the-fly import. Right: Use a specific Import element. Here, the Illumina import element was included.
Figure 11.35: Top: Launching a workflow with an Input element and choosing to select files to import on-the-fly. Bottom: Launching a workflow with a dedicated import element, in this case, an Illumina import element.
Notes:
- Modified copies of imported data elements can be saved, no matter which of the import routes is chosen. For example, an Output element attached to a downstream Trim Reads element would result in Sequence Lists containing trimmed reads being saved.
- The use of Iterate elements to run all or part of a workflow in batches is described in Running part of a workflow multiple times.
- Paired read handling for workflows launched in batch mode, or workflows containing Iterate elements, depends on how batch units are defined. When batch units are based on metadata, or on data organization with each batch unit in a separate folder, paired reads are handled as described in the documentation for the NGS importer tools (Import high-throughput sequencing data). When batch units are based on data organization and all files are in the same folder, each file is treated as a separate batch unit, irrespective of whether the Paired option is checked. The batch unit overview, described in Running workflows in batch mode, shows how inputs are grouped into batch units when launching a workflow.
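The grouping rule for batch units based on data organization can be illustrated with a short sketch. This is not CLC software code or its API: the function name `batch_units_by_organization` and the example file paths are hypothetical, and the sketch only mirrors the rule stated in the note above (one batch unit per folder when files are spread across folders, one batch unit per file when all files share a single folder).

```python
from collections import defaultdict
from pathlib import PurePosixPath


def batch_units_by_organization(file_paths):
    """Group input files into batch units by folder layout.

    Hypothetical illustration only; it mimics the documented rule,
    not the actual logic used by the CLC software.
    """
    by_folder = defaultdict(list)
    for p in file_paths:
        path = PurePosixPath(p)
        by_folder[str(path.parent)].append(path.name)

    if len(by_folder) == 1:
        # All files are in the same folder: each file becomes its own
        # batch unit, so R1/R2 files are not grouped together here.
        (names,) = by_folder.values()
        return {name: [name] for name in sorted(names)}

    # Files are spread across folders: each folder is one batch unit,
    # and paired reads within a folder can be handled by the importer.
    return {folder: sorted(names) for folder, names in by_folder.items()}


same_folder = batch_units_by_organization(
    ["run/a_R1.fastq", "run/a_R2.fastq"]
)
separate_folders = batch_units_by_organization(
    ["s1/a_R1.fastq", "s1/a_R2.fastq", "s2/b_R1.fastq", "s2/b_R2.fastq"]
)
print(same_folder)
print(separate_folders)
```

In this sketch, the two files in `run/` end up in two separate batch units, whereas the files in `s1/` and `s2/` form one batch unit per folder. Checking the batch unit overview when launching the workflow remains the reliable way to confirm the actual grouping.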