Running part of a workflow multiple times
To run part of a workflow multiple times, place that part between an Iterate element and a Collect and Distribute element. For example, the workflow in figure 11.46 runs the RNA-Seq Analysis tool once per sample and creates a single combined report for the whole batch. All parts of the workflow downstream of an Iterate element run multiple times, once per batch unit, until a Collect and Distribute element is encountered. The parts of the workflow downstream of the Collect and Distribute element run only once.
Figure 11.46: In this workflow, the RNA-Seq Analysis tool is run once per sample, and a single combined report is created for the whole batch of samples.
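The iterate-then-collect pattern described above can be pictured with a short Python sketch. This is only a conceptual illustration, not how the Workbench is implemented: the function names run_workflow, analyze_sample and combine_reports are hypothetical stand-ins for the Iterate section, the RNA-Seq Analysis tool and the Combine Reports tool.

```python
def run_workflow(samples, per_sample_step, combine_step):
    """Conceptual model: iterate over samples, then collect the results once."""
    # Section between Iterate and Collect and Distribute: runs once per sample.
    per_sample_results = [per_sample_step(sample) for sample in samples]
    # Section downstream of Collect and Distribute: runs once for the whole batch.
    return combine_step(per_sample_results)


def analyze_sample(sample):
    # Hypothetical stand-in for a per-sample analysis such as RNA-Seq Analysis.
    return f"report({sample})"


def combine_reports(reports):
    # Hypothetical stand-in for a tool that merges all per-sample reports.
    return "combined[" + ", ".join(reports) + "]"


combined = run_workflow(["sample_1", "sample_2", "sample_3"],
                        analyze_sample, combine_reports)
```

In this sketch the per-sample step is called three times and the combining step once, mirroring the behavior of the workflow in figure 11.46.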
When running on the server, the iterating parts of the workflow will run as separate jobs. This requires that the server is configured accordingly and that job nodes or grid nodes are available.
When running the workflow, you can use a metadata table to specify the iteration (batch) units. After selecting all samples in the first step (with the "batch" checkbox NOT selected), you can specify which column in the metadata table defines how the samples should be grouped. In the example in figure 11.47, grouping by the column "ID" will result in the RNA-Seq Analysis tool being run 8 times, once for each sample. Selecting the "Gender" column instead would result in the RNA-Seq Analysis tool being run 2 times, once for each value in that column (male and female). In both cases, the Combine Reports tool will run only once for all samples.
Figure 11.47: With the current selection in the wizard, the RNA-Seq Analysis tool will run 8 times, once for each sample. The Combine Reports tool will only run once for all samples.
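The effect of choosing a grouping column can be sketched in Python as follows. This is an illustration under stated assumptions rather than the Workbench's own logic: the sample names S1 to S8 and the assignment of genders are made up for the example, and batch_units is a hypothetical helper. Only the column names "ID" and "Gender" and the resulting counts (8 and 2) come from the example above.

```python
from collections import defaultdict


def batch_units(metadata_rows, column, id_column="ID"):
    """Group sample IDs by the value found in the chosen metadata column."""
    units = defaultdict(list)
    for row in metadata_rows:
        units[row[column]].append(row[id_column])
    return dict(units)


# Hypothetical metadata table with 8 samples and two Gender values.
metadata = [
    {"ID": f"S{i}", "Gender": "male" if i % 2 else "female"}
    for i in range(1, 9)
]

units_by_id = batch_units(metadata, "ID")          # one unit per sample
units_by_gender = batch_units(metadata, "Gender")  # one unit per gender

print(len(units_by_id), len(units_by_gender))
```

Each key in the returned dictionary corresponds to one batch unit, that is, one run of the iterating part of the workflow. Grouping by "ID" gives 8 units of one sample each, while grouping by "Gender" gives 2 units of four samples each.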
It is possible to rename the Iterate element, which will also change the text displayed in the wizard when the workflow is run. To do this, right-click the Iterate element, and choose Rename. The new name of the element will now be displayed in the wizard (figure 11.48).
Figure 11.48: The Iterate element can be renamed to change the text that is displayed in the wizard when running the workflow.
Control flow elements are described in more detail in Workflow control flow elements.
Defining batch units when using Demultiplex Reads
When Demultiplex Reads is used in a workflow, the Group Sequences output channel is connected to an Iterate element (figure 11.49). Batch units for the iterating section of the workflow that follows (Trim Reads in figure 11.49) can be defined based on information in the barcode file imported to Demultiplex Reads, rather than a separate metadata table. For this, the CSV or Excel format file needs to contain a column with the barcodes, a column with the sample names, and further columns containing the relevant metadata.
Figure 11.49: The Group Sequences output channel of Demultiplex Reads connects to an Iterate element. The data to be analyzed together in the next workflow section, i.e. the batch units, can be defined using information from the barcode file or from a separate metadata table.
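As an illustration of the barcode file layout described above, the following Python sketch reads a small CSV with a barcode column, a sample name column and one further metadata column, then groups samples by that metadata column. The column headers ("Barcode", "Sample", "Tissue"), the barcode sequences and the sample names are all hypothetical, chosen only for this example. The exact headers expected by Demultiplex Reads are not specified here.

```python
import csv
import io
from collections import defaultdict

# Hypothetical barcode file content: barcode, sample name, one metadata column.
BARCODE_CSV = """Barcode,Sample,Tissue
ACGT,sample_1,liver
TGCA,sample_2,liver
GATC,sample_3,kidney
"""


def batch_units_from_barcode_file(csv_text, group_column, sample_column="Sample"):
    """Group sample names by a metadata column taken from the barcode file."""
    units = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        units[row[group_column]].append(row[sample_column])
    return dict(units)


units_by_tissue = batch_units_from_barcode_file(BARCODE_CSV, "Tissue")
units_by_sample = batch_units_from_barcode_file(BARCODE_CSV, "Sample")
```

Grouping by the hypothetical "Tissue" column would give two batch units (liver and kidney), whereas grouping by the sample name column gives one batch unit per sample.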