Iterate
Adding an Iterate element to the top of a workflow causes the workflow branch below it to be run once for each batch unit. A given batch unit iteration runs until the end of the workflow, or until a Collect and Distribute element is encountered.
If desired, sections downstream of a Collect and Distribute element can also be run once per batch unit by adding another Iterate element, just after the Collect and Distribute element. The composition of batch units at this point in the workflow can be adjusted as desired.
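The control flow described above can be pictured with a small conceptual model. This is only an illustrative sketch, not the software's implementation or API: the functions `iterate` and `collect_and_distribute`, the sample names, and the use of `sum` as a stand-in analysis step are all hypothetical.

```python
from collections import defaultdict


def iterate(batch_units, step):
    """Run `step` once per batch unit, like the branch below an Iterate element."""
    return {name: step(items) for name, items in batch_units.items()}


def collect_and_distribute(results, regroup):
    """Gather per-unit results, then form new batch units using `regroup`."""
    new_units = defaultdict(list)
    for name, result in results.items():
        new_units[regroup(name)].append(result)
    return dict(new_units)


# Hypothetical data: four samples, each its own batch unit at the top of the workflow.
samples = {
    "s1_tumor": [3, 4],
    "s1_normal": [1],
    "s2_tumor": [5, 5],
    "s2_normal": [2, 2],
}

# First Iterate: the branch runs once per sample.
per_sample = iterate(samples, sum)

# Collect and Distribute: results are gathered and regrouped, here per subject.
by_subject = collect_and_distribute(per_sample, lambda name: name.split("_")[0])

# Second Iterate: the downstream branch runs once per subject.
per_subject = iterate(by_subject, sum)
```

In this model, the composition of batch units changes between the two iterations: the first runs per sample, the second per subject.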
Note: A single Iterate element at the top of a workflow without any downstream Collect and Distribute element is equivalent to checking the Batch button when launching the workflow. In such cases, a simpler workflow design without control flow elements, launched with the Batch button checked, is preferable.
For workflows with a single workflow input element (green box) that contain a single Iterate element, batch units can be defined based on the location of the input data or based on information in a metadata table. For any workflow containing more than one Iterate element, or where there is a single Iterate element and the Batch button is checked when starting the workflow, batch units must be defined using information in a metadata table that the input data has been associated with.
For information on creating metadata tables and associating data with them, see Metadata.
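The two ways of defining batch units can be compared with a short sketch. It is a conceptual illustration only and assumes hypothetical file paths, a hypothetical metadata layout (a list of rows with a `file` key and a `patient` column), and made-up function names; it does not reflect how the software stores or reads metadata.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def units_by_location(paths):
    """Group inputs by their containing folder (batch units from data location)."""
    units = defaultdict(list)
    for path in paths:
        units[str(PurePosixPath(path).parent)].append(path)
    return dict(units)


def units_by_metadata(paths, metadata_rows, column):
    """Group inputs by the value in one metadata column (batch units from metadata)."""
    value_of = {row["file"]: row[column] for row in metadata_rows}
    units = defaultdict(list)
    for path in paths:
        units[value_of[path]].append(path)
    return dict(units)


# Hypothetical inputs and metadata table.
paths = ["runA/r1.fastq", "runA/r2.fastq", "runB/r3.fastq"]
metadata = [
    {"file": "runA/r1.fastq", "patient": "P1"},
    {"file": "runA/r2.fastq", "patient": "P2"},
    {"file": "runB/r3.fastq", "patient": "P1"},
]

location_units = units_by_location(paths)
metadata_units = units_by_metadata(paths, metadata, "patient")
```

Here the same three inputs form the units `runA` and `runB` when grouped by location, but `P1` and `P2` when grouped by the `patient` column, which shows why the choice of definition matters.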
Configuring an Iterate element
The configuration options available for an Iterate element are shown in figure 11.44. They are:
- Number of coupled inputs: The number of separate inputs for each given iteration. These inputs are "coupled" in the sense that, for a given iteration, particular inputs are used together. For example, this is useful when sets of sample reads should be mapped in the same way, but each set should be mapped to a particular reference.
- Error handling: Specify what should happen if an error is encountered. The default is that the workflow should stop on any error. The alternative is to continue running the workflow if possible, potentially allowing later batches to be analyzed even if an earlier one fails.
- Metadata table columns: If the workflow is always run with metadata tables that have the same column structure, then it can be useful to set the value of the column titles here, so the workflow wizard will preselect them. The column titles must be specified in the same order as shown in the workflow wizard when running the workflow. Locking this parameter to a fixed value (i.e. not blank) will require the definition of batch units to be based on metadata. Locking this parameter to a blank value requires the definition of batch units to be based on the organization of input data (and not metadata).
- Primary input: If the number of coupled inputs is two or more, then the primary input (used to define the batch units) can be configured using this parameter.
Figure 11.44: The number of coupled inputs in this simple example is 2, allowing each set of sample reads to be mapped to a particular reference, rather than using the same reference for all iterations.
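The interaction between coupled inputs and the error handling option can be sketched as follows. This is a conceptual model under stated assumptions, not the software's behavior in detail: `run_coupled`, `fake_map`, the `stop_on_error` flag, and the sample and reference names are all hypothetical. The reads play the role of the primary input that defines the batch units, and the reference with the same unit name is the coupled input.

```python
def run_coupled(reads_by_unit, reference_by_unit, map_reads, stop_on_error=True):
    """Run `map_reads` once per batch unit, pairing each unit's reads with its own reference.

    The primary input (reads) defines the batch units. With stop_on_error=False,
    a failing unit is recorded and the remaining units are still processed.
    """
    results, failures = {}, {}
    for unit, reads in reads_by_unit.items():
        try:
            results[unit] = map_reads(reads, reference_by_unit[unit])
        except Exception as exc:
            if stop_on_error:
                raise  # default: stop the whole run on the first error
            failures[unit] = str(exc)
    return results, failures


def fake_map(reads, reference):
    """Hypothetical stand-in for a read mapping step."""
    if not reads:
        raise ValueError("no reads")
    return f"{len(reads)} reads -> {reference}"


# Hypothetical coupled inputs: each sample has its own reference.
reads = {"sampleA": ["r1", "r2"], "sampleB": [], "sampleC": ["r3"]}
refs = {"sampleA": "refA", "sampleB": "refB", "sampleC": "refC"}

results, failures = run_coupled(reads, refs, fake_map, stop_on_error=False)
```

With `stop_on_error=False`, the failure for `sampleB` is recorded and `sampleC` is still analyzed; with the default `stop_on_error=True`, the same inputs raise an error as soon as `sampleB` is reached.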