Batch launching workflows with multiple inputs

This section describes how to launch workflows with multiple inputs in batch mode, where every input element changes from batch to batch. This launch mechanism is not intended for workflows with multiple input elements where one of the input elements remains the same in all batches, such as workflows that compare several tissues to a single control tissue. At the moment, such workflows can only be launched in batch mode if the common item is saved under a different name once for each batch.

For workflows with multiple inputs where the inputs all need to change for each batch run, information specifying the grouping of the data elements and what role each element plays in a given analysis needs to be imported into the system from an Excel spreadsheet.

The requirements for launching such workflows in batch mode are:

Figure 8.8 shows an example of a spreadsheet used in the case of tissue comparison. Note that the "grouping" and "type" are context specific and depend on the analysis performed, i.e., on the tools that constitute the workflow.

Image spreadsheet_batching
Figure 8.8: Example of a spreadsheet needed to run a workflow in batch mode, where the workflow is intended to compare two tissue samples.

To launch a workflow with multiple input elements in batch mode, right click on the name of the workflow in the Toolbox and select the option "Run in Batch Mode..." (figure 8.9).

Image launch_multiinput_workflow_gx
Figure 8.9: The option to "Run in Batch Mode..." appears in the context menu when you right click on the name of an installed workflow that has multiple input elements in the Toolbox panel.

A wizard opens and in the first window, you need to specify:

An icon with a green check mark (Image green_checkmark_16_h_p) appears in the table preview next to rows where a data element corresponding to a row of the Excel sheet was uniquely identified. If no match can be made to a given row of the Excel sheet, a question mark (Image question_mark_16_d_p) is displayed.
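The matching status described above can be illustrated with a minimal sketch. It is not the Workbench's own matching logic: the function name `match_status`, the example element names, and the rule that a row matches a data element when the names are exactly equal are all assumptions made for illustration.

```python
def match_status(row_name, element_names):
    """Classify one spreadsheet row by how many data elements share its name.

    Assumption: a row matches an element when the names are exactly equal.
    """
    hits = [name for name in element_names if name == row_name]
    if len(hits) == 1:
        return "matched"    # uniquely identified: green check mark in the preview
    if not hits:
        return "unmatched"  # no match: question mark in the preview
    return "ambiguous"      # several candidates: a case the text above does not cover


# Hypothetical data elements in the Workbench and hypothetical spreadsheet rows.
elements = ["P1_tumor", "P1_normal", "P2_tumor"]
rows = ["P1_tumor", "P1_normal", "P2_tumor", "P2_normal"]

statuses = {row: match_status(row, elements) for row in rows}

# Overall status: True only when every row was uniquely matched.
all_matched = all(status == "matched" for status in statuses.values())
```

In this sketch the row "P2_normal" has no corresponding data element, so it would be the one flagged for attention before the wizard can proceed with a complete set of inputs.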

Graphical symbols are also presented in the header of the first column of the preview pane to give information about the overall status of the matching of rows in the Excel sheet with data elements in the Workbench:

In figure 8.12, the green check mark symbol in the header of the first column in the preview pane indicates that data elements were identified for each of the rows in the Excel sheet. You can click on the button labeled "Next".

Image select_data_gx1
Figure 8.12: View of the Data Association table after all samples were successfully associated.

The next wizard window is called "Select grouping parameters and analysis inputs".

In the same window, you will need to further specify the inputs of the workflow. What needs to be specified here depends on the workflow itself.

An example is shown in figure 8.13. Group by is set to a column specifying "Patient ID", because each workflow run will analyze a sample pair. Type is set to the "Type" column, because the workflow inputs are either tumor or normal tissues. The sample columns section maps data elements to the different workflow inputs: in this case, "Tissue sample" is set to "Tumor" and "Control tissue sample" to "Normal".

Image specify_group_and_type_gx
Figure 8.13: Grouping samples.
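The grouping in this example can be pictured as follows. This is only a sketch of the idea, not how the Workbench implements it: the "Name" column, the sample names, and the helper `build_batch_units` are hypothetical, while "Patient ID", "Type", "Tumor", "Normal", "Tissue sample" and "Control tissue sample" are taken from the example above.

```python
from collections import defaultdict

# Hypothetical spreadsheet rows; the "Name" column and its values are assumptions.
rows = [
    {"Name": "P1_tumor", "Patient ID": "P1", "Type": "Tumor"},
    {"Name": "P1_normal", "Patient ID": "P1", "Type": "Normal"},
    {"Name": "P2_tumor", "Patient ID": "P2", "Type": "Tumor"},
    {"Name": "P2_normal", "Patient ID": "P2", "Type": "Normal"},
]

# Which workflow input each value of the Type column feeds, as in figure 8.13.
input_for_type = {"Tumor": "Tissue sample", "Control": None}
input_for_type = {"Tumor": "Tissue sample", "Normal": "Control tissue sample"}


def build_batch_units(rows, group_by, type_column, input_for_type):
    """Group rows into batch units and assign each row to a workflow input."""
    units = defaultdict(dict)
    for row in rows:
        workflow_input = input_for_type[row[type_column]]
        units[row[group_by]][workflow_input] = row["Name"]
    return dict(units)


# One batch unit per patient, each holding a tumor/normal sample pair.
batch_units = build_batch_units(rows, "Patient ID", "Type", input_for_type)
```

Each key of `batch_units` corresponds to one workflow run, which is why the column chosen for Group by must have the same value for all samples that belong together.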

The rest of the wizard depends on the tools included in the workflow. Fill in the appropriate information and save the results of your workflow in a folder, which you can create in the Navigation Area.

As in regular batch mode, you can use the progress bar to see how the job is progressing (figure 8.14): a process called "Batch Process" indicates how many batches have been completed, while the processes listed above it show the analysis progress of individual batch units.

Image checkprocesses
Figure 8.14: Check on the progress of your workflow being run in batch mode using the Processes tab below the Toolbox.