Selective download of results

Individual results can be downloaded using the Workflow Result Metadata table, which is generated for each batch. This is particularly useful when the amount of data generated by a workflow execution is so large that it is not practical to download all the results at once.

In the Cloud Job Search table, download the Workflow Result Metadata table for selected jobs by clicking on the Download Metadata button.

In a Workflow Result Metadata table, each workflow output is shown in a separate row. The path to the output in AWS S3 is shown in the "External path" column (figure 6.5). When one or more rows in the table are selected, the Find Associated Data button is enabled. Clicking on this button opens a Metadata Elements table, which lists the results and the path to each data element. Paths starting with s3:// refer to data stored in AWS S3. When rows with such paths are selected, the Download button at the bottom of the Metadata Elements table is enabled.

Image result_not_yet_downloaded
Figure 6.5: The "External path" column in the Workflow Result Metadata table shows the path to the data in AWS S3. After the "Find Associated Data" button is clicked, a Metadata Elements table is opened, listing the associated data elements and their locations.
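The rule above, that only paths starting with s3:// refer to data in AWS S3, can be illustrated outside the Workbench with a small script. This is a minimal sketch using only the Python standard library, not part of the CLC software. The function name split_s3_path and the example paths are hypothetical, and the sketch assumes the "External path" values have been copied out of the table as plain text.

```python
from urllib.parse import urlparse


def split_s3_path(path):
    """Return (bucket, key) for an s3:// path, or None for any other path."""
    parsed = urlparse(path)
    if parsed.scheme != "s3" or not parsed.netloc:
        return None
    # The bucket is the host part; the key is the rest, without the leading slash.
    return parsed.netloc, parsed.path.lstrip("/")


# Hypothetical values, made up for illustration only.
paths = [
    "s3://example-bucket/results/batch1/variants.clc",
    "CLC_Data/results/variants",
]

# Keep only the entries that point to data stored in AWS S3.
downloadable = [p for p in paths if split_s3_path(p) is not None]
```

In this sketch, only the first entry would correspond to a row for which the Download button is enabled.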

To download particular data elements from the cloud, select the desired rows in the Metadata Elements table, and click on the Download button. Downloaded data elements are placed in the subfolder (if any) that was specified by the workflow design for the given output. New folders are created as required. We therefore recommend selecting the same destination folder each time you download results from the same workflow. This ensures that the outputs are organized in the same folder structure that they would have had if the workflow had been executed by the CLC Workbench.
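The effect of choosing the same destination folder for every download can be sketched as follows. This is an illustration of the folder layout described above, not the Workbench's actual implementation. The function local_target and all folder and element names are hypothetical.

```python
from pathlib import PurePosixPath


def local_target(destination, output_subfolder, element_name):
    """Build the location of a downloaded element under the chosen folder."""
    target = PurePosixPath(destination)
    if output_subfolder:  # the workflow design may not specify a subfolder
        target = target / output_subfolder
    return target / element_name


# Two separate downloads into the same destination folder (hypothetical names).
a = local_target("Project/Run1", "Variants", "sample1_variants")
b = local_target("Project/Run1", "Reports", "sample1_report")

# An output with no subfolder specified lands directly in the destination folder.
c = local_target("Project/Run1", "", "sample1_summary")
```

Because both downloads share the destination Project/Run1, the Variants and Reports subfolders end up side by side, as they would after a single local run. Choosing a different destination for the second download would split the outputs across two folder trees.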

When a data element is downloaded, it is automatically associated with the relevant row of the Workflow Result Metadata table, and can be found by clicking on the Find Associated Data button again (figure 6.6). Downloading a result does not remove the external references from the Workflow Result Metadata table, so results can be downloaded this way any number of times.

Image result_already_downloaded
Figure 6.6: When the results have been downloaded, they are automatically associated with the relevant row of the Workflow Result Metadata table, and can be located by pressing the "Find Associated Data" button again.

A connection to the CLC Genomics Cloud Engine is required to find the Workflow Result Metadata table and download it using the Cloud Job Search functionality. However, only the AWS S3 connection is required for downloading results via the Workflow Result Metadata table.

Thus, by saving the Workflow Result Metadata table, you can download the results from AWS S3 at a later point, as long as the data is still available in AWS S3. For this, you need neither a connection to the CLC Genomics Cloud Engine nor the Cloud Job Search.

Note: Exported data cannot be downloaded selectively. To download data exported by the workflow, you must use the Download All Results button in the Cloud Job Search tool. See section Downloading all results for further details.