Submitting analyses to AWS from a Workbench

Once the prerequisites described in the overview section are in place, the CLC Genomics Cloud option is enabled in the launch wizards of CLC workflows and tools (see footnote 3.1). The available AWS Batch queues are listed in a drop-down menu (figure 3.4). After selecting a queue, hover the mouse cursor over its name to see information about the settings for that queue (figure 3.5).

Figure 3.4: To submit jobs to the cloud, select the CLC Genomics Cloud option in the launch wizard and then select from the list of available AWS Batch queues in the drop-down menu.

Figure 3.5: When the mouse cursor hovers over the selected queue, details about the computational resources associated with that queue are shown in a tooltip.
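
The queue details can also be inspected on the AWS side, outside the Workbench. The following is a minimal sketch, assuming Python with the boto3 library and AWS credentials that permit the AWS Batch DescribeJobQueues action. The function name summarize_job_queues is hypothetical and not part of any CLC software, and the fields returned here are not necessarily the ones shown in the Workbench tooltip. The account may also contain AWS Batch queues that are unrelated to CLC Genomics Cloud.

```python
def summarize_job_queues(batch_client):
    """Return name, state, status and priority for each AWS Batch job queue.

    batch_client is assumed to be a boto3 Batch client, for example
    boto3.client("batch"). This helper is illustrative only.
    """
    summaries = []
    kwargs = {}
    while True:
        # DescribeJobQueues is paginated; follow nextToken until exhausted.
        response = batch_client.describe_job_queues(**kwargs)
        for queue in response.get("jobQueues", []):
            summaries.append({
                "name": queue["jobQueueName"],
                "state": queue.get("state"),      # e.g. ENABLED or DISABLED
                "status": queue.get("status"),    # e.g. VALID
                "priority": queue.get("priority"),
            })
        token = response.get("nextToken")
        if not token:
            break
        kwargs["nextToken"] = token
    return summaries
```

With boto3 installed and credentials configured, this could be called as summarize_job_queues(boto3.client("batch")).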

For results, select a folder in AWS S3 that has not previously been used to store results. When a non-empty folder is selected, a warning is shown. Existing files with the same names as files created by the analysis will be overwritten. This includes the workflow-result.json file, which contains a list of the results generated by a job and links to the Execution Log (described in Monitoring and reviewing CLC cloud jobs). If the workflow-result.json file is overwritten, functionality in CLC software for finding the results of the earlier job will not work, even though the relevant files may still be present and locatable on AWS S3, for example using the Remote Files tab.
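
Whether a candidate results folder already contains files can also be checked before launching. The sketch below assumes Python with the boto3 library and permission to list the bucket. The function name results_folder_in_use and the bucket and folder names in the usage note are hypothetical. This is an independent check against S3, not the mechanism the Workbench uses to produce its warning.

```python
def results_folder_in_use(s3_client, bucket, prefix):
    """Return True if any object other than a folder marker exists under prefix.

    s3_client is assumed to be a boto3 S3 client, for example
    boto3.client("s3"). This helper is illustrative only.
    """
    # Add a trailing slash so "results/run1" does not match "results/run10/...".
    if prefix and not prefix.endswith("/"):
        prefix += "/"
    # Two keys are enough: one may be the zero-byte folder marker object
    # (key equal to the prefix) that some tools create for empty folders.
    response = s3_client.list_objects_v2(Bucket=bucket, Prefix=prefix, MaxKeys=2)
    keys = [obj["Key"] for obj in response.get("Contents", [])]
    return any(key != prefix for key in keys)
```

For example, results_folder_in_use(boto3.client("s3"), "my-bucket", "results/run1") returning True would indicate that the folder holds files that a new job could overwrite, potentially including workflow-result.json.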

Ensure that all processes for cloud jobs have proceeded beyond transferring job information and data before closing the Workbench. If the Workbench is closed earlier, the job will fail. When the input data is already on AWS S3, this phase is very quick, as only information about the job is sent to AWS. If the input data is not on AWS S3, the first activity after launching the analysis is to transfer the data to AWS S3, so this phase can take some time. See General information about input data when launching cloud analyses for further details.

Software version used in the cloud

When submitting jobs from a CLC Workbench, the version of the CLC software used in the cloud is the same as the CLC Workbench version. When submitting jobs that include tools provided by a plugin, the plugin version used in the cloud job is the same as the version installed on the CLC Workbench used to submit the job. See the appendix for further details.



Footnotes

3.1: Only tools that can be used within workflows can be submitted to run on a CLC Genomics Cloud. Tools that are not workflow-enabled cannot be run on the cloud.