Submitting workflows to the cloud using the CLC Server Command Line Tools
Submitting workflows to the cloud
Workflows installed on a CLC Genomics Server can be submitted to run in the cloud using the CLC Server Command Line Tools. Information about this is provided in the CLC Server Command Line Tools manual.
The manual page linked above focuses on submitting jobs for execution on a CLC Genomics Server, but most of the information also applies to workflows submitted for execution on the cloud. The remainder of this section provides cloud-specific details.
Specifying a cloud preset
The cloud preset to use must be specified when submitting workflows to run on the cloud. The name of the cloud preset should be supplied as the value for the -L option.
To see the list of cloud presets available, run the clcserver command with no arguments or with an incomplete set of arguments.
Information about configuring cloud presets on a CLC Server is in Configuring cloud presets.
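As an illustration of where the -L option fits, the following is a minimal sketch in Python that assembles a clcserver argument list for a cloud submission. Only -A and -L are taken from this page. The connection options (-S, -P, -U, -W) are assumed to be those described in the general CLC Server Command Line Tools manual, and the host, port, credentials, workflow name, and preset name are hypothetical placeholders. Workflow-specific parameters are omitted.

```python
import shlex


def build_cloud_submit_command(server, port, user, password, workflow, cloud_preset):
    """Assemble a clcserver argument list for submitting a workflow to the cloud.

    The connection flags (-S, -P, -U, -W) are assumptions based on the general
    CLC Server Command Line Tools manual; -A and -L come from this page.
    """
    if not cloud_preset:
        # A cloud preset must always be named for cloud submissions.
        raise ValueError("a cloud preset name is required for cloud submissions")
    return [
        "clcserver",
        "-S", server,        # server host (placeholder value below)
        "-P", str(port),     # server port (placeholder value below)
        "-U", user,
        "-W", password,
        "-A", workflow,      # name of the installed workflow (hypothetical)
        "-L", cloud_preset,  # cloud preset to run the workflow with
    ]


cmd = build_cloud_submit_command(
    "server.example.org", 7777, "alice", "secret",
    "my_workflow", "my-cloud-preset",
)
# Print a shell-quoted form of the command for inspection.
print(shlex.join(cmd))
```

Building the arguments as a list, rather than a single string, keeps each value intact if it is later passed to a process launcher, and the check on the preset name catches a missing -L value before anything is submitted.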
Specifying input data for analyses
CLC format data in CLC Server File System Locations, or in remote locations accessible via an http, https, or S3 URL, can be provided as input to workflows [4.1].
Data in other formats can be supplied as input by using on-the-fly import. For example, using on-the-fly import, FASTQ sequence files would be imported as the first step in the workflow, avoiding the need for running a specific import command before running the workflow.
See also the general information about input data for cloud analyses and the information about providing input data to analyses using the CLC Server Command Line Tools.
Specifying where results should be saved
Results generated using workflows run on the cloud are saved to AWS S3. The location to save results to is specified using an S3 URL as the value for the relevant parameter.
See also the general information about specifying where results should be saved when using the CLC Server Command Line Tools.
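Because results from cloud runs are written to AWS S3, it can help to check that a destination is a well-formed S3 URL before submitting a job. The following is a minimal sketch using only the Python standard library. The function name, bucket, and prefix are hypothetical, and the check covers only the URL's shape, not whether the bucket exists or is writable.

```python
from urllib.parse import urlparse


def parse_s3_url(url):
    """Split an S3 URL into (bucket, key prefix), rejecting anything else."""
    parts = urlparse(url)
    if parts.scheme != "s3" or not parts.netloc:
        raise ValueError(f"not an S3 URL: {url!r}")
    # The bucket is the network location; the key prefix is the path
    # without its leading slash.
    return parts.netloc, parts.path.lstrip("/")


# Hypothetical destination for workflow results.
bucket, prefix = parse_s3_url("s3://example-bucket/results/run-01/")
print(bucket, prefix)
```

A local path or an http URL passed to this function raises ValueError, so a mistyped destination is caught before the job is queued.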
Accessing AWS CloudWatch logs via the command line
Running the CLC Server Command Line Tools with -A cgc_read_aws_logs retrieves the AWS CloudWatch logs for jobs run on a CLC Genomics Cloud.
The messages returned from jobs run on the cloud include the information needed to access the AWS CloudWatch log for that job. The AWS CloudWatch information retrieved is the same as that returned when the "Execution Log" is opened in the CLC Workbench, either via the Processes tab or via options under the Remote Files tab, as described in Accessing results from the Processes tab.
Footnotes
- 4.1: Support for http, https and S3 URLs for directly specifying files in remote locations, i.e. not needing to specify the location as a clccloudfile, was introduced in CLC Genomics Server 23.0.3 and Cloud Server Plugin 23.0.1, as was the ability to supply CLC format files in remote locations directly as input, without needing to use on-the-fly import.
