Submitting workflows using the CLC Server Command Line Tools

Submitting workflows to the cloud

Workflows installed on a CLC Genomics Server can be submitted to run in the cloud using the CLC Server Command Line Tools. Launching workflows is described at http://resources.qiagenbioinformatics.com/manuals/clcservercommandlinetools/current/index.php?manual=Executing_workflows.html

The cloud preset to use is specified using the -L option. To see the list of cloud presets available, run the clcserver command with no arguments or with an incomplete set of arguments, as described on http://resources.qiagenbioinformatics.com/manuals/clcservercommandlinetools/current/index.php?manual=Basic_usage.html.

Specifying inputs and the destination for results

Inputs for analyses to be run on the cloud can be in CLC Server file locations or in remote locations. Input data files in an AWS S3 bucket are specified using an s3 URL (see footnote 4.1).

Output destinations should be specified using an s3 URL.

Data in CLC Server file locations and CLC format files in remote locations can be specified directly as input. That is, on-the-fly import is not needed in these cases (see footnote 4.2).
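As an illustration of how these pieces fit together, the following is a minimal sketch of a submission that passes a CLC format file in S3 directly as input and writes results to an s3 URL. It rests on assumptions not stated on this page: that -D is the option taking the results destination, and that the workflow has an input option, here given the made-up name --example-input-option. The real input option names are workflow-specific and are listed when clcserver is run with only -A <workflow name>. The server, user, token, workflow, preset, and bucket names are all hypothetical. The script prints the command by default rather than running it.

```shell
#!/bin/sh
# Dry run by default: the command is printed, not executed.
# Run with DRY_RUN= (empty) to actually call clcserver.
DRY_RUN="${DRY_RUN-echo}"

submit_cloud_workflow() {
    # $1 server, $2 username, $3 password or token, $4 workflow name,
    # $5 cloud preset (-L), $6 workflow input option (workflow-specific),
    # $7 input s3 URL, $8 output s3 URL.
    # ASSUMPTION: -D is the option that takes the results destination.
    $DRY_RUN clcserver -S "$1" -U "$2" -W "$3" \
        -A "$4" -L "$5" \
        "$6" "$7" \
        -D "$8"
}

# All values below are hypothetical placeholders.
submit_cloud_workflow server.example.org alice "example-token" \
    "example_workflow" "example-preset" \
    --example-input-option "s3://example-bucket/reads.clc" \
    "s3://example-bucket/results/"
```

Printing the assembled command first makes it easy to check the preset name and URLs before anything is submitted to the cloud.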

Data in supported formats other than CLC format require the use of on-the-fly import. The precise command options depend on the importer to be used. To reveal the importers available for use with a given workflow, run the clcserver command with the name of the workflow specified using the -A option. For example,

clcserver -S <server> -U <username> -W <password or token> -A <workflow name>

One of the options returned will usually relate to on-the-fly import. For example, for some template workflows this is the --reads-import-command option, which is shown with a list of the available importers. One such importer is ngs_import_illumina. Running an incomplete command of the following form reveals the on-the-fly import options relevant for that importer:

clcserver -S <server> -U <username> -W <password or token> \
-A <workflow name> --reads-import-command ngs_import_illumina

Accessing AWS CloudWatch logs via the command line

The CLC Server Command Line Tools command -A cgc_read_aws_logs supports the retrieval of AWS CloudWatch logs for jobs run on a CLC Genomics Cloud.

The messages returned from jobs run on the cloud include the information needed to access the AWS CloudWatch log for that job. The AWS CloudWatch information retrieved is the same as that returned when the "Execution Log" is opened in the CLC Workbench, either via the Processes tab or via options under the Remote Files tab, as described in Accessing results from the Processes tab.
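The options accepted by cgc_read_aws_logs are not listed on this page. Assuming it behaves like the workflows described earlier, where an incomplete command prints the available options, a sketch of how to discover them could look like this. The server, user, and token values are hypothetical, and the script prints the command by default rather than running it.

```shell
#!/bin/sh
# Dry run by default: the command is printed, not executed.
# Run with DRY_RUN= (empty) to actually call clcserver.
DRY_RUN="${DRY_RUN-echo}"

show_log_tool_options() {
    # $1 server, $2 username, $3 password or token.
    # No tool-specific options are given, so clcserver is expected
    # (an assumption) to print the options cgc_read_aws_logs accepts.
    $DRY_RUN clcserver -S "$1" -U "$2" -W "$3" -A cgc_read_aws_logs
}

# All values below are hypothetical placeholders.
show_log_tool_options server.example.org alice "example-token"
```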



Footnotes

4.1
Support for URLs to remote locations without specifying the location as a clccloudfile was introduced in CLC Genomics Server 23.0.3.
4.2
Support for supplying CLC format files in remote locations directly as input to workflows to be executed on the cloud was introduced in CLC Genomics Server 23.0.3 and Cloud Server Plugin 23.0.1. With earlier versions, CLC format data in remote locations, like data in other formats, had to be specified using on-the-fly import.