Setting up AWS resources

We recommend using the cgc-standard.json CloudFormation template, described below, to set up the standard resources needed for a CLC Genomics Cloud setup. This template exposes a limited set of configuration options. Additional AWS Batch queues can be added afterwards using a separate CloudFormation template, which provides more configuration options (see Adding more AWS Batch queues for CLC jobs).

The sections below describe the resources created by the cgc-standard.json CloudFormation template, give detailed instructions for using the template, and provide information about the AWS IAM users it creates.

Creating stacks using the AWS CloudFormation console is described in the AWS documentation at https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/cfn-console-create-stack.html.
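For scripted setups, stack creation can also be sketched with the AWS SDK for Python (boto3) instead of the console. This is a minimal sketch under stated assumptions, not the documented procedure: the stack name and the parameter name EnvironmentId in the usage note are hypothetical, and the need for the CAPABILITY_NAMED_IAM capability is assumed from the fact that the template creates named IAM users. Check the template's actual parameters before use.

```python
import json


def build_create_stack_args(stack_name, template_body, parameters):
    """Build keyword arguments for the CloudFormation create_stack call."""
    json.loads(template_body)  # fail early if the template is not valid JSON
    return {
        "StackName": stack_name,
        # Templates above 51,200 bytes must be passed via TemplateURL instead.
        "TemplateBody": template_body,
        "Parameters": [
            {"ParameterKey": key, "ParameterValue": str(value)}
            for key, value in sorted(parameters.items())
        ],
        # Assumption: needed because the template creates named IAM users.
        "Capabilities": ["CAPABILITY_NAMED_IAM"],
    }


def create_stack(region, **stack_args):
    """Submit the stack and block until creation completes (requires boto3)."""
    import boto3  # imported here so the helper above works without boto3

    client = boto3.client("cloudformation", region_name=region)
    stack_id = client.create_stack(**stack_args)["StackId"]
    client.get_waiter("stack_create_complete").wait(StackName=stack_id)
    return stack_id
```

A call might look like create_stack("eu-west-1", **build_create_stack_args("cgc-demo", template_text, {"EnvironmentId": "demo"})), where template_text holds the contents of cgc-standard.json and every other value is a placeholder.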

Overview of standard CLC Genomics Cloud infrastructure on AWS

The cgc-standard.json CloudFormation template defines the resources needed for a CLC Genomics Cloud, which include:

Creating CLC Genomics Cloud infrastructure on AWS

To set up the standard infrastructure on AWS for handling CLC jobs:

AWS IAM users

When stack creation is complete, go to the Outputs tab of the main stack to find the credentials for the AWS IAM users created (figure 2.1).

Figure 2.1: The credentials for the AWS IAM users created using the CloudFormation template are listed under the Outputs tab for the stack.
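The values shown under the Outputs tab can also be read programmatically. The following is a minimal sketch using boto3's describe_stacks call; the stack name, region, and output key names are placeholders, since the actual key names depend on the template. The returned values are credentials, so avoid printing or logging them.

```python
def outputs_to_dict(stack_description):
    """Map OutputKey to OutputValue for one entry of describe_stacks()['Stacks']."""
    return {
        item["OutputKey"]: item["OutputValue"]
        for item in stack_description.get("Outputs", [])
    }


def fetch_stack_outputs(stack_name, region):
    """Fetch the outputs of the named stack (requires boto3 and AWS credentials)."""
    import boto3  # imported here so outputs_to_dict works without boto3

    client = boto3.client("cloudformation", region_name=region)
    response = client.describe_stacks(StackName=stack_name)
    return outputs_to_dict(response["Stacks"][0])
```

For example, fetch_stack_outputs("cgc-demo", "eu-west-1") would return a dictionary keyed by whatever output names the main stack defines.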

AWS Connections using the "SubmitterUser" (CgcSubmitterUser-<EnvironmentId>) credentials allow CLC analyses to be submitted to an AWS Batch queue for analysis. This user also has full access to AWS S3, read access to CloudWatch logs, and can list CloudFormation resources.

AWS Connections using the "BrowserUser" (CgcBrowserUser-<EnvironmentId>) credentials support listing S3 buckets and accessing bucket contents. Jobs cannot be submitted to run on AWS using these credentials.

AWS IAM user credentials are entered in AWS Connections in CLC Workbenches or the CLC Server, described in Configuring the AWS connection in the Workbench and Configuring the AWS Connection in the CLC Server, respectively.

The full policy for each user can be viewed in the Identity and Access Management (IAM) area of the AWS Console.

AWS S3 buckets for storing input data and results

One or more AWS S3 buckets must be created for holding input data and results. These buckets must be created in the same AWS account and region as the AWS Batch queues. Please refer to the AWS documentation for details:

https://docs.aws.amazon.com/AmazonS3/latest/userguide/create-bucket-overview.html.

Note: The prefix cgc-system- should be considered reserved. Buckets with names starting with this prefix will not be visible in CLC Workbenches.
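A candidate bucket name can be checked before the bucket is created. The sketch below is illustrative only: the function name is hypothetical, and it covers the reserved cgc-system- prefix plus a subset of the general AWS S3 naming rules (length, allowed characters, first and last character, no adjacent periods), not the full set.

```python
import re

RESERVED_PREFIX = "cgc-system-"
# 3 to 63 characters: lowercase letters, digits, periods and hyphens,
# starting and ending with a letter or digit.
_NAME_PATTERN = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")


def check_bucket_name(name):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    if name.startswith(RESERVED_PREFIX):
        problems.append("uses the reserved prefix " + RESERVED_PREFIX)
    if not _NAME_PATTERN.fullmatch(name):
        problems.append("violates S3 length or character rules")
    if ".." in name:
        problems.append("contains adjacent periods")
    return problems
```

A name such as my-lab-clc-data passes these checks, while cgc-system-data is flagged because of the reserved prefix.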