To import data from and export data to Amazon S3, you need:
- An Amazon Web Services (AWS) account.
- An AWS Identity and Access Management (IAM) user with credentials for programmatic access (access key ID, secret access key). Such a user can be set up by the AWS account administrator in the AWS Management Console:
Services IAM Users Add User
Programmatic access should be granted to the user when setting the AWS access type (figure 1.1). The IAM user only needs access to the S3 locations where GCE, the CLC Genomics Server, and the CLC Workbench should be able to access files. The IAM user does not need API access to anything else in AWS.
- The IAM User should be granted the following permission policy: AmazonS3FullAccess. A more limiting policy can be used if there is a need to restrict access to specific buckets in Amazon S3.
Figure 1.1: Enabling programmatic access for an AWS IAM user can be done in the AWS Management Console. Note: This screenshot is for illustration purposes only. Details may be changed at any time by AWS.
To import data from Illumina BaseSpace, you need an Illumina BaseSpace account.
To run workflows in the cloud, and to find jobs or to download results from jobs that have been run in the cloud, you also need:
- The CLC Genomics Cloud Engine deployed on your AWS account. Read more about the product and request a quote here: https://digitalinsights.qiagen.com/products-overview/discovery-insights-portfolio/enterprise-ngs-solutions/qiagen-clc-genomics-cloud-engine/.
- At least one Amazon S3 bucket set up to be used for caching uploaded files. The CLC Genomics Cloud Engine includes a tool to set up such a bucket.
- The following permission policy granted to the IAM User logging in from the CLC Workbench: AWSResourceGroupsReadOnlyAccess. This policy grants the user the rights to automatically find the cache bucket in Amazon S3.