Prerequisites
Requirements for importing data from Illumina BaseSpace
To import data from Illumina BaseSpace, you need an Illumina BaseSpace account.
Requirements for importing data from and exporting data to Amazon S3
To import data from and export data to Amazon S3, you need:
- An Amazon Web Services (AWS) account.
- An AWS Identity and Access Management (IAM) user with credentials for programmatic access (access key ID, secret access key). Such a user can be set up by the AWS account administrator in the AWS Management Console:
Services | IAM | Users | Add User
Programmatic access should be granted to the user when setting the AWS access type (figure 1.1). The IAM user only needs access to the S3 locations that the CLC Genomics Cloud Engine (GCE), the CLC Genomics Server, and the CLC Workbench should be able to access. The IAM user does not need API access to anything else in AWS.
- The following permission policy granted to the IAM user: AmazonS3FullAccess. A more restrictive policy can be used if access needs to be limited to specific buckets in Amazon S3.
Figure 1.1: Enabling programmatic access for an AWS IAM user can be done in the AWS Management Console. Note: This screenshot is for illustration purposes only. Details may be changed at any time by AWS.
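As an illustration of the more limiting policy mentioned above, the following Python sketch builds an IAM policy document that restricts access to a named set of buckets. The bucket names and the function name restricted_s3_policy are hypothetical, and the chosen S3 actions are an assumption: this section does not state which actions the CLC software requires, so the list may need to be extended (for example, to allow deleting objects).

```python
import json


def restricted_s3_policy(bucket_names):
    """Build an IAM policy document limiting S3 access to the given buckets.

    The set of actions below is an assumption for illustration only; it is
    not a list confirmed by the CLC documentation.
    """
    bucket_arns = [f"arn:aws:s3:::{name}" for name in bucket_names]
    object_arns = [f"{arn}/*" for arn in bucket_arns]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Bucket-level actions apply to the bucket ARN itself.
                "Sid": "ListSelectedBuckets",
                "Effect": "Allow",
                "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
                "Resource": bucket_arns,
            },
            {
                # Object-level actions apply to the keys inside each bucket.
                "Sid": "ReadWriteObjectsInSelectedBuckets",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": object_arns,
            },
        ],
    }


if __name__ == "__main__":
    # Hypothetical bucket names; replace with the buckets to be accessed.
    policy = restricted_s3_policy(["example-import-bucket", "example-export-bucket"])
    print(json.dumps(policy, indent=2))
```

The printed JSON could then be attached to the IAM user as a custom policy by the AWS account administrator, in place of AmazonS3FullAccess.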
Requirements for running workflows on a CLC Genomics Cloud Engine
To run workflows on a CLC Genomics Cloud Engine (GCE), you need:
- The CLC Genomics Cloud Engine deployed on your AWS account. Read more about the product and request a quote here: https://digitalinsights.qiagen.com/products-overview/discovery-insights-portfolio/enterprise-ngs-solutions/qiagen-clc-genomics-cloud-engine/.
- At least one Amazon S3 bucket set up to be used for caching uploaded files. The CLC Genomics Cloud Engine includes a tool to set up such a bucket.
- The following permission policy granted to the IAM user logging in from the CLC Workbench: AWSResourceGroupsReadOnlyAccess. This policy allows the cache bucket in Amazon S3 to be found automatically.