Reference data for analyses on the cloud

When running workflows configured to use QIAGEN reference data, no transfer of that data will take place, and there is no need to explicitly copy it to your own AWS S3 account. Under certain conditions, no local copy of this reference data is needed either. This is described in more detail later in this section.

In cases where other reference data will be used, the following steps are recommended:

Image wf-input-element-wizard-diffs
Figure 6.2: Using the workflow on the left, data for the References field can only be selected from a CLC Location. The workflow on the right has an Input element connected to the References input channel. Using that workflow, files can be selected from an AWS S3 bucket, or from other accessible places, including CLC Locations.

Image ref-data-workflow-input-element
Figure 6.3: An Input element is connected to the References input channel. On-the-fly import of a CLC format file has been specified by selecting the "Select files for import" option and "CLC Format" from the drop-down list of formats. The relevant AWS Connection has been selected from the drop-down list of locations. A CLC file was then selected for use as the reference genome.
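
With a workflow like the one on the right in figure 6.2, reference files can be selected from an AWS S3 bucket. For reference data not provided by QIAGEN, one way to make a file available there is to upload it beforehand, outside the CLC software. The following is a minimal sketch in Python, assuming the boto3 library is installed and AWS credentials are already configured. The bucket name, the "references" key prefix, the file name and both helper functions are hypothetical and are not part of the CLC software.

```python
from pathlib import PurePosixPath


def s3_key_for(local_path: str, prefix: str = "references") -> str:
    """Build an S3 object key for a local file.

    The "references/<file name>" layout is an assumption made for this
    sketch, not a convention required by the CLC software.
    """
    # Normalize Windows separators so only the file name is kept.
    name = PurePosixPath(local_path.replace("\\", "/")).name
    clean_prefix = prefix.strip("/")
    return f"{clean_prefix}/{name}" if clean_prefix else name


def upload_reference(local_path: str, bucket: str, prefix: str = "references") -> str:
    """Upload one reference file to S3 and return its s3:// URI."""
    # Imported here so that s3_key_for can be used without boto3 installed.
    import boto3

    key = s3_key_for(local_path, prefix)
    boto3.client("s3").upload_file(local_path, bucket, key)
    return f"s3://{bucket}/{key}"


# Example call (hypothetical bucket and file names):
# upload_reference("/data/refs/my_genome.clc", "my-analysis-bucket")
```

Once uploaded, the file can be selected through the relevant AWS Connection when launching the workflow, as illustrated in figure 6.3.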

QIAGEN reference data in workflows

QIAGEN reference data elements (footnote 6.1) are already present in AWS S3 (figure 6.3) and thus do not need to be uploaded to your own S3 bucket when running workflows configured to refer to them. This includes many of the template workflows delivered with the software, and thus also workflows derived from those.

When the conditions listed below are met, there is also no need for a local copy of QIAGEN reference data when launching workflows to run on the cloud.

Note: To view track lists that refer to reference data elements, those elements must be available locally.

Image ref-data-launch-on-wb-vs-cloud
Figure 6.4: QIAGEN Reference Sets do not need to be available locally (left-hand image) for them to be available when launching a workflow to run on the cloud (right-hand image).



Footnotes

6.1
"QIAGEN reference data" refers to data sets or data elements provided by QIAGEN, available under the QIAGEN references tab of the Reference Data Manager. Template workflows, delivered with the software, are commonly configured to use QIAGEN reference data.