Working with AWS S3 using the Remote Files tab
Browsing data in AWS S3 using the Remote Files tab
Available AWS S3 buckets are listed under the Remote Files tab, to the right of the Navigation Area tab (figure 3.6).
If AWS S3 buckets are available from more than one source, those sources are listed in a drop-down menu. Possible sources are AWS Connections and public S3 buckets, configured either in the CLC Workbench or in a CLC Server that the Workbench is connected to. A server icon is shown beside sources available via a CLC Server (figure 3.7).
Figure 3.6: AWS S3 buckets you have access to are available under the Remote Files tab.
Figure 3.7: This Workbench has a valid AWS Connection and is connected to a CLC Server with a valid AWS Connection. At least one public S3 bucket has been configured in the Workbench and in the CLC Server. The S3 buckets available from the selected source are listed in the Remote Files tab.
Uploading data to AWS S3 using the Remote Files tab
To upload data from your Navigation Area to AWS S3, right-click on a folder in the Remote Files tab and choose the option Upload to this folder (figure 3.8).
Figure 3.8: To upload data from your Navigation Area to AWS S3, open the Remote Files tab, right-click on the folder you wish to upload data to, and select the option "Upload to This Folder".
Uploads are sequential: items are uploaded one after another. Information about the data upload is shown in the Processes tab, at the bottom left of the Workbench (figure 3.9).
Figure 3.9: After choosing to upload data to S3, the progress of the upload is reported in the Processes tab.
Downloading results via the Remote Files tab
Right-click on one or more elements under the Remote Files tab to download them from AWS S3 (figure 3.10). When only files are selected, the options Download and Open and Download and Save are available. When a folder is included in the selection, only the Download and Save option is available.
Figure 3.10: Select folders and/or files in the Remote Files tab and right-click to reveal options for downloading that data.
After choosing to download the data, you can choose to download it to a CLC File Location or to another area on your system.
To see all the outputs of a particular job that has been run on a CLC Genomics Cloud setup, double-click on a workflow-result.json file. All the results can then be downloaded and opened from that list, or individual elements can be selected and downloaded. The Execution Log is also available from this list (see figure 3.11).
Figure 3.11: Double-click on a workflow-result.json file in the Remote Files tab in the Workbench to reveal a list of all results from a job run in the cloud, as well as the Execution Log. All items can be downloaded and opened from this menu, or individual items can be selected and downloaded.
If the Navigation Tools plugin is installed, bookmarks for items in the Remote Files tab can be made. Double-clicking on bookmarks for individual results files or folders opens the bookmarked items, as standard. Double-clicking on a bookmark for a workflow-result.json file reveals the same list of options as double-clicking on the workflow-result.json file in the Remote Files tab directly. Further details about bookmarks are provided in the Navigation Tools manual at https://resources.qiagenbioinformatics.com/manuals/navigationtools/current/index.php?manual=Introduction.html.
Note: AWS charges for downloading data from S3. By default, when the download size exceeds 1 GB, you are prompted for confirmation that you wish to proceed. The size required to trigger this warning can be changed in the General section of the Workbench Preferences (figure 3.12).
Figure 3.12: The download size above which a cost warning dialog is shown can be adjusted in the Workbench Preferences. The default value is 1000 MB.
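The threshold check described in the note above can be sketched as simple arithmetic. This is only an illustration, not the Workbench's own logic: the function names are hypothetical, and it assumes that 1 MB means 1,000,000 bytes (consistent with 1 GB being described as 1000 MB) and that the warning is triggered only when the total size is strictly greater than the threshold.

```python
# Illustrative sketch only; names and the byte convention are assumptions.
BYTES_PER_MB = 1_000_000       # assumption: decimal megabytes
DEFAULT_THRESHOLD_MB = 1000    # default value stated in the Workbench Preferences


def total_size_mb(sizes_in_bytes):
    """Return the combined size of the selected items in MB."""
    return sum(sizes_in_bytes) / BYTES_PER_MB


def would_prompt(sizes_in_bytes, threshold_mb=DEFAULT_THRESHOLD_MB):
    """Return True if the combined download size exceeds the threshold."""
    return total_size_mb(sizes_in_bytes) > threshold_mb


if __name__ == "__main__":
    selection = [600_000_000, 450_000_000]  # two hypothetical files, in bytes
    print(total_size_mb(selection))
    print(would_prompt(selection))
    print(would_prompt(selection, threshold_mb=2000))
```

With the default threshold, the two hypothetical files above (1050 MB in total) would lead to a confirmation prompt; raising the threshold to 2000 MB in the Preferences would not.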
Downloading data using a URL
Pasting a URL into the Workbench Navigation Area imports the files, or a folder containing files, using Standard Import. File types are detected automatically. CLC format files are imported as they are, and files in other formats recognized by Standard Import are converted to CLC format on import, so that they too can be viewed and worked with in the Workbench. Further details about Standard Import are provided at: https://resources.qiagenbioinformatics.com/manuals/clcgenomicsworkbench/current/index.php?manual=Standard_import.html.