Configure a containerized MAFFT external application

In this section, we focus on how to determine the command to be configured for an individual containerized external application, and then step through configuring an external application for the Docker container containing MAFFT.

We assume that the containerized execution environment has already been enabled and configured, as described in Configuring the containerized execution environment.

Determining the containerized external application command line

When configuring containerized external applications, only the part of the docker command specific to this particular application is included in the Command line field of the external application configuration. Parts of the command general to running all containerized external applications are specified in the Containerized execution environment area, as described in Configuring the containerized execution environment.

The full command we want to run takes this form:

docker run -v <import-export-dir>:<mount-point-in-image> \
<image-identifier> <command-to-run-from-image> <input-data>

With default settings for the containerized execution environment, the first part,

docker run -v <import-export-dir>:<mount-point-in-image>

is already defined for every containerized external application.

Thus, only the rest of the command needs to be specified when configuring the individual external application, i.e.

<image-identifier> <command-to-run-from-image> <input-data>

For our MAFFT example, that command would have the form:

example/mafft:0.0.1 /mafft-linux64/mafft.bat <input-data>

How to specify this command, including how to specify that users should be able to select the data to be aligned when they launch the tool, is described in detail below.
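The split between the two parts of the command can be sketched in shell as follows. This is only an illustration: the host directory "/data/clc-import-export" and the mount point "/data" are hypothetical values, and the CLC Server assembles the real command itself. The script prints the command rather than running it, so Docker is not needed to try it.

```shell
#!/bin/sh
# Illustrative only: the CLC Server assembles the real command itself.
# Both paths below are hypothetical placeholders.
import_export_dir="/data/clc-import-export"   # hypothetical host directory
mount_point="/data"                           # hypothetical mount point in the image

# Part defined once, in the Containerized execution environment area.
environment_part="docker run -v ${import_export_dir}:${mount_point}"

# Part entered in the Command line field of this external application.
# <input-data> is deliberately left as a placeholder here.
application_part="example/mafft:0.0.1 /mafft-linux64/mafft.bat <input-data>"

# The full command is the two parts joined by a single space.
full_command="${environment_part} ${application_part}"
printf '%s\n' "$full_command"
```

Only the value assigned to application_part corresponds to what is entered for the individual external application.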

Settings under the External command tab

To create a new external application, click on the New configuration... button under the External Applications Configurations section in the CLC Server web administrative interface.

We focus on settings under the "External command" and "Stream handling" tabs in the window that appears.

Configure the following under the External command tab:

Image external_app_mafft_wb_select_input
Figure 12.20: The external application name is presented as the name of the corresponding workflow element. Output channels and elements connected to them also reflect names specified in the external application configuration.

When a parameter is written into the Command line field in curly brackets, that parameter will be listed in the General configuration area below. There, the type of value expected for this parameter is configured.

We want users to select data to be aligned from a CLC location, and as MAFFT accepts data in FASTA format, we need to specify that the selected data should be exported in that format, so:

The exported file will be placed in the shared working directory configured for the containerized execution environment of the CLC Server, and the path to this exported file will be substituted into the docker command at runtime.
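The substitution step can be pictured with the small shell sketch below. It assumes the parameter was written in the Command line field as {Sequences to align}, and it uses a made-up path for the exported file. The actual substitution is performed by the CLC Server, not by a script like this.

```shell
#!/bin/sh
# Sketch of the runtime substitution; the CLC Server does this internally.
# Assumption: the parameter appears in the Command line field in curly brackets.
command_line='example/mafft:0.0.1 /mafft-linux64/mafft.bat {Sequences to align}'

# Hypothetical path to the exported FASTA file in the shared working directory.
exported_file='/data/job-1234/input.fa'

# Replace the curly-bracket parameter with the path to the exported file.
# '|' is used as the sed delimiter because the path contains '/'.
resolved=$(printf '%s\n' "$command_line" | sed "s|{Sequences to align}|${exported_file}|")
printf '%s\n' "$resolved"
```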

We now specify that the external application is containerized:

Figure 12.21 shows the configuration window after the above steps have been taken.

Image external_app_mafft_command_definition
Figure 12.21: Defining the MAFFT containerized external application: setting up the command.

Settings under the Stream handling tab

The MAFFT application produces its output on standard out and standard error, so we configure the result handling under the Stream handling tab, as shown in figure 12.22.

The file names specified for collecting information sent to standard out and standard error are used for the raw files that capture the contents of these streams, and their base names are seen by end users, as illustrated in figures 12.23 and 12.24 for this external application.
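For readers less familiar with stream capture, the effect is comparable to shell redirection, as sketched below. The function fake_mafft is a hypothetical stand-in so that the example runs without Docker or MAFFT. The mapping of standard out to "MAFFT-alignment.fa" and standard error to "MAFFT-log.txt" is an assumption based on the file names shown in figure 12.24.

```shell
#!/bin/sh
# fake_mafft is a hypothetical stand-in for the containerized MAFFT command:
# it writes a result to standard out and a progress message to standard error.
fake_mafft() {
    printf '>seq1\nACGT-\n>seq2\nACGTA\n'
    printf 'progress: done\n' >&2
}

# Temporary directory standing in for the shared working directory.
workdir=$(mktemp -d)

# Capture each stream in its own file (assumed mapping of names to streams).
fake_mafft > "$workdir/MAFFT-alignment.fa" 2> "$workdir/MAFFT-log.txt"

cat "$workdir/MAFFT-alignment.fa"
```

Keeping the two streams in separate files means the alignment is not mixed with progress or error messages.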

Image external_app_mafft_stream_handling
Figure 12.22: Defining the MAFFT containerized external application: output handling.

Image external_app_mafft_draw_workflow
Figure 12.23: The names entered in the external application configuration are used as the name of the corresponding workflow element, the names of the output channels and input channels, and the default names of output elements attached to the output channels.

Image external_app_mafft_outputnames
Figure 12.24: The external application was configured to generate output files named "MAFFT-alignment.fa" and "MAFFT-log.txt". When these files are imported into the CLC Server, the names of the imported data elements reflect those file names.

Save the external application

Click on the Save button at the bottom of the editor. By default, the external application will now be available directly under the "External Applications" menu of any CLC Workbench logged into this CLC Server.

If you want the external application to be listed in a subfolder instead, go to the End user interface tab of the editor and specify a subfolder name there.

The configuration is now at a point where we can test this external application from a CLC Workbench or the CLC Server Command Line Tools. See Configuring external applications for further information about configuring external applications.

Launching the MAFFT external application from a CLC Workbench

From a CLC Workbench logged into the CLC Server, launch the external application directly by going to:

        Toolbox | External Applications | MAFFT

A wizard should appear. When you get to the step labeled "Enter parameters for the external application", you should see a field labeled "Sequences to align" (figure 12.25), reflecting the name given to that parameter in the external application command configuration.

Image external_app_mafft_wb_select_input
Figure 12.25: The Workbench user sees an option in the wizard named "Sequences to align". That label is taken from the external application command configuration.

To create a workflow that includes the MAFFT external application, open the Workflow Editor of a CLC Workbench logged into the CLC Server. The MAFFT element should be available to add from the dialog that opens when you click on the "Add Element" button.

If the external application configuration has been exported to S3 from a CLC Server with the Cloud Server Plugin installed, then a CLC Workbench with the Cloud Plugin can also be used. See Import and export of external application configurations for further information about this aspect.

Note: To run external applications on a CLC Genomics Cloud Engine, they must be included in a workflow, and that workflow must then be submitted. To submit jobs to a CLC Genomics Cloud Engine from the CLC Server, the server must have the Cloud Server Plugin installed. If the workflow is submitted from a CLC Workbench, the Workbench must have the Cloud Plugin installed.