External applications
Non-interactive command line applications and scripts can be integrated into the CLC environment by configuring them as external applications. Once configured and installed, external applications are available to launch via the graphical menu system of a CLC Workbench or via the CLC Server Command Line Tools. External applications can also be included in workflows, allowing complex, reproducible analyses involving CLC tools and other, non-CLC tools to be easily configured and launched by end users.
There are two types of external application:
- Standard external applications, where a command line application is run directly on the server system.
- Containerized external applications, where an application is executed from within a Docker container.
External applications can be run directly, or as part of a workflow, on a CLC Server. Note that containerized external applications are supported on Linux-based CLC Server setups only. Workflows containing external applications can also be run on a CLC Genomics Cloud setup on AWS. Documentation about this is provided in the CLC Cloud Module manual at https://resources.qiagenbioinformatics.com/manuals/clccloudmodule/current/index.php?manual=Integrating_third_party_tools_into_CLC_software.html.
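The difference between the two types can be illustrated with a minimal Python sketch of how a command line might be assembled in each case. This is not how the CLC Server builds its commands internally, which the text above does not describe. The function name `build_command`, the image name and the tool arguments are hypothetical; `docker run --rm IMAGE COMMAND` is the ordinary Docker CLI form.

```python
import platform


def build_command(tool_args, image=None, system=None):
    """Return the argv list for a standard or a containerized run.

    tool_args: command line of the underlying tool, e.g. ["mafft", "in.fa"]
    image:     hypothetical Docker image name; None means a standard run
    system:    operating system name; defaults to the current platform
    """
    system = system or platform.system()
    if image is None:
        # Standard external application: the tool is run directly on the host.
        return list(tool_args)
    if system != "Linux":
        # Containerized external applications are supported on Linux only.
        raise RuntimeError("containerized external applications require Linux")
    # Containerized external application: the same command is executed
    # inside a Docker container that is removed when the command exits.
    return ["docker", "run", "--rm", image, *tool_args]


if __name__ == "__main__":
    print(build_command(["mafft", "in.fa"]))
    print(build_command(["mafft", "in.fa"], image="example/mafft", system="Linux"))
```

In this sketch the only thing that changes between the two types is the prefix placed in front of the tool's own command line.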
Figure 16.1 shows an overview of the actions and data flow that occur when a standard external application is launched on the CLC Server via a CLC Workbench. The data flow can be summarized as follows:
- An end user specifies input data and sets values for parameters when launching the external application from a CLC Workbench or CLC Server Command Line Tools.
- The CLC Server exports the input data from the CLC Server to a temporary file.
- The CLC Server launches the underlying application (standard external applications) or the container containing the underlying application (containerized external applications), and provides the parameter values specified by the user and the input data from the temporary file. The underlying tool or container runs as a process separate from the CLC Server. That process is owned by the same user that owns the CLC Server process.
- When the command line application has finished, results are handled as specified in the external application configuration. Usually this involves results being imported into the CLC environment and saved to the location specified by the user when the external application was launched.
- Results imported into the CLC environment are available for viewing and further analysis, for example using a CLC Workbench.
Temporary files are created outside the CLC environment during the execution of third party (non-CLC) tools and are deleted after the process completes.
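The data flow described above can be sketched in a few lines of Python. This is an illustration of the pattern only (export to a temporary file, run a separate process, collect the results, clean up), not the CLC Server implementation. The names `run_external` and `STAND_IN_TOOL` are hypothetical, and the stand-in tool is a Python one-liner that upper-cases its input file.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for a third party tool: prints the upper-cased
# content of the file given as its last argument.
STAND_IN_TOOL = [
    sys.executable,
    "-c",
    "import sys; print(open(sys.argv[-1]).read().upper(), end='')",
]


def run_external(input_text, command, parameters):
    """Export input to a temporary file, run the tool, return its output."""
    with tempfile.TemporaryDirectory() as workdir:
        # Export the input data to a temporary file outside the data store.
        input_path = Path(workdir) / "input.txt"
        input_path.write_text(input_text)

        # Launch the tool as a separate process. It is owned by the same
        # operating system user that runs this script.
        completed = subprocess.run(
            [*command, *parameters, str(input_path)],
            capture_output=True,
            text=True,
            check=True,  # raise CalledProcessError on a non-zero exit code
        )

        # "Import" the results; here that just means keeping stdout.
        result = completed.stdout
    # The temporary directory and its files are deleted at this point.
    return result


if __name__ == "__main__":
    print(run_external("acgt\n", STAND_IN_TOOL, []))
```

Using a temporary directory as a context manager mirrors the behavior noted above: the files exist only while the external process runs.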
Figure 16.1: An overview of the flow of a standard external application when launched from a CLC Genomics Workbench.
Key information about external applications
- The CLC Server software should be run by an unprivileged user. Like other CLC Server tasks, external application processes are owned by the same logical user that owns the CLC Server process itself. If the system's root user is running the CLC Server process, then tasks run via the External Applications functionality will also be executed by the root user. This is usually undesirable.
- For a standard external application, the underlying tool must be available on all the systems where an external application can be run by the CLC Server.
- For containerized external applications, it is sufficient that the containerized execution environment is configured on all the systems where the external application can be run.
- A folder called "External Applications" appears in the CLC Workbench Tools menu when a CLC Workbench is connected to a CLC Server with available external applications.
- A CLC Workbench registers external application configurations when it logs in to the CLC Server, and that information is kept for the duration of the login session. To discover new external applications, or see updates to existing ones, from a client application, you must log out of and back into the CLC Server.
- Once a new, enabled external application is saved, it is available to client software the next time that software connects to the CLC Server. External applications can be disabled so that they are unavailable for use.
- By default, all users can use enabled external applications. Limiting access to certain groups is described in Controlling access to the server, server tasks and external data.
- The CLC Server administrator, or members of groups given the relevant permissions, can configure external applications, including controlling whether they are enabled or disabled. Granting permission to configure external applications to members of a non-administrative group is described in Web admin access.
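Two of the points above, running as an unprivileged user and having the underlying tool available on every system where the application can run, lend themselves to a simple check on each host. The following Python sketch is one possible way to do that, assuming the tool is found via the `PATH` of the user running the check. The function name `preflight` and the tool name `velveth` used in the example call are illustrative and are not part of the CLC software.

```python
import os
import shutil


def preflight(tool_name, euid=None, which=shutil.which):
    """Return a list of warnings about this host for a standard external application.

    tool_name: executable to look for, e.g. "velveth"
    euid:      effective user ID to check; defaults to the current process
    which:     lookup function, replaceable for testing
    """
    warnings = []

    if euid is None:
        # os.geteuid is only available on Unix-like systems.
        euid = os.geteuid() if hasattr(os, "geteuid") else -1
    if euid == 0:
        # Anything launched from a root-owned process also runs as root.
        warnings.append("process is running as root; use an unprivileged user")

    if which(tool_name) is None:
        warnings.append(f"{tool_name!r} was not found on PATH")

    return warnings


if __name__ == "__main__":
    for message in preflight("velveth"):
        print("WARNING:", message)
```

Such a check would need to be run as the same user that owns the CLC Server process, on each system where the external application can be executed.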
Subsections
- External application configurations
- Configuring external applications
- Using consistent reference data in external applications
- Import and export of external application configurations
- Updating external application configurations
- Example: Velvet (standard external application)
- Example: Bowtie (standard external application)
- Example: Kraken2 (containerized external application)
- Example: MAFFT (containerized external application)
- External applications in workflows
- Running external applications
- Troubleshooting external applications