External command
The 3 sections under the External command editor tab, as shown in figure 12.3, are:
- External application name
- The name that end users see via the CLC Workbench Toolbox menu or use to launch the external application using the CLC Server Command Line Tools.
- Command line
- The command line to be run, including the full path to the application and its parameters. Parameters and values that will not be configurable by end users are written as normal. Parameter values that should be substituted at run time are written within {curly brackets}. This includes parameters that should be configurable by the end user.
- General configuration
- Parameter values specified in {curly brackets} in the Command line field have a corresponding entry in the General configuration area. Here, those values are configured, including specifying their type, and for some types, configuring the values to be applied or offered to end users to select.
For illustration, a simple example using the cp (copy) command with 2 positional parameters is shown in figure 12.3. In the General configuration area, the Sequences to copy parameter is set to User-selected input data (CLC data location), meaning that the end user will specify the data to be copied from a CLC File Location. That data will be exported in fasta format. The Copied sequences parameter is set to type Output file from CL, indicating that this is the output from the command, and the standard fasta importer is selected for importing the results into the CLC Server.
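The substitution of {curly bracket} placeholders can be pictured with a short Python sketch. It is an illustration only and does not describe how the CLC Server implements substitution. The command line `cp {Sequences to copy} {Copied sequences}` is assumed from the description of the example, and the function name `substitute_placeholders` and the file paths are hypothetical.

```python
import re
import shlex

# Matches a parameter name written within {curly brackets}.
PLACEHOLDER = re.compile(r"\{([^{}]+)\}")


def substitute_placeholders(command_line, values):
    """Replace each {name} in command_line with the matching entry in values."""

    def replace(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value configured for parameter {name!r}")
        # Quote the value so that spaces in a path do not split the argument.
        return shlex.quote(values[name])

    return PLACEHOLDER.sub(replace, command_line)


# Hypothetical paths standing in for the exported input and the expected output.
command = substitute_placeholders(
    "cp {Sequences to copy} {Copied sequences}",
    {
        "Sequences to copy": "/tmp/export/input.fasta",
        "Copied sequences": "/tmp/results/copied.fasta",
    },
)
print(command)  # cp /tmp/export/input.fasta /tmp/results/copied.fasta
```

Each name within curly brackets needs a corresponding value, which mirrors the requirement that every such parameter has an entry in the General configuration area.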
Command parameter value types
Details of parameter value types are outlined below. A brief description is also provided in the web administrative interface when a value type is selected and the mouse cursor is hovered over it. Particularly important types for external application configurations are User-selected input data (CLC data location), which is the usual choice for parameters specifying input data, and Output file from CL, which is the usual choice for specifying results generated by the command line application.
- Text - The end user can provide text that will be substituted into the command at runtime. A default value can be configured.
- Integer - The end user can provide a whole number that will be substituted into the command at runtime. A default value can be configured. If no value is set, then 0 is the default used.
- Double - The end user can provide a number that will be substituted into the command at runtime. A default value can be configured. If no value is set, then 0 is the default used.
- Boolean text - A checkbox is shown in the Workbench wizard interface. If the user checks the box, the given text will be substituted into the command at runtime. If the box is unchecked, no value will be substituted.
- CSV enum - A drop down list is presented to a Workbench end user, from which they can choose a desired option. The value corresponding to the chosen option will be substituted into the command at runtime. To configure this parameter type, enter a comma delimited list of the values to be substituted at runtime into the first box, and a comma delimited list of corresponding labels to display to end users in the second box. Each entry in a given list should be unique and the two lists should be of equal length.
For an example of this, please see Example: Velvet integration on setting up Velvet as an external application.
- User-selected input data (CLC data location) - The end user should specify one or more input files from those stored on the CLC Server. In the General configuration area, the appropriate exporter should be selected, so that the format of the data will be as needed for the command line application. Each exporter can be configured further by clicking on the Edit parameters button, as shown in figure 12.3 (see footnote 12.2). Choices to make when configuring export parameters include:
- Default values to be applied when the external application is run. To edit fields that are locked by default, click on the symbol of the lock image to open the lock. Once unlocked, changes can be made.
- Which parameters end users will be able to configure when launching the external application. A parameter with an unlocked symbol beside it will be displayed to the end user and its value will be editable. Locked parameters are not shown and cannot be changed by end users.
- User-selected files (Import/Export directory) - The end user should specify one or more input files stored in an Import/Export area configured on the CLC Server. This option is used to specify files on the server machine that should be used. These are typically not CLC files. These files can be configured so they are pre-selected for the end user, but the end user can deselect preconfigured files when launching the external application.
- Output file from CL - This option should be used for the parameter that defines the output of the external command line application. Once selected, a drop down list appears, where you specify how that output should be handled:
- Where results should not be imported into the CLC Server, choose the option No standard import or map to high throughput sequencing importer.
- To import results into the CLC Server using a high throughput sequencing (NGS) importer, choose the option No standard import or map to high throughput sequencing importer and then configure the importer to use under the High-throughput sequencing import / Post processing tab, described in High throughput sequencing importers and post processing tools.
- To import the results using a standard importer, choose the importer to use from the drop down list presented. If the import type Automatic is selected, the importer used is determined by the filename suffix in combination with a check of the format of the elements in the file. If the file type is not recognized, it will be imported as an external file. A list of file formats, including the expected filename suffix for each format, can be found in the appendix of the CLC Genomics Workbench manual:
Read more about search here:
http://resources.qiagenbioinformatics.com/manuals/clcgenomicsworkbench/current/index.php?manual=Local_search.html.
The third, empty field can be used to enter the name of the file the external process is expected to produce. If left blank, the base name of the file produced by the command line tool will be used as the base name for the data element imported into the CLC Server. Specifying a default filename in the third field, including the relevant suffix (e.g. .fasta, .xlsx), is recommended.
When Output file from CL is selected for at least one value, the end user will need to provide a location on the CLC Server to store results. This will be the case even if the output of the external file will not be imported, as log files will still be written to the location selected.
- File - The end user should specify input files from their local machine. These are typically not CLC files. The CLC Server must be configured to allow direct data transfer from client systems for this option to be usable. If it is not, the parameter will not be configurable by the end user and they will see a message saying server upload is disabled when they try to launch the external application.
- Context substitute - The options are:
- CPU limit max cores - The core limit defined for the server that executes the command will be substituted.
- Name of user - The name of the user who launched the external application will be substituted.
- Boolean compound - This enables the creation of a checkbox. If checked, the end user is presented with another option as configured by the administrator. If not checked, the option associated with the checkbox is grayed out. Whether the box is checked or unchecked by default can be configured.
Figure 12.4: Clicking on the Edit parameters button for the "Sequences to copy" parameter brings up a window with the editable parameters for the selected exporter. Parameters with a locked symbol beside them are not shown to, and are thus not configurable by, the end user.
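The two rules given for the CSV enum type (unique entries within each list, and lists of equal length) can be checked before the lists are entered in the web interface. The following Python sketch is one way to do so. The function name `parse_csv_enum` and the example values and labels are hypothetical, and stripping whitespace around entries is an assumption rather than documented CLC Server behavior.

```python
def parse_csv_enum(values_csv, labels_csv):
    """Validate two comma delimited lists and return a label -> value mapping."""
    # Assumption: whitespace around each entry is not significant.
    values = [entry.strip() for entry in values_csv.split(",")]
    labels = [entry.strip() for entry in labels_csv.split(",")]

    if len(values) != len(labels):
        raise ValueError(
            f"{len(values)} values but {len(labels)} labels; "
            "the two lists should be of equal length"
        )
    for list_name, entries in (("values", values), ("labels", labels)):
        if len(set(entries)) != len(entries):
            raise ValueError(f"duplicate entry in the {list_name} list")

    # The label is what the end user sees; the value is what gets substituted.
    return dict(zip(labels, values))


# Placeholder lists, chosen only for illustration.
mapping = parse_csv_enum("-short,-long", "Short reads,Long reads")
print(mapping["Long reads"])  # -long
```

Keeping the two lists in the same order matters, because each label is paired with the value at the same position.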
A tip for exploring how many files an exporter will generate
A simple way to explore how many files an exporter will generate with a given configuration is to set up an external application using the echo command and a single parameter linked to the exporter of interest. Set the Standard out handling to Plain text, as described in Stream handling. The output from such an external application is a file, which is re-imported into the CLC Server as a text file. This file contains the full paths to the files the exporter created.
If an exporter is configured in a way that will lead to multiple output files, then the full path to each output file will be substituted in the command at runtime. The external application itself must be able to handle the outputs generated.
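Where the plain echo output is hard to read, a small script that prints one argument per line could be used in its place. The Python sketch below rests on several assumptions: that Python is available on the machine running the command, that each substituted path arrives as a separate command line argument (the text above does not state how multiple paths are delimited), and that the script is saved under a hypothetical name such as `list_args.py`.

```python
import sys


def describe_arguments(args):
    """Return one numbered line per argument received."""
    return [f"{index}: {arg}" for index, arg in enumerate(args, start=1)]


if __name__ == "__main__":
    # sys.argv[0] is the script itself, so only the remaining items are listed.
    for line in describe_arguments(sys.argv[1:]):
        print(line)
```

With Standard out handling set to Plain text, the resulting text file would then hold one numbered line per argument, which makes it easier to count the files an exporter produced.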
Footnotes
- 12.2: Configurable export parameters were introduced with CLC Genomics Server 10.0.