Result files and connecting analyses in pipelines

Each run of the clcserver command writes a summary of the steps taken, along with the locations of the results in ID form, to stdout. Adding -O <filename> to the clcserver command also writes the locations of the results to that file.

The typical contents of such a file are shown in the example below. Here, the trim algorithm was run on a sequence list called reads. Three data elements were generated as output, and their names and locations were written to the file specified on the command line using the -O option.

//
Name: reads trimmed
ClcUrl: clc://127.0.0.1:7777/-268177574-YCAAAAAAAAAAAAPc673b0db8c5e724f--5d66a991-12d75090d93--7fff
//
//
Name: reads report
ClcUrl: clc://127.0.0.1:7777/-268177574-ADAAAAAAAAAAAAPc673b0db8c5e724f--5d66a991-12d75090d93--7fff
//
//
Name: Trim Read log
ClcUrl: clc://127.0.0.1:7777/-268177574-CAAAAAAAAAAAAAPc673b0db8c5e724f--5d66a991-12d75090d93--7fff
//

When creating pipelines of analyses, you would typically parse such a file for the locations of outputs to be used as inputs for downstream analyses. The clcresultparser tool is provided to help with this. It searches the Name: fields of a file like the one shown above for an expression supplied on the command line, and returns the locations of the data elements where a match was found.

For example, if the file shown above was called results.txt, the location of the trimmed reads output could be obtained by running this command:

    clcresultparser -f results.txt -c trimmed

Here, the following text would be returned:

    clc://127.0.0.1:7777/-268177574-YCAAAAAAAAAAAAPc673b0db8c5e724f--5d66a991-12d75090d93--7fff
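To make the lookup concrete, here is a minimal Python sketch of the kind of matching described above. It is not the implementation of clcresultparser. The function names parse_result_file and find_locations are made up for illustration, and the sketch assumes only the layout seen in the example file: a Name: line followed by a ClcUrl: line, with // lines as separators. Matching is a plain substring search on the name.

```python
# Illustrative sketch only: this is NOT how clcresultparser is implemented.
# Assumption: each record is a "Name:" line followed by a "ClcUrl:" line,
# with "//" lines acting as separators, as in the example file above.

SAMPLE = """\
//
Name: reads trimmed
ClcUrl: clc://127.0.0.1:7777/-268177574-YCAAAAAAAAAAAAPc673b0db8c5e724f--5d66a991-12d75090d93--7fff
//
//
Name: reads report
ClcUrl: clc://127.0.0.1:7777/-268177574-ADAAAAAAAAAAAAPc673b0db8c5e724f--5d66a991-12d75090d93--7fff
//
//
Name: Trim Read log
ClcUrl: clc://127.0.0.1:7777/-268177574-CAAAAAAAAAAAAAPc673b0db8c5e724f--5d66a991-12d75090d93--7fff
//
"""


def parse_result_file(text):
    """Return a list of (name, clc_url) pairs found in the text."""
    records = []
    name = None
    for raw in text.splitlines():
        # Tolerate separators on their own line or in front of a field.
        line = raw.strip().lstrip("/").strip()
        if line.startswith("Name:"):
            name = line[len("Name:"):].strip()
        elif line.startswith("ClcUrl:") and name is not None:
            records.append((name, line[len("ClcUrl:"):].strip()))
            name = None
    return records


def find_locations(records, contains):
    """Return the URLs of all records whose name contains the given text."""
    return [url for name, url in records if contains in name]


if __name__ == "__main__":
    for url in find_locations(parse_result_file(SAMPLE), "trimmed"):
        print(url)
```

Only the name "reads trimmed" contains the text "trimmed", so the sketch yields the single location shown above.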

Below is a list of the available parameters for the clcresultparser program. The parameters are also listed if this tool is run without any arguments.

-f <name of result file to parse>: This option is required.
-c <text to search for>: Text to search for in the Name field of the result file. If nothing is found, the exit code is 1.
-n <text that should not match>: Text that should not be contained in the Name field of the result file.
-r <regexp>: A Java regular expression used for matching the name of the output (see http://java.sun.com/docs/books/tutorial/essential/regex/index.html).
--ignorelogs <boolean>: By default, all analyses produce log files. You can provide false as the argument to this option to stop log files from being returned. This is equivalent to excluding all names ending with log, or log with a number suffix. The latter are generated when there is more than one log file in the same folder.
-p <prefix text>: When more than one match is found, the data locations for all matches are output as a space-separated list. By supplying a prefix string, you can specify the text used to separate the items in the list. For example, if several locations output by the clcresultparser command need to be passed as arguments to -i options for the next analysis, provide "-i" as the argument to the -p flag.
-e <integer>: The number of CLC URLs expected to be returned. If this differs from the number of entries in the result file that match the search string, the command returns with exit code 10. This option is designed for use in scripts, where you will want to carry out validation steps as you proceed through the pipeline. (On the command line, you can check the exit code returned by the previous command by typing echo $?.)
-C <integer>: Specifies the column width of the help output.
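
In a script, these options are typically combined so that a missing or ambiguous output stops the pipeline early. The Python sketch below shows one way this could look, under stated assumptions: clcresultparser is on the PATH, it prints the matching locations to stdout, and the helper names (build_parser_command, tokens_for_next_step, locate_outputs) are hypothetical. Only the -f, -c, -p and -e options and the exit codes 1 and 10 are taken from the list above. The exact layout of the output when -p is used is not assumed; the output is simply split on whitespace.

```python
import shlex
import subprocess

NO_MATCH = 1            # exit code documented for -c when nothing is found
UNEXPECTED_COUNT = 10   # exit code documented for -e on a count mismatch


def build_parser_command(result_file, contains, expected=None, prefix=None):
    """Build the clcresultparser argument list from the documented options."""
    cmd = ["clcresultparser", "-f", result_file, "-c", contains]
    if prefix is not None:
        cmd += ["-p", prefix]
    if expected is not None:
        cmd += ["-e", str(expected)]
    return cmd


def tokens_for_next_step(returncode, stdout):
    """Turn an exit code and stdout into tokens, or raise on failure."""
    if returncode == NO_MATCH:
        raise LookupError("exit code 1: no name matched the search text")
    if returncode == UNEXPECTED_COUNT:
        raise ValueError("exit code 10: unexpected number of matches")
    if returncode != 0:
        raise RuntimeError(f"clcresultparser exited with code {returncode}")
    # Assumption: stdout is a whitespace-separated list of tokens.
    return shlex.split(stdout)


def locate_outputs(result_file, contains, expected=None, prefix=None):
    """Run clcresultparser (assumed to be on the PATH) and return its tokens."""
    cmd = build_parser_command(result_file, contains, expected, prefix)
    done = subprocess.run(cmd, capture_output=True, text=True)
    return tokens_for_next_step(done.returncode, done.stdout)
```

A caller could then append the returned tokens to the argument list of the next clcserver command, for example with locate_outputs("results.txt", "trimmed", expected=1, prefix="-i"). Raising on exit codes 1 and 10 keeps a downstream analysis from starting with missing or unexpected inputs.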
