Result files and connecting analyses in pipelines

For each run of clcserver, text information is returned providing a summary of the steps taken, and the locations, in ID form, of any files generated. The file containing this information will, by default, be created in the current directory and will be called result.txt. You can use the -O option for the clcserver command if you wish to specify an alternative file to be written to.

An example of the contents of a typical result file is shown below. In this case, the file was generated by running the trim algorithm with a sequence list called reads as input. The result file lists the three files that were produced.

//
Name: reads trimmed
ClcUrl: clc://127.0.0.1:7777/-268177574-YCAAAAAAAAAAAAPc673b0db8c5e724f--5d66a991-12d75090d93--7fff
//
//
Name: reads report
ClcUrl: clc://127.0.0.1:7777/-268177574-ADAAAAAAAAAAAAPc673b0db8c5e724f--5d66a991-12d75090d93--7fff
//
//
Name: Trim Sequences log
ClcUrl: clc://127.0.0.1:7777/-268177574-CAAAAAAAAAAAAAPc673b0db8c5e724f--5d66a991-12d75090d93--7fff
//

When creating pipelines that stitch together several analyses, you need to parse the result file to get the location of the data produced, which is needed as input for the next algorithm.
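To make the record structure of the result file concrete, here is a minimal sketch of extracting CLC URLs by hand with awk. It assumes every record has a Name: line followed by a ClcUrl: line, as in the example above. The function name urls_for_name is hypothetical, and this is not how any CLC tool is implemented.

```shell
# Print the ClcUrl of every record whose Name field contains the search text.
# Usage: urls_for_name <result file> <search text>
# Assumes each record lists its Name: line before its ClcUrl: line.
urls_for_name() {
    awk -v want="$2" '
        /^Name: /   { name = substr($0, 7) }      # remember the name of the current record
        /^ClcUrl: / { if (index(name, want) > 0)  # plain substring match, not a regular expression
                          print substr($0, 9) }   # print the URL that follows "ClcUrl: "
    ' "$1"
}
```

With the result file shown above, urls_for_name result.txt trimmed would print the URL of the reads trimmed entry.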

The result file is just a text file, but it can still be a challenge to parse it to get the necessary CLC URLs. Thus, we provide a tool called clcresultparser to help with this. It searches the result file for a text expression you provide, and returns the CLC URL for files where a match to that text has been found in the Name: field. The Name: field contains the name of the input data along with a description of the type of data held in that file location.

In the case above, you would probably search for the trimmed reads to use for further analysis, which could be done with a command like this:

    clcresultparser -f result.txt -c trimmed

Here, the following text would be returned:

    clc://127.0.0.1:7777/-268177574-YCAAAAAAAAAAAAPc673b0db8c5e724f--5d66a991-12d75090d93--7fff
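In a pipeline script, the returned URL is usually captured in a variable so that it can be passed to the next step. The following is a minimal sketch of that pattern. The function name trimmed_reads_url is hypothetical, and the sketch relies only on the -f and -c options and on the exit code of 1 that clcresultparser returns when nothing is found.

```shell
# Look up the trimmed reads in a result file and print their CLC URL.
# Usage: trimmed_reads_url [result file]   (defaults to result.txt)
# Returns non-zero, with a message on stderr, if clcresultparser reports a failure.
trimmed_reads_url() {
    result_file=${1:-result.txt}
    if ! url=$(clcresultparser -f "$result_file" -c trimmed); then
        echo "no output matching 'trimmed' found in $result_file" >&2
        return 1
    fi
    printf '%s\n' "$url"
}
```

A script could then write trimmed=$(trimmed_reads_url result.txt) || exit 1 and use "$trimmed" as the input location for the next analysis.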

The options for the clcresultparser program are:

-f <name of result file to parse>
This option is required.
-c <text to search for>
Text to search for in the Name field of the result file. If nothing is found, the exit code is 1.
-n <text that should not match>
Text that should not be contained in the Name field of the result file.
-r <regexp>
A Java regular expression used for matching the name of the output (see http://java.sun.com/docs/books/tutorial/essential/regex/index.html).
--ignorelogs <boolean>
By default, all analyses produce log files. You can provide false as the argument to this option to stop log files from being returned. This is equivalent to excluding all names ending with log, or log with a number suffix. The latter are generated when there is more than one log file in the same folder.
-p <prefix text>
When more than one match is found, the data locations for all matches are output as a space-separated list. By supplying a prefix string, you can specify the text that is placed in front of each location in the list. For example, if you need to send several files output by the clcresultparser command as arguments to -i options for the next analysis, simply provide "-i" as the argument for the -p flag.
-e <integer>
The number of CLC URLs that are expected to be returned. If this is not the number of result files that match the search string, the command will return with exit code 10. This option is designed for use in scripts where you wish to carry out validation steps as you proceed through the pipeline. (On the command line, you can check the exit code returned by the previous command by typing echo $?.)
-C <integer>
Specifies the column width of the help output.
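The -p and -e options can be combined to validate a pipeline step while building the input arguments for the next one. The following is a sketch only. The function name input_args is hypothetical, and the exact layout of the prefixed output is not shown in this section, so check it on your own system before relying on it.

```shell
# Print every match for a name, each prefixed with -i, and verify the match count.
# Usage: input_args <result file> <search text> <expected number of matches>
# Passes through the exit code of clcresultparser (10 means the count was wrong).
input_args() {
    status=0
    clcresultparser -f "$1" -c "$2" -p "-i" -e "$3" || status=$?
    if [ "$status" -eq 10 ]; then
        echo "expected $3 match(es) for '$2' in $1" >&2
    fi
    return "$status"
}
```

A script could call args=$(input_args result.txt reads 2) || exit 1 and stop the pipeline as soon as a step produces an unexpected number of outputs.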