Result files and connecting analyses in pipelines
For each run of the clcserver command, a summary of the steps taken and the locations of the results, in ID form, are returned to stdout. By adding -O <filename> to the clcserver command, the locations of the results can also be written to a file.
The typical contents of such a file are shown in the example below. Here, the trim algorithm was run on a sequence list called reads. Three data elements were generated as output, and their names and locations were written to the file specified on the command line using the -O option.
// Name: reads trimmed
ClcUrl: clc://127.0.0.1:7777/-268177574-YCAAAAAAAAAAAAPc673b0db8c5e724f--5d66a991-12d75090d93--7fff
//
// Name: reads report
ClcUrl: clc://127.0.0.1:7777/-268177574-ADAAAAAAAAAAAAPc673b0db8c5e724f--5d66a991-12d75090d93--7fff
//
// Name: Trim Read log
ClcUrl: clc://127.0.0.1:7777/-268177574-CAAAAAAAAAAAAAPc673b0db8c5e724f--5d66a991-12d75090d93--7fff
//
When creating pipelines of analyses, you would typically parse such a file for the locations of outputs to be used as inputs for downstream analyses. The clcresultparser tool is provided to help with this. This tool searches the Name: fields of a file like the one shown above for an expression supplied on the command line. It returns the locations of data elements where a match was found.
For example, if the file shown above was called result.txt, the location of the trimmed reads output could be obtained by running this command:
clcresultparser -f result.txt -c trimmed
Here, the following text would be returned:
clc://127.0.0.1:7777/-268177574-YCAAAAAAAAAAAAPc673b0db8c5e724f--5d66a991-12d75090d93--7fff
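To make the lookup concrete, the following is a minimal sketch of the same idea in plain shell and awk. It is not how clcresultparser is implemented; find_location is a hypothetical helper name, and the sketch assumes only that each record in the file consists of a Name: field followed by a ClcUrl: field, as in the example above.

```shell
#!/bin/sh
# Illustrative stand-in for the Name: lookup, NOT the real clcresultparser.
# find_location is a hypothetical helper name.
#   $1: results file written with the -O option
#   $2: literal text to look for in each Name: field
find_location() {
    awk -v want="$2" '
    {
        # Walk through every whitespace-separated token so that the
        # lookup does not depend on where the line breaks fall.
        for (i = 1; i <= NF; i++) {
            if ($i == "Name:")        { name = ""; state = "name" }
            else if ($i == "ClcUrl:") { state = "url" }
            else if (state == "name") { name = (name == "" ? $i : name " " $i) }
            else if (state == "url")  {
                # The token after ClcUrl: is the location of the element.
                if (index(name, want) > 0) { print $i; found = 1 }
                state = ""
            }
        }
    }
    # Mirror the documented behaviour of exit code 1 when nothing matches.
    END { exit (found ? 0 : 1) }
    ' "$1"
}
```

Calling find_location result.txt trimmed on the file shown above would print the ClcUrl of the element named reads trimmed. For real pipelines, use clcresultparser itself, which also supports the exclusion and regular expression options.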
Below is a list of the available parameters for the clcresultparser program. The parameters are also listed if this tool is run without any arguments.
- -f <name of result file to parse>
- This option is required.
- -c <text to search for>
- Text to search for in the Name field of the result file. If nothing is found, the exit code is 1.
- -n <text that should not match>
- Text that should not be contained in the Name field of the result file.
- -r <regexp>
- A Java regular expression used for matching the name of the output (see http://java.sun.com/docs/books/tutorial/essential/regex/index.html).
- --ignorelogs <boolean>
- By default, all analyses produce log files. You can provide false as the argument to this option to stop log files from being returned. This is equivalent to excluding all names ending with log, or log with a number suffix. The latter are generated when there is more than one log file in the same folder.
- -p <prefix text>
- When more than one match is found, the data locations for all matches are output as a space-separated list. By supplying a prefix string, you can specify the text used to separate the items in the list. For example, if you need to send several files output by the clcresultparser command as arguments to -i options for the next analysis, simply provide "-i" as the argument for the -p flag.
- -e <integer>
- The number of CLC URLs that are expected to be returned. If this is not the number of result files that match the search string, the command will return with exit code 10. This option is designed for use in scripts where you wish to carry out validation steps as you proceed through the pipeline. (On the command line, you can check the exit code returned by the previous command by typing echo $?.)
- -C <integer>
- Specifies the column width of the help output.
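As an illustration of the validation that the -e option is intended for, the sketch below wraps clcresultparser in a small shell function that insists on exactly one match and reports the documented exit codes. The function name get_location is hypothetical, and the sketch assumes that clcresultparser is on the PATH. It uses only the -f, -c and -e options described above.

```shell
#!/bin/sh
# Sketch of a validation step for pipeline scripts.
# get_location is a hypothetical helper; clcresultparser is assumed to be on the PATH.
#   $1: results file written with the -O option
#   $2: text to look for in the Name field
get_location() {
    results_file=$1
    pattern=$2
    # -e 1 asks the parser to fail unless exactly one CLC URL matches.
    if location=$(clcresultparser -f "$results_file" -c "$pattern" -e 1); then
        printf '%s\n' "$location"
    else
        status=$?
        case $status in
            1)  echo "No output name in $results_file contains '$pattern'" >&2 ;;
            10) echo "Expected exactly 1 match for '$pattern' in $results_file" >&2 ;;
            *)  echo "clcresultparser failed with exit code $status" >&2 ;;
        esac
        return "$status"
    fi
}
```

A script could then run something like trimmed=$(get_location result.txt trimmed) || exit 1 before starting the next analysis, so that the pipeline stops as soon as an expected output is missing or ambiguous.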