How to use the programs
The CLC Assembly Cell consists of standard command line tools. Each command starts with the tool name, followed by any required flags and parameters. All input to a command, including the designation of input and output files, is given via parameter arguments.
General things to be aware of when setting up a CLC Assembly Cell command include:
- For programs that can write either fasta or fastq output, the output format is determined by the output filename you specify in the command. For example, for the clc_remove_duplicates program, if you provide an output filename ending in .fq or .fastq, then the output format will be fastq. Otherwise, it will be fasta. Any program with this sort of behaviour should describe the convention it uses in its usage information, which is printed when the command is run without any arguments.
- When providing paired data in two files, where one file contains one member of a pair and the other file contains the other member of a pair, you must include the -i flag in front of each input file. More information is provided about this later in this chapter, when paired data input is discussed, as well as in the chapters on read mappings and de novo assembly.
- When providing information about sequences, such as fragment lengths (also referred to as distances) for paired data, the parameter values you enter apply to all read files that follow them in the command, until new parameter values are provided. This is discussed further in the chapters on read mappings and de novo assembly.
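The first and third points above can be illustrated with a small Python sketch. It is only a model of the conventions described here, not the tools' actual implementation: the function names and the ("distance", ...) / ("file", ...) token representation are hypothetical and do not correspond to real CLC Assembly Cell flags. The sketch also assumes that the filename suffix check is case sensitive, which this section does not state.

```python
# Illustrative model only; names and token kinds are hypothetical.

FASTQ_SUFFIXES = (".fq", ".fastq")


def infer_output_format(filename):
    """Return 'fastq' if the filename ends in .fq or .fastq, else 'fasta'.

    Assumption: the comparison is case sensitive.
    """
    return "fastq" if filename.endswith(FASTQ_SUFFIXES) else "fasta"


def assign_distances(tokens):
    """Pair each read file with the distance setting in effect at its position.

    `tokens` is a list of (kind, value) tuples in command order, where kind
    is either "distance" (a new parameter value) or "file" (a read file).
    A setting applies to every file after it until a new setting appears.
    Files listed before any setting are paired with None.
    """
    current = None
    assigned = []
    for kind, value in tokens:
        if kind == "distance":
            current = value  # replaces the previous setting from here on
        elif kind == "file":
            assigned.append((value, current))
        else:
            raise ValueError(f"unknown token kind: {kind!r}")
    return assigned


if __name__ == "__main__":
    for name in ("reads.fq", "reads.fastq", "reads.fa"):
        print(name, "->", infer_output_format(name))

    command = [
        ("distance", (180, 250)),
        ("file", "libA_1.fq"),
        ("file", "libA_2.fq"),
        ("distance", (400, 600)),
        ("file", "libB_1.fq"),
    ]
    for filename, distance in assign_distances(command):
        print(filename, "uses distance range", distance)
```

In this model, libA_1.fq and libA_2.fq both receive the first distance range, and libB_1.fq receives the second, mirroring how a parameter value stays in effect until it is replaced later in the command.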