Options for clc_find_variations
usage: clc_find_variations <options>
Find positions where the reads indicate a consistent difference from the
reference sequences. Optionally, consensus sequences can be written to a
fasta file.
Options:
-h / --help: Display this message
-a <file> / --cas <file>: Specify the cas file (required).
-c <n> / --coverage <n>: Specify the minimum coverage required to report or
apply a difference (default: 2).
-o <file> / --output <file>: Output consensus sequences to a fasta file.
-r <resolution> / --conflictresolution <resolution>: Set how base conflicts
are resolved in the consensus sequences. Only valid if the "-o" option is
used:
vote: select by vote (A, C, G, T) (default)
ambiguity: use ambiguity nucleotides (R, Y, etc.)
unknown: unknown nucleotide (N)
-z <mode> / --zerocoverage <mode>: Set how regions with zero coverage are
written in the consensus sequences. Only valid if the "-o" option is used:
reference: use reference nucleotide (default)
unknown: unknown nucleotide (N)
none: do not use any character, i.e. remove zero coverage regions in
consensus sequences
-q / --quiet: Output no information about the reported sites.
-v / --verbose: Show more information about the reported sites.
-w / --outputzerocoverage: Output regions where coverage is zero.
-l <count> / --limit <count>: Show information when more than a given number
of reads differ from the consensus. Can only be used with the "-v" option.
-f <fraction> / --limitfraction <fraction>: Show information when more than a
given fraction of reads differ from the consensus. Can only be used with the
"-v" option. If used with the "-l" option, both requirements must be met.
-i / --ignoreindels: Ignore indels completely in the analysis.
Examples:
Find all sites where the reads indicate differences relative to the reference
sequence:
clc_find_variations -a assembly.cas
The differences are printed to stdout. To make a new reference sequence with
the differences incorporated, write:
clc_find_variations -a mapping.cas -o new_ref.fasta
By default, only sites with at least two-fold coverage are included in the
analysis. To require five-fold coverage instead, use the "-c" option:
clc_find_variations -c 5 -a mapping.cas
With this, differences are only printed for sites with at least five-fold
coverage. If the "-c" and "-o" options are used together, changes are only
made to the reference sequence when the coverage requirement is met. In
general, the changes made to the reference sequence when using the "-o"
option are exactly the changes written to stdout (except when using the "-q"
option, in which case no output is printed).
Using the "-l" and/or "-f" options with the "-v" option gives output for
sites where no change is indicated, but a significant number of differing
reads is still present. For example:
clc_find_variations -v -l 2 -f 0.2 -a mapping.cas
This outputs information for all sites where at least two reads differ from
the reference and at least 20% of the reads differ from the reference.
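The way the "-l" and "-f" requirements combine can be sketched in a few lines
of Python. This is only an illustration of the rule described above, not the
tool's own code: the function name and parameters are hypothetical, and the
comparison follows the "at least" wording of the example (the option
descriptions say "more than", so the exact boundary behaviour is an
assumption).

```python
def passes_limits(differing, coverage, limit=None, fraction=None):
    """Return True when a site satisfies every limit that was supplied.

    differing: number of reads that disagree at the site
    coverage:  total number of reads covering the site
    limit:     hypothetical stand-in for the "-l" value (None = not given)
    fraction:  hypothetical stand-in for the "-f" value (None = not given)
    """
    if coverage <= 0:
        # No reads cover the site, so no fraction can be computed.
        return False
    # Assumption: "at least", as in the worked example above.
    if limit is not None and differing < limit:
        return False
    if fraction is not None and differing / coverage < fraction:
        return False
    # When both limits are given, both must hold to reach this point.
    return True
```

With limit=2 and fraction=0.2, a site with 2 differing reads out of 10 passes
both requirements, while a site with 2 differing reads out of 20 meets the
count but not the fraction and is therefore not reported.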
Note that when using the "-o" option, the consensus sequences are not affected
by the "-q", "-v", "-l" and "-f" options. The "-c", "-r" and "-z" options,
however, do affect the consensus sequences.
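As a rough illustration of how the "-c", "-r" and "-z" settings could shape a
consensus sequence, here is a minimal Python sketch. It is not the tool's
implementation. The function names, the per-position input format, the
alphabetical tie-break in "vote" mode, and the treatment of any disagreement
among reads as a conflict are all assumptions made for the example. Indels
are ignored, and read bases are assumed to be uppercase A, C, G or T.

```python
from collections import Counter

# Standard IUPAC nucleotide codes, keyed by the set of observed bases.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def resolve_conflict(bases, resolution="vote"):
    """Choose one consensus character from the read bases at a position."""
    counts = Counter(bases)
    if len(counts) == 1:
        return bases[0]  # all reads agree, nothing to resolve
    if resolution == "vote":
        # Most frequent base wins; ties broken alphabetically (assumption).
        return min(counts, key=lambda b: (-counts[b], b))
    if resolution == "ambiguity":
        return IUPAC[frozenset(counts)]
    if resolution == "unknown":
        return "N"
    raise ValueError(f"unknown resolution: {resolution}")


def build_consensus(reference, columns, min_coverage=2,
                    resolution="vote", zerocoverage="reference"):
    """Build a consensus string.

    reference: reference sequence as a string
    columns:   one string of read bases per reference position
               ("" means zero coverage)
    """
    out = []
    for ref_base, bases in zip(reference, columns):
        if not bases:
            if zerocoverage == "reference":
                out.append(ref_base)
            elif zerocoverage == "unknown":
                out.append("N")
            elif zerocoverage != "none":  # "none" drops the position
                raise ValueError(f"unknown mode: {zerocoverage}")
        elif len(bases) < min_coverage:
            # Coverage requirement not met: leave the reference unchanged.
            out.append(ref_base)
        else:
            out.append(resolve_conflict(bases, resolution))
    return "".join(out)
```

For a reference "ACGT" with read columns ["AA", "", "GA", "TT"], the defaults
give "ACAT"; resolution="ambiguity" with zerocoverage="unknown" gives "ANRT";
and resolution="unknown" with zerocoverage="none" gives "ANT".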
