Options for clc_find_variations
usage: clc_find_variations <options>

Find positions where the reads indicate a consistent difference from the
reference sequences. Optionally, consensus sequences can be written to a
fasta file.

Options:

  -h / --help
      Display this message.

  -a <file> / --cas <file>
      Specify the cas file (required).

  -c <n> / --coverage <n>
      Specify minimum coverage to report/apply difference (default: 2).

  -o <file> / --output <file>
      Output consensus sequences to a fasta file.

  -r <resolution> / --conflictresolution <resolution>
      Set the consensus sequences base conflict resolution. Only valid if
      the "-o" option is used:
        vote:      select by vote (A, C, G, T) (default)
        ambiguity: use ambiguity nucleotides (R, Y, etc.)
        unknown:   unknown nucleotide (N)

  -z <mode> / --zerocoverage <mode>
      Set how regions with zero coverage are written in the consensus
      sequences. Only valid if the "-o" option is used:
        reference: use reference nucleotide (default)
        unknown:   unknown nucleotide (N)
        none:      do not use any character, i.e. remove zero coverage
                   regions in consensus sequences

  -q / --quiet
      Output no information about the reported sites.

  -v / --verbose
      Show more information about the reported sites.

  -w / --outputzerocoverage
      Output regions where coverage is zero.

  -l <count> / --limit <count>
      Show information when more than a given number of reads are
      different from the consensus. Can only be used with the "-v" option.

  -f <fraction> / --limitfraction <fraction>
      Show information when more than a given fraction of reads are
      different from the consensus. Can only be used with the "-v" option.
      If used with the "-l" option, both requirements must be met.

  -i / --ignoreindels
      Ignore indels completely in the analysis.

Examples:

Find all sites where the reads indicate differences relative to the
reference sequence:

    clc_find_variations -a assembly.cas

The differences are printed to stdout.
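
The dependencies between options listed above ("-r" and "-z" require "-o";
"-l" and "-f" require "-v") can be checked before the tool is invoked. The
following is a minimal Python sketch of such a check. The helper name
build_command and its parameters are hypothetical and not part of
clc_find_variations; only the flags themselves come from the help text.

```python
def build_command(cas, output=None, coverage=None, resolution=None,
                  zerocoverage=None, verbose=False, limit=None, fraction=None):
    """Return an argument list for clc_find_variations.

    Hypothetical helper: it only assembles and validates arguments
    according to the help text; it does not run the program.
    """
    # "-r" and "-z" are documented as only valid together with "-o".
    if (resolution is not None or zerocoverage is not None) and output is None:
        raise ValueError('"-r" and "-z" are only valid if "-o" is used')
    # "-l" and "-f" are documented as usable only with "-v".
    if (limit is not None or fraction is not None) and not verbose:
        raise ValueError('"-l" and "-f" can only be used with "-v"')
    if resolution not in (None, "vote", "ambiguity", "unknown"):
        raise ValueError("unknown conflict resolution: %r" % (resolution,))
    if zerocoverage not in (None, "reference", "unknown", "none"):
        raise ValueError("unknown zero coverage mode: %r" % (zerocoverage,))

    argv = ["clc_find_variations", "-a", cas]
    if coverage is not None:
        argv += ["-c", str(coverage)]
    if output is not None:
        argv += ["-o", output]
    if resolution is not None:
        argv += ["-r", resolution]
    if zerocoverage is not None:
        argv += ["-z", zerocoverage]
    if verbose:
        argv.append("-v")
    if limit is not None:
        argv += ["-l", str(limit)]
    if fraction is not None:
        argv += ["-f", str(fraction)]
    return argv
```

The returned list can be passed to subprocess.run from the Python standard
library; joining it with spaces gives the command line as it would be typed
in a shell.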

To make a new reference sequence with the differences incorporated, write:

    clc_find_variations -a mapping.cas -o new_ref.fasta

By default, only sites with at least two-fold coverage are included in the
analysis. To set this to five-fold coverage, use the "-c" option:

    clc_find_variations -c 5 -a mapping.cas

With this, differences are only printed for sites with at least five-fold
coverage. If the "-c" and "-o" options are used together, changes are only
made to the reference sequence when the coverage requirement is met.

In general, the changes made to the reference sequence when using the "-o"
option are exactly those changes output to stdout (except when using the
"-q" option, where no output is printed).

Using the "-l" and/or "-f" options with the "-v" option gives output for
sites where no change is indicated, but a significant number of differences
is still present. For example:

    clc_find_variations -v -l 2 -f 0.2 -a mapping.cas

This outputs information for all sites where at least two reads differ from
the reference and at least 20% of the reads differ from the reference.

Note that when using the "-o" option, the consensus sequences are not
affected by the "-q", "-v", "-l" and "-f" options. The "-c", "-r" and "-z"
options, however, do affect the consensus sequences.
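
The combined "-l"/"-f" rule from the example above can be illustrated with a
small Python predicate. This is only a sketch of the documented behaviour,
not the tool's implementation: the function name is_reported and its
arguments are hypothetical, and the thresholds are read as "at least", as in
the example (the option descriptions say "more than").

```python
def is_reported(differing, coverage, limit=None, fraction=None):
    """Decide whether a site would pass the "-l" and "-f" thresholds.

    differing: number of reads that differ at the site
    coverage:  total number of reads covering the site
    limit:     value given to "-l", or None if the option is not used
    fraction:  value given to "-f", or None if the option is not used
    """
    if coverage <= 0:
        return False  # no reads, so nothing can differ
    # When both options are given, both requirements must be met.
    if limit is not None and differing < limit:
        return False
    if fraction is not None and differing / coverage < fraction:
        return False
    return True
```

With limit=2 and fraction=0.2, a site where 3 of 10 reads differ passes both
thresholds; a site where 2 of 20 reads differ fails the fraction requirement,
and a site where 1 of 4 reads differs fails the count requirement.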