|VCF||.vcf||X||X||See note below|
|GVF||.gvf||X||X||Special version of GFF for variant data|
|GTF||.gtf||X||X||Special version of GFF for gene annotation data|
|COSMIC variation database||.tsv||X|| ||Special format for COSMIC data|
|BED||.bed||X|| ||See Import tracks|
|Wiggle||.wig||X|| ||See Import tracks|
|UCSC variant database table dump||.txt||X|| ||See Import tracks|
|Complete Genomics master var files||masterVar||X|| ||Complete Genomics variant data format|
Special note on VCF export

For VCF export, counts from the variant track are put in CLCAD2 tags and coverage in PL tags. The CLCAD2 values follow the order of REF and ALT, with one value for the REF and one for each ALT. When a given REF or ALT allele is not observed in the GT field, the corresponding CLCAD2 value will be 0 (for example, for a homozygous variant, the CLCAD2 value for the REF will be 0). This does not mean that the original mapping had no reads with that sequence; it means that the variant track being exported does not contain that variant.
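The ordering of the CLCAD2 values can be illustrated with a small sketch that pairs each count with its allele. This is not part of the Workbench; the function name `allele_counts` and the example record values are made up for illustration, and the sketch assumes the usual VCF layout of a colon-separated FORMAT column and a matching sample column.

```python
def allele_counts(ref, alt, fmt, sample, tag="CLCAD2"):
    """Pair each value of a per-allele count tag with its allele.

    ref, alt, fmt and sample are the REF, ALT, FORMAT and sample
    columns of one VCF data line (hypothetical helper, for illustration).
    """
    # Alleles in the order the counts are written: REF first, then each ALT.
    alleles = [ref] + alt.split(",")

    # FORMAT keys and sample values are both colon-separated and line up by index.
    fields = dict(zip(fmt.split(":"), sample.split(":")))
    counts = [int(value) for value in fields[tag].split(",")]

    if len(counts) != len(alleles):
        raise ValueError("expected one count for REF and one for each ALT")
    return dict(zip(alleles, counts))


# Made-up homozygous call: the REF count is 0 because the exported
# variant track holds no reference-allele observation at this position.
print(allele_counts("A", "G", "GT:CLCAD2", "1:0,27"))  # {'A': 0, 'G': 27}
```

A REF count of 0 in such output reflects the content of the exported variant track, not necessarily the reads in the original mapping.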
When exporting VCF files, there are three options:
- Reference sequence track
- Since the VCF format specifies that reference and allele sequences cannot be empty, deletions and insertions have to be padded with bases from the reference sequence. The export needs access to the reference sequence track in order to find the neighboring bases.
- Enforce diploid export
- For homozygous variants, the VCF export creates one entry in the GT field unless you choose to Enforce diploid export. With that option enabled, a homozygous variant is reported with two entries, separated by "/". If you export a variant track that has been filtered, there can be positions where only one heterozygous variant remains. In this case, the CLC Genomics Workbench uses a "." to denote the unknown genotype, so the GT field will be "1/.".
  Note that the Enforce diploid export option produces a diploid-format VCF file, but it cannot repair inconsistencies in the variant track used as input. If the variant track has three variants at a given position, three genotypes will be output. If the variant track has two variants at the same position that are both reported as homozygous, they will be output as two heterozygous variants. When exporting data created by the variant callers of CLC Genomics Workbench, this is usually not a problem, but when applying this diploid scheme to data that has been imported into the CLC Genomics Workbench from other sources, the data can be inconsistent with a diploid model.
- Some chromosomes can be excepted from the enforced diploid export. For a human genome, that would be relevant for the mitochondrion and for male X and Y chromosomes. For this option, you can select which chromosomes should be excepted. They will be exported in the standard way without assuming there should be two genotypes, and homozygous calls will just have one value in the GT field.
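The padding described for the Reference sequence track option can be sketched as follows. This is an illustration of the general VCF convention, not the Workbench's own code: the function name `pad_indel` and its coordinate convention are assumptions. It assumes the preceding reference base is used as the padding base, except at position 1 of a sequence, where the VCF specification calls for the base after the event.

```python
def pad_indel(reference, pos, ref, alt):
    """Pad an insertion or deletion so that neither allele is empty.

    reference: the reference sequence as a string (hypothetical input).
    pos: 1-based position of the first deleted base, or, for an
         insertion, of the reference base right after the insertion point.
    ref, alt: the alleles; an empty string marks the missing side.
    Returns (pos, ref, alt) as they could be written to a VCF line.
    """
    if ref and alt:
        # Neither allele is empty, so no padding is needed.
        return pos, ref, alt

    if pos == 1:
        # No preceding base exists: append the base after the event.
        following = reference[len(ref)]
        return pos, ref + following, alt + following

    # Usual case: prepend the preceding base and shift the position back by one.
    preceding = reference[pos - 2]
    return pos - 1, preceding + ref, preceding + alt


reference = "ACGTAC"
print(pad_indel(reference, 3, "GT", ""))  # deletion of GT  -> (2, 'CGT', 'C')
print(pad_indel(reference, 3, "", "TT"))  # insertion of TT -> (2, 'C', 'CTT')
```

This is why the export cannot proceed without a reference sequence track: the padding base is not stored with the variant and has to be looked up.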