The locus table displays all Genotype track loci, one per row. By default, the displayed columns are locus annotations. Loci where filters apply are displayed when the option 'Show filtered' is enabled. Blue columns are sample-specific annotations, corresponding to FORMAT annotations in VCF (figure 10.2).
Below are descriptions of general locus table annotations. Common locus table annotations created by a specific tool can be found here: Microhaplotype Caller - Annotations
- The name of the reference sequence where the variant locus is situated.
- The reference sequence positions of the variant locus. The region may be a 'single position', a 'region', or a 'between position region'.
- The reference allele nucleotide sequence of the locus. At most 20 nucleotides are shown. Longer sequences can be obtained in their entirety by copying the table cell.
- Alt allele: Alternatives to the reference allele.
- List of filters that are directly or indirectly applicable to the locus. The value 'PASS' specifies that the variant passed all filters.
- Alt count: Number of non-reference allele variants at this locus.
- Maximum length of all alleles at the locus.
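The two count-style annotations above can be illustrated with a minimal Python sketch. The function name `locus_summary` and its arguments are hypothetical and not part of the software. The sketch assumes that alleles are plain nucleotide strings and that "all alleles" includes the reference allele; how the software treats other allele representations is not covered here.

```python
def locus_summary(ref, alts):
    """Return (alt_count, max_allele_length) for one locus.

    ref  -- reference allele sequence (assumed to be a plain string)
    alts -- list of alternative allele sequences (assumed input shape)
    """
    # Number of non-reference allele variants at the locus.
    alt_count = len(alts)
    # Longest allele; the reference allele is included by assumption.
    max_len = max(len(allele) for allele in [ref, *alts])
    return alt_count, max_len


print(locus_summary("A", ["G", "ACGT"]))  # (2, 4)
```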
Figure 10.2: The locus table. To go to the locus table view of a genotype track as shown here, select the table icon with an L in the lower left corner. Note that in this example, "Show filtered" has been checked in the side panel and a filtered locus is shown at the top of the table. Columns displayed can be adjusted by selecting and deselecting columns in the side panel to the right.
The following annotations are available in all sample Genotype tracks.
- The VCF genotype that lists the alleles present in the sample genome at this locus. Alleles to which filters apply are excluded from the genotype.
The genotype is encoded as allele values separated by either '/' or '|'. The allele values are 0 for the reference allele, 1 for the first allele listed in the 'Alt allele' annotation, 2 for the second allele listed in the 'Alt allele' annotation, and so on (figure 10.2). For diploid calls, examples include 0/1, 1|0, and 1/2.
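As an illustration of this encoding, the following Python sketch maps the allele values of a genotype string back to allele sequences. The function name `decode_genotype` and its arguments are hypothetical and not part of the software. It assumes the reference allele and the 'Alt allele' list are already available as Python values, and that '.' denotes a missing allele.

```python
import re


def decode_genotype(gt, ref, alts):
    """Map each allele value in a GT string to its allele sequence."""
    # Index 0 is the reference allele, 1.. are the alt alleles in listed order.
    alleles = [ref, *alts]
    # Split on either separator; a leading separator gives an empty token, which is dropped.
    tokens = [t for t in re.split(r"[/|]", gt) if t != ""]
    # A missing allele ('.') has no sequence and is returned as None.
    return [None if t == "." else alleles[int(t)] for t in tokens]


print(decode_genotype("0/1", "A", ["G", "T"]))  # ['A', 'G']
print(decode_genotype("1|0", "A", ["G", "T"]))  # ['G', 'A']
print(decode_genotype("1/2", "A", ["G", "T"]))  # ['G', 'T']
```

The separators are discarded here because only the allele identities are of interest; phasing is a separate question.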
Genotype phasing compatible with all ploidy levels
One way to interpret VCF phasing encoded in the genotype (GT) field is to consider the separators (/ or |) as phasing flags for the allele that follows, similar to the way phasing is encoded in BCF. For example, given GT=0/1|2/3 we would know that 1 and 3 are unphased while 2 is phased. This interpretation, however, leaves the phasing status of the first GT allele poorly defined.
Assumptions are frequently made about the phasing status of the first allele in diploid scenarios. For example, GT=0|1 is commonly interpreted to mean that both alleles are phased, and GT=0/1 that both alleles are unphased.
Considering the above, the genotype is written so that:
- At loci where all alleles except the first have the same phasing status: the appropriate phasing flag is prepended. For example, GT=|0/1/2 or GT=/0|1|2 or GT=/0|1
- At loci where alleles have mixed phasing status, and the first allele is phased: the appropriate phasing flag is prepended. For example GT=|0|1/2 or GT=|0/2
- In the case of haploid loci: if the allele is phased, either the appropriate phasing flag is prepended (e.g. GT=|1), or a missing allele is appended with the appropriate phasing flag (e.g. GT=1|.).
Thus, when encountering a genotype where the first allele has no prepended phasing flag, the phasing status of the first allele is taken to be:
- phased, if phasing flags are present and indicate that all other alleles are phased.
- unphased, in all other cases.
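The interpretation rules above can be sketched in Python as follows. This is one reading of the rules under stated assumptions, not the software's implementation: the function name `phasing_status` is hypothetical, allele values are assumed to be non-negative integers or '.', and a missing allele is reported with the status given by its flag.

```python
import re

# Optional prepended flag, one allele value, then (flag, allele value) pairs.
_GT_PATTERN = re.compile(r"[/|]?(\d+|\.)([/|](\d+|\.))*")


def phasing_status(gt):
    """Return a list of (allele_value, is_phased) pairs for a GT string."""
    if not _GT_PATTERN.fullmatch(gt):
        raise ValueError(f"not a recognised genotype string: {gt!r}")
    # Each match is (flag, allele); the flag is '' when nothing precedes the allele.
    parts = re.findall(r"([/|]?)(\d+|\.)", gt)
    flags = [flag for flag, _ in parts]
    phased = [flag == "|" for flag in flags]
    if flags[0] == "":
        # No prepended flag: the first allele is phased only if other
        # flags are present and all of them indicate phased alleles.
        rest = flags[1:]
        phased[0] = bool(rest) and all(flag == "|" for flag in rest)
    return [(allele, is_phased) for (_, allele), is_phased in zip(parts, phased)]


print(phasing_status("0|1"))     # [('0', True), ('1', True)]
print(phasing_status("0/1"))     # [('0', False), ('1', False)]
print(phasing_status("|0/1/2"))  # [('0', True), ('1', False), ('2', False)]
print(phasing_status("0/1|2/3")) # [('0', False), ('1', False), ('2', True), ('3', False)]
```

With this reading, the haploid forms GT=|1 and GT=1|. both report the allele 1 as phased, while a bare GT=1 reports it as unphased.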
- Phase set: Identifier for a set of phased genotypes that together describe a set of overlapping haplotypes.
- The zygosity of the sample genome locus (figure 10.2). This will be 'Homozygous' when only one allele variant is called at the locus, and 'Heterozygous' when more than one variant is called.
- Allele number: Total number of alleles in the called genotype.
- Complex region, if locus is part of a complex of overlapping loci.
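The zygosity and allele number descriptions in the list above can be illustrated with a short Python sketch derived from the genotype string. The function name `summarize_genotype` is hypothetical. Ignoring missing alleles ('.') when counting, and returning `None` when no allele is called, are assumptions made for this example rather than documented behavior.

```python
import re


def summarize_genotype(gt):
    """Return (zygosity, allele_number) for a GT string."""
    # Keep called allele values only; drop separators, an empty leading
    # token from a prepended flag, and missing alleles (assumption).
    called = [t for t in re.split(r"[/|]", gt) if t not in ("", ".")]
    distinct = set(called)
    if not distinct:
        zygosity = None  # nothing called; handling here is an assumption
    elif len(distinct) == 1:
        zygosity = "Homozygous"    # only one allele variant is called
    else:
        zygosity = "Heterozygous"  # more than one variant is called
    return zygosity, len(called)


print(summarize_genotype("0/1"))     # ('Heterozygous', 2)
print(summarize_genotype("1|1"))     # ('Homozygous', 2)
print(summarize_genotype("|0/1/2"))  # ('Heterozygous', 3)
```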