Identify Shared Variants
Use this tool to find variants that are common (frequent) in a group of samples. For example, if you have 50 unrelated individuals with the same disease, you can identify the variants that are present in at least 70% of them. It can also be used to do an overall comparison between samples (a frequency threshold of 0% will report all alleles).
Toolbox | Resequencing Analysis | Variants Comparison | Identify Shared Variants
This opens a dialog where you can select the variant tracks from the samples in the group.
Clicking Next will display the dialog shown in figure 27.33.
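Figure 27.33: Frequency threshold.
The Frequency threshold is the minimum percentage of input samples that must contain a variant for it to be reported. Setting it to 70% means that at least 70% of the samples selected as input have to contain a given variant for it to be reported in the output. With 50 input samples, for instance, a variant must be present in at least 35 of them.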
The output of the analysis is a track with all the variants that passed the frequency threshold, with additional reporting of:
- Sample count: the number of samples that have the variant.
- Total number of samples: the total number of samples (this will be identical for all variants).
- Sample frequency: the same frequency that is also used as a threshold (see figure 27.33).
- Origin tracks: a comma-separated list of the names of the tracks that contain the variant.
Note that this tool can be used to merge all variants from a number of variant tracks into one track by setting the frequency threshold to 0%.
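The threshold logic and the reported columns can be illustrated with a short Python sketch. This is not the tool's implementation: the function name `identify_shared_variants`, the representation of a track as a set of `(chromosome, position, reference, allele)` tuples, and the rule that two variants are "the same" only when these tuples match exactly are all assumptions made for illustration.

```python
from collections import defaultdict


def identify_shared_variants(tracks, frequency_threshold):
    """Hypothetical sketch of the shared-variant filter.

    tracks: dict mapping a track name to a set of variant keys.
            A variant key is assumed to be a tuple
            (chromosome, position, reference, allele).
    frequency_threshold: percentage between 0 and 100.
    """
    total = len(tracks)

    # Record which tracks contain each variant.
    origins = defaultdict(list)
    for name, variants in tracks.items():
        for variant in set(variants):
            origins[variant].append(name)

    shared = []
    for variant in sorted(origins):
        names = origins[variant]
        count = len(names)
        # "At least N%" is checked without division to avoid rounding issues.
        if count * 100 >= frequency_threshold * total:
            shared.append({
                "variant": variant,
                "sample_count": count,
                "total_samples": total,
                "sample_frequency": 100.0 * count / total,
                "origin_tracks": ",".join(names),
            })
    return shared


# Made-up example data: three samples, one variant shared by all of them.
tracks = {
    "sample_A": {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T")},
    "sample_B": {("chr1", 100, "A", "G")},
    "sample_C": {("chr1", 100, "A", "G"), ("chr2", 50, "G", "A")},
}

common = identify_shared_variants(tracks, 70)  # variants in >= 70% of samples
merged = identify_shared_variants(tracks, 0)   # threshold 0 keeps every variant

for row in common:
    print(row)
```

With a threshold of 70, only the variant found in all three example samples is kept; with a threshold of 0, every variant from every track is kept, which corresponds to the merging use case mentioned in the note above.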