Consensus Q-score
The Hiatt Q-score expresses, on the Phred scale, an estimate of the probability of making a wrong call,
which means that the Hiatt Q-score is Q = -10 · log10(P(wrong call)).
Q-scores are capped at 60.
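As a minimal sketch of the capping, assuming the standard Phred convention Q = -10 · log10(p) for an error probability p, the conversion could look like the following. The function name `capped_phred` and the handling of p = 0 are illustrative assumptions, not part of the tool.

```python
import math

Q_CAP = 60  # maximum Q-score, as stated in the text


def capped_phred(p_wrong: float, cap: int = Q_CAP) -> float:
    """Convert an error probability to a Phred-scaled score, capped at `cap`.

    Hypothetical helper for illustration only.
    """
    if not 0.0 <= p_wrong <= 1.0:
        raise ValueError("p_wrong must be a probability in [0, 1]")
    if p_wrong == 0.0:
        # log10(0) is undefined; treat a zero error probability as the cap.
        return float(cap)
    return min(float(cap), -10.0 * math.log10(p_wrong))
```

With this sketch, an error probability of 0.001 maps to a score of about 30, while any probability at or below 10^-6 maps to the cap of 60.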
The probabilistic model outlined above and used in the Hiatt Q-score assumes that the only source of errors is independent sequencing errors. While PCR errors are typically rarer than sequencing errors, they are not independent and can affect a large fraction of the reads in a UMI group. For this reason, the Hiatt quality scores will often attain the maximum value of 60, even in situations where the reads constituting the UMI group do not unanimously agree on the base call.
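The saturation effect can be illustrated with a deliberately simplified calculation. This is not the Hiatt formula itself, only a sketch of what the independence assumption implies: if per-read errors are independent, the probability that several reads are all wrong is the product of their error probabilities, so on the Phred scale their Q-scores add. The function name `combined_q_if_independent` is hypothetical.

```python
def combined_q_if_independent(read_q_scores, cap=60):
    """Phred score for 'all of these reads are wrong', assuming independence.

    Multiplying error probabilities 10**(-Q/10) is the same as summing
    the Q-scores; the result is capped at `cap`. Illustration only.
    """
    return min(cap, sum(read_q_scores))
```

Under this simplification, three agreeing reads at Q30 already give 30 + 30 + 30 = 90, which is capped to 60. Correlated PCR errors break the independence assumption, so the true confidence can be much lower than the capped score suggests.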
The Fixed Ploidy and Low Frequency Variant Detection tools both rely on statistical models for the sequencing error rates, which are estimated for each value of the Q-score and each substitution type. For details, see https://resources.qiagenbioinformatics.com/manuals/clcgenomicsworkbench/current/index.php?manual=Variant_Detection_error_model_estimation. If most quality scores are 60, the variant callers cannot differentiate between, e.g., reads with unanimous agreement and reads without, or between small groups with unanimous agreement and large groups with unanimous agreement.
MAGERI Q-scores do not have a probabilistic interpretation like Hiatt Q-scores, but they are more evenly distributed over the set of possible Q-scores, allowing the variant callers to differentiate between qualities. The MAGERI Q-score is an adaptation of the method described in [Shugay et al., 2017].
First, the frequency, f, of the consensus base is computed. Only bases with a Q-score above 25 contribute to the frequency computation. A pseudo-count is applied to the denominator, so that larger groups automatically get higher Q-scores:
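The frequency step described above can be sketched as follows. This is a minimal illustration under stated assumptions: the denominator is taken to be the number of contributing bases plus the pseudo-count, and the pseudo-count value of 1.0 is a hypothetical placeholder, since the text does not give it. The function name `consensus_frequency` is also an assumption.

```python
MIN_BASE_Q = 25      # bases must have a Q-score above this to contribute (from the text)
PSEUDO_COUNT = 1.0   # hypothetical value; the actual pseudo-count is not given in the text


def consensus_frequency(bases, quals, consensus_base,
                        pseudo_count=PSEUDO_COUNT, min_q=MIN_BASE_Q):
    """Frequency of the consensus base among sufficiently high-quality bases.

    Assumed form: matches / (contributing bases + pseudo_count).
    """
    # Keep only bases whose Q-score is strictly above the threshold.
    kept = [b for b, q in zip(bases, quals) if q > min_q]
    matches = sum(1 for b in kept if b == consensus_base)
    return matches / (len(kept) + pseudo_count)
```

With these assumed values, a unanimous group of 3 reads gives f = 3/4 = 0.75, while a unanimous group of 9 reads gives f = 9/10 = 0.9, which shows how the pseudo-count favors larger groups.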
The MAGERI Q-score is then computed as