Using quality scores when merging

Quality scores come into play in two different ways when merging overlapping pairs.

First, when the two reads in a pair conflict at a position (i.e. a mismatch or a gap in the alignment), the quality scores determine which base the merged read gets at that position: the base with the higher quality score is used. For a gap, the average of the quality scores of the two surrounding bases is used as the gap's score.
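The conflict rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tool's actual implementation: the function names `gap_quality` and `pick_base`, the use of `None` to mark a gap in the score list, the handling of a gap at the end of a read, and the tie-breaking choice are all hypothetical.

```python
GAP = "-"


def gap_quality(quals, i):
    """Score for a gap at column i of one aligned read.

    quals holds one score per alignment column, with None at gap columns.
    Returns the average of the nearest base scores to the left and right.
    """
    left = next((q for q in reversed(quals[:i]) if q is not None), None)
    right = next((q for q in quals[i + 1:] if q is not None), None)
    flanks = [q for q in (left, right) if q is not None]
    if not flanks:
        raise ValueError("gap has no neighbouring base to take a score from")
    # Assumption: a gap at the very end of a read uses its single neighbour.
    return sum(flanks) / len(flanks)


def pick_base(base1, q1, base2, q2):
    """Return the symbol (base or GAP) that has the higher score."""
    # Assumption: on a tie the first read wins; the text does not cover ties.
    return base1 if q1 >= q2 else base2


# Read 1 is aligned as "AC-T", read 2 as "ACGT"; column 2 is the conflict.
read1_quals = [30, 20, None, 40]
q_gap = gap_quality(read1_quals, 2)          # average of 20 and 40
chosen = pick_base(GAP, q_gap, "G", 10)      # gap scores higher than G
```

In this example the gap scores higher than the inserted G, so under these assumptions the merged read would carry no base at that position.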

Second, the quality scores of the merged read are derived from the quality scores of the input reads. When the two reads agree at a position, the two quality scores are summed to form the quality score of the base in the merged read (the sum is capped at 64, the maximum value on the quality score scale). When the two bases disagree at a position, the quality score of the base in the merged read is the higher of the two input scores minus the lower. If the two input scores are approximately equal, the resulting score is very low, reflecting that the base is very unreliable. If, on the other hand, one score is very low and the other is high, the base with the high quality score is likely to be correct, and this is reflected in a relatively high quality score.
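The scoring rule can be illustrated with a short Python sketch. The function name `merged_quality` and the constant `MAX_QUALITY` are hypothetical names chosen for this example; only the sum, the cap of 64, and the difference come from the description above.

```python
MAX_QUALITY = 64  # cap on the quality score scale, as stated in the text


def merged_quality(q1, q2, bases_agree):
    """Quality score of one merged position, given the two input scores."""
    if bases_agree:
        # Agreement: the scores add up, but never beyond the cap.
        return min(q1 + q2, MAX_QUALITY)
    # Disagreement: higher score minus lower score.
    return abs(q1 - q2)


agree_low = merged_quality(30, 30, True)       # 30 + 30
agree_capped = merged_quality(40, 35, True)    # 75, capped
conflict_close = merged_quality(30, 28, False) # nearly equal scores
conflict_clear = merged_quality(38, 5, False)  # one score clearly higher
```

The two disagreement cases show the behaviour described above: scores of 30 and 28 leave a merged score of 2, marking the base as unreliable, while scores of 38 and 5 leave a merged score of 33.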