Using quality scores when merging

Quality scores come into play in two different ways when merging overlapping pairs.

First, if there is a conflict between the reads in a pair (i.e. a mismatch or gap in the alignment), quality scores are used to determine which base the merged read should have at that position: the base with the higher quality score is used. A gap has no quality score of its own, so the average of the quality scores of the two bases surrounding the gap is used instead. If the two conflicting bases have the same quality score, or if neither read has quality scores, an [IUPAC ambiguity code](link to manual section on this) representing these bases is inserted.
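The conflict rule above can be illustrated with a minimal Python sketch. It is not the actual implementation: the names `resolve_conflict`, `gap_quality` and `IUPAC_PAIR` are hypothetical, a gap is written as `-`, and two behaviours the text does not specify are assumptions made for the sketch (a tie involving a gap yields `N`, and a position where only one read has a quality score is rejected).

```python
from typing import Optional

# Standard two-base IUPAC ambiguity codes, keyed by the unordered base pair.
IUPAC_PAIR = {
    frozenset("AG"): "R",
    frozenset("CT"): "Y",
    frozenset("CG"): "S",
    frozenset("AT"): "W",
    frozenset("GT"): "K",
    frozenset("AC"): "M",
}


def gap_quality(left_q: float, right_q: float) -> float:
    """Score assigned to a gap: the average of the two surrounding base scores."""
    return (left_q + right_q) / 2


def resolve_conflict(base1: str, q1: Optional[float],
                     base2: str, q2: Optional[float]) -> str:
    """Pick the merged symbol for a position where the two reads disagree.

    A score of None means the read has no quality scores.
    """
    if (q1 is None) != (q2 is None):
        # Assumption: the text does not cover this case, so the sketch rejects it.
        raise ValueError("only one read has a quality score")
    if q1 == q2:
        # Equal scores, or no scores in either read: use an ambiguity code.
        # Assumption: pairs outside the table (e.g. base vs. gap) fall back to N.
        return IUPAC_PAIR.get(frozenset((base1, base2)), "N")
    return base1 if q1 > q2 else base2
```

For example, `resolve_conflict("A", 30, "G", 20)` keeps the `A`, whereas equal scores for the same two bases give the ambiguity code `R`. A gap flanked by bases with scores 30 and 20 would compete with a score of `gap_quality(30, 20)`, i.e. 25.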

Second, the quality scores of the merged read reflect the quality scores of the input reads. When the two reads agree at a position, the two quality scores are summed to form the quality score of the base in the merged read; the sum is capped at 64, the maximum value on the quality score scale. When the two bases disagree at a position, the quality score of the base in the merged read is the higher input score minus the lower one. If the two input scores are approximately equal, the resulting score is very low, reflecting that the base is very unreliable. If, on the other hand, one score is very low and the other is high, the base with the high quality score is likely to be correct, and this is reflected in a relatively high quality score.
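The scoring rule can be written out as a short Python sketch. The function name `merged_quality` and its parameters are hypothetical and chosen only for illustration; the cap of 64 is the value stated above.

```python
MAX_QUALITY = 64  # maximum value on the quality score scale, as stated above


def merged_quality(q1: int, q2: int, bases_agree: bool) -> int:
    """Quality score of one position in the merged read.

    Agreement: the two scores are summed, capped at MAX_QUALITY.
    Disagreement: the lower score is subtracted from the higher one.
    """
    if bases_agree:
        return min(q1 + q2, MAX_QUALITY)
    return abs(q1 - q2)
```

With this sketch, two agreeing bases with scores 30 and 20 receive 50, and two agreeing bases with scores 40 and 40 are capped at 64. For conflicting bases, scores of 38 and 5 give 33 (the high-scoring base is probably correct), while scores of 30 and 29 give 1 (the call is very unreliable).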