Create UMI Reads

The tool Create UMI Reads generates a single consensus read, called a UMI read, from reads which have the same UMI, and places the UMI read in a read mapping at the location of the original reads. Therefore, the output of the tool is a read mapping of generated UMI reads.

The tool can be found in the Toolbox here:

        Tools | QIAseq Panel Expert Tools | QIAseq DNA Panel Expert Tools | Create UMI Reads

In the first dialog (figure 3.16), select a read mapping of the original reads with UMI annotations, i.e., a read mapping that has already been processed with the Calculate Unique Molecular Index Groups tool.

Image createsupereads
Figure 3.16: Select a read mapping of the original reads with UMI annotations.

The second dialog of the wizard (figure 3.17) shows the settings for the tool.

Image createsupereads2
Figure 3.17: Settings for the Create UMI Reads tool.

Click Next to choose whether to Open or Save the resulting read mapping of UMI reads, i.e., a read mapping of the merged UMI groups. It is also possible to generate a report that states how many reads were ignored and why they were not included in a UMI read. This information helps you verify the variants that were found and investigate why expected variants were not found.

Consensus nucleotide calculation follows the method described in Hiatt2013. The consensus base is the base with the highest posterior probability given the observed read bases.

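By Bayes' theorem, the posterior probability of calling a base is

$$P(C \mid O_1, \ldots, O_n) = \frac{P(O_1, \ldots, O_n \mid C)\, P(C)}{\sum_{B} P(O_1, \ldots, O_n \mid B)\, P(B)}$$

where $O_i$ is the $i$-th observed base at a given position, $C$ is the base in question, and the sum in the denominator runs over all possible bases, i.e., $B \in \{A, T, C, G\}$.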

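Assuming that the prior for observing any base is equal, i.e., P(A)=P(T)=P(C)=P(G), the priors cancel and the posterior probability is:

$$P(C \mid O_1, \ldots, O_n) = \frac{P(O_1, \ldots, O_n \mid C)}{\sum_{B} P(O_1, \ldots, O_n \mid B)}$$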

And by assuming each read base observation is independent,

$$P(C \mid O_1, \ldots, O_n) = \frac{\prod_{i=1}^{n} P(O_i \mid C)}{\sum_{B} \prod_{i=1}^{n} P(O_i \mid B)}$$

Because the denominator is the same for every candidate base, we only need to maximize the numerator to obtain the consensus base.

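The Q-score follows from the probability of making a wrong call, which is the complement of the posterior probability of the consensus base $C$, i.e.

$$P_{\text{error}} = 1 - P(C \mid O_1, \ldots, O_n)$$

which means that the Q-score is

$$Q = -10 \log_{10} P_{\text{error}}$$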

Q-scores are capped at 60.
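
The calculation described above can be illustrated with a short Python sketch. It is not the tool's implementation, and all function names are hypothetical. The text does not say how the per-read likelihood P(Oi|C) is obtained, so the sketch assumes a common error model: a read base with Phred quality q is correct with probability 1 - e and is each of the three other bases with probability e/3, where e = 10^(-q/10).

```python
import math

BASES = "ACGT"
MAX_Q = 60  # cap stated in the text


def phred_to_error(q):
    """Convert a Phred quality score to an error probability."""
    return 10 ** (-q / 10)


def log_likelihood(candidate, observations):
    """Sum of log P(O_i | candidate) over independent observations.

    ASSUMED error model (not given in the text): a read base equals the
    true base with probability 1 - e and is each other base with e / 3.
    """
    total = 0.0
    for base, q in observations:
        # Clamp so that Q=0 means "uninformative" rather than log(0).
        e = min(phred_to_error(q), 0.75)
        total += math.log(1 - e) if base == candidate else math.log(e / 3)
    return total


def consensus(observations):
    """Return (consensus base, Q-score) for one position of a UMI group.

    observations: list of (base, phred_quality) pairs, one per read.
    """
    logs = {b: log_likelihood(b, observations) for b in BASES}
    # Equal priors: maximizing the numerator is enough to pick the base.
    best = max(BASES, key=lambda b: logs[b])
    # Normalize in log space to avoid underflow in deep UMI groups.
    weights = {b: math.exp(logs[b] - logs[best]) for b in BASES}
    # P(wrong call) = 1 - posterior = summed posterior of the other bases.
    p_error = sum(w for b, w in weights.items() if b != best) / sum(weights.values())
    q = MAX_Q if p_error <= 0 else min(MAX_Q, -10 * math.log10(p_error))
    return best, q


if __name__ == "__main__":
    # Two reads support A at Q20, one read supports G at Q10.
    base, q = consensus([("A", 20), ("A", 20), ("G", 10)])
    print(base, round(q, 1))
```

Summing the posteriors of the non-consensus bases, rather than subtracting the consensus posterior from 1, avoids loss of precision when the error probability is very small.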