Remove and Annotate with Unique Molecular Index

During library preparation, single or duplex UMI sequences can be added to the reads. These are used to correct for sequencing errors and to improve performance. A UMI is often accompanied by a common sequence prefix, also added before amplification, which can be very helpful for locating the exact UMI sequence. While the UMI is essential for identifying reads that originate from the same fragment, leaving it on the sequenced reads would reduce the efficiency and accuracy of subsequent read mapping. The Remove and Annotate with Unique Molecular Index tool therefore removes the UMI and the common sequence prefix from the reads, and annotates each read with its UMI so that the fragment identity is retained.
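The trimming and annotation described above can be illustrated with a minimal Python sketch. It is not the tool's implementation. It assumes that the UMI has a fixed length, sits at the very start of the read, and is immediately followed by an exactly matching common sequence. The names `remove_and_annotate`, `AnnotatedRead`, `umi_length` and `common_sequence` are hypothetical and are not taken from the tool.

```python
from typing import NamedTuple, Optional


class AnnotatedRead(NamedTuple):
    sequence: str  # read with the UMI and common sequence removed
    umi: str       # UMI kept as an annotation of the read


def remove_and_annotate(
    read: str, umi_length: int, common_sequence: str
) -> Optional[AnnotatedRead]:
    """Trim a leading UMI and common sequence, keeping the UMI as annotation.

    Assumption for illustration: the read layout is
    [UMI][common sequence][fragment], and the common sequence must match
    exactly. Returns None when that layout is not found.
    """
    umi_end = umi_length
    prefix_end = umi_end + len(common_sequence)

    # Too short to contain both the UMI and the common sequence.
    if len(read) < prefix_end:
        return None

    # The common sequence anchors the position of the UMI.
    if read[umi_end:prefix_end] != common_sequence:
        return None

    return AnnotatedRead(sequence=read[prefix_end:], umi=read[:umi_end])


if __name__ == "__main__":
    example = "ACGTACGTACGT" + "ATTGGAGTCCT" + "GGGCCCAAATTT"
    print(remove_and_annotate(example, umi_length=12, common_sequence="ATTGGAGTCCT"))
```

Reads where the common sequence is not found at the expected position are returned as `None` here, so that a caller can count them separately from reads with a UMI.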

The tool can be found in the Toolbox here:

        Toolbox | Biomedical Genomics Analysis | UMI Tools | Remove and Annotate with Unique Molecular Index

In the first dialog, select the sequence list(s) containing the reads.

In the Settings dialog (figure 4.1), the following options are available:

Figure 4.1: Settings.

A report can be generated that contains information about the number of reads processed, and the number and fraction of reads found to have UMIs. It also includes a plot of the nucleotide distribution per position of the UMI barcode.

This report can be used together with the Combine Reports tool (see http://resources.qiagenbioinformatics.com/manuals/clcgenomicsworkbench/current/index.php?manual=Combine_Reports.html).
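As a rough illustration of the kind of summary the report contains, the following Python sketch computes the number of reads processed, the number and fraction of reads with a UMI, and the nucleotide counts per UMI position. It is an assumption-based sketch, not the tool's report logic. The function name `summarize_umis` and the convention of representing a read without a UMI as `None` are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Optional, Sequence


def summarize_umis(umis: Sequence[Optional[str]]) -> Dict[str, object]:
    """Summarize UMI detection for a set of reads.

    Each entry is the UMI found for one read, or None when no UMI was found
    (a convention assumed for this sketch).
    """
    reads_processed = len(umis)
    found = [umi for umi in umis if umi is not None]
    fraction = len(found) / reads_processed if reads_processed else 0.0

    # Count nucleotides at each UMI position across all reads with a UMI.
    longest = max((len(umi) for umi in found), default=0)
    per_position: List[Counter] = [
        Counter(umi[i] for umi in found if len(umi) > i) for i in range(longest)
    ]

    return {
        "reads_processed": reads_processed,
        "reads_with_umi": len(found),
        "fraction_with_umi": fraction,
        "nucleotides_per_position": per_position,
    }


if __name__ == "__main__":
    summary = summarize_umis(["ACG", "ATG", None, "ACT"])
    for key, value in summary.items():
        print(key, value)
```

The per-position counts correspond to the data behind a nucleotide distribution plot. A strongly skewed distribution at some position may indicate that the UMI is not being located where expected.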