Create UMI Reads for miRNA

UMI reads are created with the Create UMI Reads for miRNA tool. This tool takes a sequence list (reads including UMI sequences) as input and outputs a new sequence list in which reads have been merged into UMI reads. The resulting sequence list contains only the small RNA sequences, which are annotated with the UMIs rather than containing them. The output of this tool can be used directly in the small RNA quantification tool.

The expected read structure of the original (untrimmed) input is illustrated in figure 4.9. The small RNA sequence is expected at the start of the read, so it is important to trim the 5' adapter from Ion Torrent reads before running the Create UMI Reads for miRNA tool.

Image miRNAoriginalstructure
Figure 4.9: Illumina and Ion Torrent expected read structure.

The steps for merging UMI reads are as follows:

1/ The structure of the reads is analyzed. The common sequence is located, and from it the small RNA sequence (preceding the common sequence) and the UMI (the 12 nucleotides following the common sequence) are identified. Reads in which the common sequence is not found, or in which the length of the small RNA or UMI does not fulfill the criteria, are discarded. It can be configured whether the common sequence must match exactly or whether mismatches are allowed and, if so, how many and whether they may include indels.

2/ Reads, stripped of common sequence and 3' adapter/junk, are grouped into UMI read groups based on exact identity of the small RNA sequence and the UMI. Each UMI read group keeps track of the number of reads merged into it, as well as the average nucleotide-level quality scores (if any) both for the small RNA sequence and the UMI part.

3/ An attempt is then made to merge "singleton" UMI read groups (those containing only 1 read) into one of the existing UMI read groups, based on how closely the UMI and the sequence match. The maximum number of mismatches allowed for UMIs is 1. In addition, as is the case with the Create UMI Reads tool (see Create UMI Reads tool):

Each resulting UMI read group produces one read, without the UMI fragment, in the output sequence list. Details on the statistics can be studied in the generated report.
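The merging steps above can be illustrated with a minimal Python sketch. It is not the tool's implementation: the common sequence, the length criteria, and all function names are hypothetical placeholders, the common sequence is matched exactly only, quality scores are ignored, and singleton merging is assumed to require an identical small RNA sequence and to prefer the largest matching group.

```python
from collections import Counter

COMMON = "AACTGTAGGCACCATCAAT"  # hypothetical common sequence, for illustration only
UMI_LEN = 12                    # from the text: 12 nucleotides follow the common sequence
MIN_RNA, MAX_RNA = 15, 55       # hypothetical small RNA length criteria


def parse_read(read):
    """Step 1: return (small_rna, umi), or None if the read is discarded."""
    pos = read.find(COMMON)  # exact match only; the tool can also allow mismatches
    if pos < 0:
        return None
    small_rna = read[:pos]
    start = pos + len(COMMON)
    umi = read[start:start + UMI_LEN]  # anything after the UMI is dropped
    if not (MIN_RNA <= len(small_rna) <= MAX_RNA) or len(umi) != UMI_LEN:
        return None
    return small_rna, umi


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def group_reads(reads):
    """Step 2: group reads by exact (small_rna, umi) identity and count them."""
    groups = Counter()
    discarded = 0
    for read in reads:
        parsed = parse_read(read)
        if parsed is None:
            discarded += 1
        else:
            groups[parsed] += 1
    return groups, discarded


def merge_singletons(groups, max_umi_mismatches=1):
    """Step 3: fold singleton groups into a close non-singleton group.

    Assumption: the small RNA sequence must be identical and the UMI may
    differ by at most max_umi_mismatches; the largest candidate wins.
    """
    large = {key: n for key, n in groups.items() if n > 1}
    merged = Counter(large)
    for (rna, umi), n in groups.items():
        if n != 1:
            continue
        candidates = [
            key for key in large
            if key[0] == rna and hamming(key[1], umi) <= max_umi_mismatches
        ]
        if candidates:
            target = max(candidates, key=lambda key: large[key])
            merged[target] += 1
        else:
            merged[(rna, umi)] = 1  # no close group: the singleton is kept
    return merged
```

Under these assumptions, each key of the dictionary returned by merge_singletons corresponds to one output read (the small RNA sequence, annotated with its UMI), and the associated count is the number of reads merged into that UMI read group.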

To start the tool, go to:

        Toolbox | Biomedical Genomics Analysis | UMI Tools | Create UMI Reads for miRNA

In the first dialog, choose the sequence list containing miRNA reads including UMI sequences as input. Then click Next to configure the following parameters (figure 4.10):

Image createmiRNAUMIs
Figure 4.10: Input and reference parameters for the Create UMI Reads for miRNA tool.

The tool will output a sequence list of UMI reads, i.e., one read for each merged UMI read group. In the last dialog, choose whether you would like to output a report indicating how many reads were ignored and why they were not included in a UMI read. The report also contains group size statistics (see UMI group sizes) that are useful for QC. You can also output a file containing the discarded reads before opening or saving your results.

Consensus nucleotide calculation is performed following the method described in [Hiatt et al., 2013], and can be summarized as follows: