Over-representation analysis
The 5mer analysis examines the enrichment of penta-nucleotides. The enrichment of a 5mer is calculated as the ratio of its observed frequency to its expected frequency. The expected frequency is the product of the empirical probabilities of the nucleotides that make up the 5mer. (Example: for the 5mer CCCCC, if cytosine makes up 20% of the bases in the examined sequences, the expected frequency is 0.2^5 = 0.00032.) Note that 5mers containing ambiguous bases (anything other than A/T/C/G) are ignored.
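The calculation described above can be sketched in a few lines of Python. This is only an illustration, not the tool's actual implementation: the function names (base_probabilities, expected_frequency, kmer_enrichment) are hypothetical, and the sketch assumes that ambiguous bases are also left out of the background nucleotide counts.

```python
from collections import Counter
from math import prod

VALID = set("ACGT")

def base_probabilities(seqs):
    """Empirical A/C/G/T probabilities (assumption: ambiguous bases are not counted)."""
    counts = Counter(b for s in seqs for b in s.upper() if b in VALID)
    total = sum(counts.values())
    if total == 0:
        return {b: 0.0 for b in "ACGT"}
    return {b: counts[b] / total for b in "ACGT"}

def expected_frequency(kmer, probs):
    """Product of the probabilities of the bases that make up the k-mer."""
    return prod(probs[b] for b in kmer)

def kmer_enrichment(seqs, k=5):
    """Observed/expected ratio for every k-mer without ambiguous bases."""
    probs = base_probabilities(seqs)
    observed = Counter()
    for s in seqs:
        s = s.upper()
        for i in range(len(s) - k + 1):
            kmer = s[i:i + k]
            if set(kmer) <= VALID:  # skip k-mers containing N etc.
                observed[kmer] += 1
    total = sum(observed.values())
    return {
        kmer: (n / total) / expected_frequency(kmer, probs)
        for kmer, n in observed.items()
    }
```

With a cytosine probability of 0.2, expected_frequency("CCCCC", probs) gives 0.2^5, matching the example above. An enrichment above 1 means the 5mer occurs more often than the base composition alone would predict.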
- Individual 5mer distribution
- Calculates the absolute coverage and the enrichment (observed/expected, based on the background distribution of nucleotides) of each 5mer at each base position, and plots enrichment against position for the five most enriched 5mers, if present. This analysis reveals whether there is a pattern of bias at particular positions along the read length. Such a bias might originate from untrimmed adapter sequences, poly-A tails or other sources.
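As a rough sketch of the per-position calculation, the following Python function tabulates enrichment by start position and keeps the most enriched 5mers. It is not the tool's own code: the name positional_enrichment is hypothetical, the ranking by maximum positional enrichment is an assumption (the text does not say how the top five are chosen), and plotting is left out.

```python
from collections import Counter, defaultdict
from math import prod

def positional_enrichment(reads, k=5, top=5):
    """Return {kmer: {position: enrichment}} for the `top` most enriched k-mers."""
    valid = set("ACGT")

    # Background nucleotide probabilities over all reads (ambiguous bases skipped).
    base_counts = Counter(b for r in reads for b in r.upper() if b in valid)
    n_bases = sum(base_counts.values())
    if n_bases == 0:
        return {}
    probs = {b: base_counts[b] / n_bases for b in "ACGT"}

    # Absolute k-mer counts per start position.
    per_pos = defaultdict(Counter)
    for r in reads:
        r = r.upper()
        for i in range(len(r) - k + 1):
            kmer = r[i:i + k]
            if set(kmer) <= valid:
                per_pos[i][kmer] += 1

    # Observed frequency at each position divided by the expected frequency.
    enrichment = defaultdict(dict)
    for pos, counts in per_pos.items():
        total = sum(counts.values())
        for kmer, n in counts.items():
            expected = prod(probs[b] for b in kmer)
            enrichment[kmer][pos] = (n / total) / expected

    # Assumption: rank k-mers by their highest enrichment at any position.
    ranked = sorted(enrichment, key=lambda km: max(enrichment[km].values()), reverse=True)
    return {km: enrichment[km] for km in ranked[:top]}
```

Plotting the returned values against position for each 5mer would show, for instance, an adapter remnant as a sharp peak near the read end where it occurs.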