Prepare Raw Data
The Prepare Raw Data workflow performs quality control and trimming of the sequencing reads.
Launching the workflow
The Prepare Raw Data workflow is at:
Toolbox | Template Workflows | Preparing Raw Data | Prepare Raw Data
See figure 12.63.
Figure 12.63: Template Workflows in the Toolbox.
Launch the workflow and step through the wizard.
- Select the reads to be processed (figure 12.64).
Figure 12.64: Select the sequence lists containing the reads to be processed.
- If more than one sequence list has been selected, check the Batch checkbox to analyze each input separately. This will generate a trimmed sequence list and QC report for each sequence list provided. See Running workflows in batch mode for further details about Batch mode.
If multiple sequence lists are input and the workflow is not run in Batch mode, a single sequence list and a single QC report are generated, based on all the reads from all the sequence lists.
- In the next step, choose "Use organization of input data" to specify how to define the batch units.
- Next, you can review the batch units resulting from your selections above.
- Settings for the Trim Reads tool can then be reviewed and adjusted (figure 12.65).
- In the next step, you can click on Preview All Parameters to review your settings and specify how results should be handled.
- In the final step, you choose a location to save the results to.
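The difference between running with and without Batch mode, as described in the steps above, can be pictured with a small sketch. This is only an illustration of how inputs are grouped, not the workflow's actual implementation: the function name `plan_outputs`, the unit name `all_inputs`, and the sample names are hypothetical and do not correspond to anything in the software.

```python
from typing import Dict, List


def plan_outputs(sequence_lists: Dict[str, List[str]], batch: bool) -> Dict[str, List[str]]:
    """Group reads into the units that would each yield one trimmed
    sequence list and one QC report (hypothetical model, for illustration)."""
    if batch:
        # Batch mode: every input sequence list is its own unit,
        # so each input gets its own outputs.
        return {name: list(reads) for name, reads in sequence_lists.items()}
    # Not in Batch mode: reads from all inputs are pooled into a single unit,
    # so a single set of outputs is produced.
    pooled = [read for reads in sequence_lists.values() for read in reads]
    return {"all_inputs": pooled}


# Hypothetical inputs: two sequence lists with a few reads each.
inputs = {"sample_A": ["ACGT", "GGCA"], "sample_B": ["TTAG"]}

batch_units = plan_outputs(inputs, batch=True)
pooled_units = plan_outputs(inputs, batch=False)

print(len(batch_units), "unit(s) in Batch mode")
print(len(pooled_units), "unit(s) without Batch mode")
```

With two input sequence lists, the sketch yields two units in Batch mode and one pooled unit otherwise, mirroring the number of trimmed sequence lists and QC reports described above.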
Tools in the workflow and outputs generated
The Prepare Raw Data workflow contains two tools, QC for Sequencing Reads and Trim Reads (figure 12.66).
Figure 12.66: The design of the Prepare Raw Data workflow.
The outputs provided by this workflow are:
- QC graphic report and QC supplementary report. See QC for Sequencing Reads for further details about these reports.
- Trimming report. See NGS Trim Reads tool for further details.
- Sequence Lists containing trimmed paired reads and broken paired reads.
The reports generated should be inspected to determine whether the quality of the sequencing reads and the trimming are acceptable. If the quality is acceptable, the trimmed reads can be used in downstream analyses.
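As a rough illustration of why the outputs listed above include both trimmed paired reads and broken paired reads, the sketch below models one way a pair can become broken: if trimming discards one read of a pair, the surviving read no longer has a mate. This is a simplified model under stated assumptions, not the Trim Reads tool's algorithm. The function `split_pairs` and the `min_length` parameter are hypothetical names introduced for this example, and `None` stands in for a read that was removed entirely.

```python
from typing import List, Optional, Tuple

Pair = Tuple[Optional[str], Optional[str]]


def split_pairs(trimmed_pairs: List[Pair], min_length: int = 1) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Separate pairs where both reads survive trimming from reads
    whose mate was discarded (hypothetical model, for illustration)."""
    intact: List[Tuple[str, str]] = []
    broken: List[str] = []
    for read1, read2 in trimmed_pairs:
        # A read is kept only if it still exists and is long enough (assumed rule).
        keep1 = read1 is not None and len(read1) >= min_length
        keep2 = read2 is not None and len(read2) >= min_length
        if keep1 and keep2:
            intact.append((read1, read2))  # both mates survive: still a pair
        elif keep1:
            broken.append(read1)  # mate was discarded: broken pair
        elif keep2:
            broken.append(read2)  # mate was discarded: broken pair
        # If neither read survives, nothing is kept.
    return intact, broken


# Hypothetical trimmed pairs.
example_pairs: List[Pair] = [
    ("ACGTAC", "GGCATT"),  # both reads long enough
    ("ACG", None),         # second read removed by trimming
    (None, None),          # both reads removed
    ("A", "TTAGGC"),       # first read too short after trimming
]

paired, broken_pairs = split_pairs(example_pairs, min_length=3)
print(len(paired), "intact pair(s);", len(broken_pairs), "broken paired read(s)")
```

In this model, the intact pairs correspond to the sequence list of trimmed paired reads and the leftover single reads correspond to the broken paired reads.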