Extract sequences

This tool allows the extraction of sequences from other types of data in the Workbench, such as mappings or alignments. The data types you can extract sequences from are:

When this tool is run, all sequences are extracted from the data used as input. If only a subset of the sequences is desired, for example the reads from a small area of a mapping or the sequences for only a few BLAST results, create a data set containing just that subset and run the Extract Sequences tool on it.

The Extract Sequences tool can be launched via the Toolbox menu:

        Toolbox | Classical Sequence Analysis (Image gene_and_protein_analysis) | General Sequence Analysis (Image generalsequenceanalyses) | Extract Sequences (Image extractsequences)

Alternatively, for all the data types listed above except sequence lists, you can run this tool by right-clicking in the relevant area: on a row in a table or in the read area of mapping data. An example is shown in figure 14.1.

Please note that for mappings, only the read sequences are extracted. Reference and consensus sequences are not extracted using this tool. Similarly, when extracting sequences from BLAST results, the sequence hits are extracted, not the original query sequence or a consensus sequence.
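The rule above can be illustrated with a small conceptual sketch. This is not the Workbench API: the classes `Sequence`, `ReadMapping` and `BlastResult` and the function `extract_sequences` are hypothetical names, made up only to show which sequences end up in the output.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Sequence:
    name: str
    residues: str


@dataclass
class ReadMapping:
    # Hypothetical model of a mapping: one reference, one consensus, many reads.
    reference: Sequence
    consensus: Sequence
    reads: List[Sequence] = field(default_factory=list)


@dataclass
class BlastResult:
    # Hypothetical model of a BLAST result: one query, many hits.
    query: Sequence
    hits: List[Sequence] = field(default_factory=list)


def extract_sequences(data: Union[ReadMapping, BlastResult]) -> List[Sequence]:
    """Toy model of what is extracted from each input type."""
    if isinstance(data, ReadMapping):
        # Only the reads; reference and consensus are left out.
        return list(data.reads)
    if isinstance(data, BlastResult):
        # Only the hits; the query sequence is left out.
        return list(data.hits)
    raise TypeError(f"unsupported input type: {type(data).__name__}")


mapping = ReadMapping(
    reference=Sequence("ref", "ACGTACGT"),
    consensus=Sequence("cons", "ACGTACGT"),
    reads=[Sequence("read1", "ACGT"), Sequence("read2", "TACG")],
)
blast = BlastResult(
    query=Sequence("query", "ACGTAC"),
    hits=[Sequence("hit1", "ACGTAC")],
)

mapping_names = [s.name for s in extract_sequences(mapping)]
blast_names = [s.name for s in extract_sequences(blast)]
```

In this sketch, `mapping_names` contains only the two read names and `blast_names` contains only the hit name.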

Image extract_sequences_rightclick
Figure 14.1: Right-click somewhere in the reads track area and select "Extract Sequences".

Image extract_sequences
Figure 14.2: Choosing whether the extracted sequences should be placed in a new list or as single sequences.

The dialog allows you to select the Destination. Here you can choose whether the extracted sequences should be saved as single sequences or placed in a new sequence list. For most data types, extracting into a sequence list makes the most sense. The exception is when the input is itself a sequence list, where extracting to a sequence list would simply create a copy of that list. In this case, you would generally choose the other option, which generates an individual sequence object for each sequence in the list.
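The difference between the two Destination options can be sketched as follows. This is a conceptual illustration only, assuming a hypothetical function `place_in_destination` and plain strings standing in for sequences; it does not reflect how the Workbench stores data internally.

```python
from typing import List


def place_in_destination(sequences: List[str], as_list: bool) -> List[List[str]]:
    """Return the objects that would be created.

    Each inner list stands for one saved object (hypothetical model).
    """
    if as_list:
        # One new sequence list holding every extracted sequence.
        return [list(sequences)]
    # One individual sequence object per extracted sequence.
    return [[seq] for seq in sequences]


reads = ["ACGT", "TACG", "GGCA"]
as_one_list = place_in_destination(reads, as_list=True)
as_singles = place_in_destination(reads, as_list=False)
```

With three input sequences, the first call models a single list object and the second models three separate sequence objects, which is why the single-sequence option is the useful one when the input is already a sequence list.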

Below these options, in the dialog, you can see the number of sequences that will be extracted.

Note! When the Extract Sequences tool is run, all sequences are extracted from the data used as input. If only a subset of the sequences is desired, for example the reads from a small area of a mapping or the sequences for only a few BLAST results, create a data set containing just that subset and run the Extract Sequences tool on it. For extracting a subset of a mapping, see Extract parts of a mapping, which describes the function "Extract from Selection", also available from the right-click menu (see figure 14.1).
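The "subset first, then extract" workflow described in the note can be sketched conceptually. The `MappedRead` class and `reads_in_region` function are hypothetical, and the overlap rule used here (any read that overlaps the region is kept) is an assumption made for illustration; the source does not state how "Extract from Selection" treats partially overlapping reads.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MappedRead:
    name: str
    start: int  # 0-based, inclusive (assumed convention for this sketch)
    end: int    # exclusive


def reads_in_region(
    reads: List[MappedRead], region_start: int, region_end: int
) -> List[MappedRead]:
    """Keep reads that overlap the half-open region [region_start, region_end)."""
    return [r for r in reads if r.start < region_end and r.end > region_start]


all_reads = [
    MappedRead("read1", 0, 50),
    MappedRead("read2", 40, 90),
    MappedRead("read3", 200, 250),
]

# Step 1: build the subset. Step 2: extraction would then run on the subset only.
subset = reads_in_region(all_reads, 30, 100)
subset_names = [r.name for r in subset]
```

Running extraction on `subset` rather than `all_reads` mirrors the advice in the note: the tool itself does no filtering, so the input must already be limited to the sequences of interest.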