Understanding the Bowtie configuration

Once bowtie.xml has been imported, you can click the CLC bio Bowtie Map header to see the configuration, as shown in figure 7.7.

Image bowtie1
Figure 7.7: The Bowtie configuration has been imported.

The basic configuration is very similar to the Velvet set-up.

The reads parameter type is User-selected input data, meaning that users will be able to select the data to use. The data type for this parameter is set to FASTA (.fa/.fsa/.fasta), meaning that the data selected by the user is exported from the Server in FASTA format. This is the format that Bowtie will then receive as input.

The index parameter will allow the user to specify the name of the index file to use in the mapping. This could be simplified for the user, and made less subject to error, if one were to choose CSV enum as a parameter type instead of type Text. This would allow an administrator to provide a fixed set of index files that the user could choose from via a drop-down menu in the Workbench Wizard.
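The difference between a free-text index name and a fixed list of choices can be pictured with a small sketch. The Python below is illustrative only and is not part of the Server: the function names and the index names (e_coli, h_sapiens) are hypothetical, and the code only models the idea of an administrator-defined, comma-separated list replacing free text.

```python
def parse_enum(csv_values):
    """Split an administrator-supplied comma-separated list into choices."""
    return [v.strip() for v in csv_values.split(",") if v.strip()]


def select_index(choice, allowed):
    """Accept only a value that appears in the fixed list of choices."""
    if choice not in allowed:
        raise ValueError(f"unknown index {choice!r}; choose one of {allowed}")
    return choice


# Hypothetical index names that an administrator might offer.
allowed_indexes = parse_enum("e_coli, h_sapiens")
```

With free text, a mistyped name would only be discovered when Bowtie fails to find the index. With a fixed list, such a name can never be submitted in the first place.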

The last two options, max number of mismatches and report all matches, are passed to the Bowtie mapper using the values selected by the user in the Workbench Wizard.
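To make the role of each parameter concrete, the following Python sketch assembles a Bowtie 1 argument list from the values described above. It is an assumption about how the values could map to a command line, not the command line stored in bowtie.xml. The function name and file names are hypothetical. The flags -f (FASTA reads), -S (SAM output), -v (maximum mismatches) and -a (report all alignments) are standard Bowtie 1 options.

```python
def build_bowtie_command(reads_fasta, index, max_mismatches, report_all, sam_file):
    """Assemble a Bowtie 1 argument list from the configured parameter values."""
    cmd = ["bowtie", "-f", "-S"]           # -f: reads are FASTA, -S: write SAM output
    cmd += ["-v", str(max_mismatches)]     # -v: maximum number of mismatches
    if report_all:
        cmd.append("-a")                   # -a: report all valid alignments
    cmd += [index, reads_fasta, sam_file]  # positional: index, reads, output file
    return cmd


# Hypothetical values, standing in for the user's choices in the Wizard.
example = build_bowtie_command("reads.fa", "e_coli", 2, True, "out.sam")
```

The list could be passed to subprocess.run on a machine where Bowtie is installed. The sketch itself only builds the arguments and runs nothing.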

The sam file parameter is set as Output file from CL, and the option selected is Do not import. The value of sam file determines what the output file from Bowtie is called. A SAM or BAM file is imported using the SAM/BAM import functionality of the Server, but this requires both the output from Bowtie and access to the reference sequences used for the mapping. We therefore defer the import of the Bowtie output by choosing Do not import, and handle it with the Post-processing functionality instead. There we can add a further parameter to those presented to the user in the Wizard, this time asking for the reference sequences. With this additional information, the SAM file can be imported as a mapping object.

If you expand the Post-processing panel, you can see the logic needed to handle the SAM file from Bowtie together with the reference sequence provided by the user (see figure 7.8).

Image bowtie2
Figure 7.8: The Bowtie post-processing set-up.

At the top, there is a panel for specifying End user parameters for post processing only, which in this case is the reference sequence.

The post-processing algorithm is then specified, in this case Import SAM/BAM Files. The input parameters for this import process are then specified below that. The parameters presented are those you specify within the Post-processing section itself, as well as all those parameters from the top section which could be used as input to the algorithm you have chosen to run.

Here, you get entries for reads, which is not relevant, as well as for reference seq. This is because the system sees both of these as sequences and has no way to tell which is the relevant sequence object for this job. The administrator must therefore choose which one is relevant. Here it is the reference seq object, so it is given the value Input data. This choice means that the user will get to select a reference sequence list via the Workbench Wizard when setting up their Bowtie mapping job. The sam file is chosen as the file to be imported.
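Why both reads and reference seq are offered can be illustrated with a minimal sketch. The Python below is a hypothetical model, not the Server's implementation: the Parameter class and the kind labels are invented for illustration. It shows that selecting candidates by data kind alone cannot distinguish the reads from the reference.

```python
from dataclasses import dataclass


@dataclass
class Parameter:
    name: str
    kind: str  # hypothetical label for the kind of data the parameter carries


def candidate_inputs(parameters, accepted_kind):
    """Return names of all parameters whose data kind the importer could accept."""
    return [p.name for p in parameters if p.kind == accepted_kind]


params = [
    Parameter("reads", "sequence"),
    Parameter("index", "text"),
    Parameter("sam file", "file"),
    Parameter("reference seq", "sequence"),
]
candidates = candidate_inputs(params, "sequence")
```

Both sequence parameters come back as candidates, which is why the final choice is left to the administrator.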