Create local BLAST databases
You can create a BLAST database for use in local BLAST searches, and specify the location on your computer where the database files are saved. When you set up a BLAST search against local data, the Workbench lists the BLAST databases found in the locations you have specified.
DNA, RNA, and protein sequences located in the Navigation Area can be used to create BLAST databases. Each BLAST database can contain only one molecule type. If you wish to use a pre-formatted BLAST database instead, see Add pre-formatted BLAST databases.
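The one-molecule-type rule can be checked before sequences are gathered for a database. The following is a minimal sketch in Python, not part of the Workbench: the function names `guess_molecule_type` and `single_molecule_type` are hypothetical, and the type is guessed from the letters in each sequence, which is only a heuristic (a short protein made up entirely of letters that are also nucleotide codes would be misclassified).

```python
from typing import Mapping

# IUPAC nucleotide codes, including ambiguity codes (assumption: this
# alphabet is used only for the heuristic below).
NUCLEOTIDE_LETTERS = set("ACGTUNRYKMSWBDHV")


def guess_molecule_type(sequence: str) -> str:
    """Guess 'DNA', 'RNA' or 'protein' from the letters in a sequence."""
    letters = set(sequence.upper()) - {"-", "*"}
    if not letters:
        raise ValueError("cannot guess the type of an empty sequence")
    if letters <= NUCLEOTIDE_LETTERS:
        # U without T suggests RNA; everything else is treated as DNA.
        return "RNA" if "U" in letters and "T" not in letters else "DNA"
    return "protein"


def single_molecule_type(sequences: Mapping[str, str]) -> str:
    """Return the shared molecule type, or raise if the types are mixed."""
    types = {guess_molecule_type(seq) for seq in sequences.values()}
    if len(types) != 1:
        raise ValueError(f"mixed molecule types: {sorted(types)}")
    return types.pop()
```

Calling `single_molecule_type` on a mapping of names to sequences returns the common type, or raises `ValueError` when, for example, a protein has been mixed in with DNA sequences.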
To create a BLAST database, go to:
Toolbox | BLAST | Create BLAST Database
This opens the dialog seen in figure 14.14.
Figure 14.14: Add sequences for the BLAST database.
Select the sequences or sequence lists you wish to include in your database and click Next.
In the next dialog, shown in figure 14.15, you provide the following information:
- Name. The name of the BLAST database. This name will be used when running BLAST searches and also as the base file name for the BLAST database files.
- Description. A short description. This is displayed along with the database name in the list of available databases when launching a local BLAST search. If no description is entered, the creation date is used as the description.
- Location. The location where the BLAST database files are saved. You can add or change the locations in this list using the Manage BLAST databases dialog.
Figure 14.15: Providing a name and description for the database, and the location to save the files to.
Click Finish to create the BLAST database. Once the process is complete, the new database will be available in the Manage BLAST databases dialog, and when running BLAST against local data.
Create BLAST Database creates BLAST+ version 4 (dbV4) databases.
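For comparison, a version 4 database can also be made outside the Workbench with the standalone BLAST+ program makeblastdb. The sketch below, in Python, only assembles such a command line. It is not what Create BLAST Database runs internally and does not reproduce its handling of sequence names. The helper name `makeblastdb_command` and the example file and folder names are hypothetical; the mapping of Name, Description and Location onto the `-out` and `-title` options is an assumption made for illustration.

```python
from pathlib import Path
from typing import List


def makeblastdb_command(
    fasta: str,
    name: str,
    location: str,
    dbtype: str,
    description: str = "",
) -> List[str]:
    """Build a makeblastdb command line for a version 4 BLAST database.

    dbtype must be 'nucl' or 'prot', because a database holds one
    molecule type only.
    """
    if dbtype not in ("nucl", "prot"):
        raise ValueError("dbtype must be 'nucl' or 'prot'")
    command = [
        "makeblastdb",
        "-in", fasta,
        "-dbtype", dbtype,
        # Base file name of the database files inside the chosen location.
        "-out", str(Path(location) / name),
        # Request the version 4 database format.
        "-blastdb_version", "4",
    ]
    if description:
        command += ["-title", description]
    return command
```

If BLAST+ is installed, the returned list could be passed to `subprocess.run(command, check=True)`, for example with `makeblastdb_command("sequences.fasta", "my_db", "blastdbs", "nucl", "Example database")`.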
Sequence identifiers and BLAST databases
The underlying BLAST+ program for making databases, makeblastdb, places restrictions on the length and format of sequence identifiers and does not accept duplicates. These restrictions do not apply when making databases using Create BLAST Database.
Internal handling of sequence names, introduced in version 21.0, allows this level of naming flexibility with newer versions of BLAST+. As a consequence, databases created using Create BLAST Database in CLC Main Workbench, CLC Genomics Workbench or CLC Genomics Server version 21.0 and later are intended for use only with the BLAST search tools in these software versions.
There should be no obvious effects of this internal handling of sequence names on local BLAST search results, including the names written to BLAST reports.
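If the same sequences are also to be used with makeblastdb directly, where the identifier restrictions do apply, the names can be screened beforehand. The following Python sketch illustrates the three kinds of restriction mentioned above (length, format, duplicates). The function name `identifier_problems` is hypothetical, and the specific rules are assumptions: a 50-character limit and a ban on whitespace are used here as stand-ins, so the exact limits should be checked against the BLAST+ version in use.

```python
from collections import Counter
from typing import Dict, Iterable, List

# Assumed maximum identifier length; verify against your BLAST+ version.
MAX_ID_LENGTH = 50


def identifier_problems(identifiers: Iterable[str]) -> Dict[str, List[str]]:
    """Map each problematic identifier to the list of issues found."""
    problems: Dict[str, List[str]] = {}
    for identifier, count in Counter(identifiers).items():
        issues = []
        if count > 1:
            issues.append("duplicate")
        if len(identifier) > MAX_ID_LENGTH:
            issues.append("too long")
        if any(ch.isspace() for ch in identifier):
            issues.append("contains whitespace")
        if issues:
            problems[identifier] = issues
    return problems
```

An empty result means that no identifier broke the assumed rules. Identifiers that are flagged would need renaming only for direct use with makeblastdb, not for Create BLAST Database.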