Create local BLAST databases

You can create a database for use in local BLAST searches and choose the location on your computer where the database files are saved. The Workbench lists the BLAST databases found in the locations you have specified when you set up a BLAST search against local data.

DNA, RNA, and protein sequences located in the Navigation Area can be used to create BLAST databases. Any given BLAST database can include only one molecule type. To use a pre-formatted BLAST database instead, see Add pre-formatted BLAST databases.
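As an illustration of the one-molecule-type rule, the following minimal sketch checks whether a set of raw sequences appears to mix nucleotide and protein data. It is not part of the Workbench. The helper names and the alphabet-based guess are assumptions made for this example, and the heuristic does not distinguish DNA from RNA.

```python
# Hypothetical helper: letters treated as nucleotide codes in this sketch.
NUCLEOTIDE_LETTERS = set("ACGTUN")


def guess_molecule_type(sequence):
    """Crude guess: 'nucleotide' if only ACGTUN letters occur, else 'protein'."""
    # Ignore gap and stop symbols before comparing against the alphabet.
    letters = set(sequence.upper()) - {"-", "*"}
    return "nucleotide" if letters <= NUCLEOTIDE_LETTERS else "protein"


def molecule_types(sequences):
    """Return the set of guessed molecule types found in the input."""
    return {guess_molecule_type(s) for s in sequences}


mixed = ["ACGTTGCA", "MKVLAAGIW"]
single = ["ACGTTGCA", "GGCCNNAT"]
# A set with more than one entry suggests the input mixes molecule types.
print(sorted(molecule_types(mixed)))
print(sorted(molecule_types(single)))
```

A result containing more than one type suggests the sequences should be split into separate databases.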

To create a BLAST database, go to:

        Toolbox | BLAST | Create BLAST Database

This opens the dialog seen in figure 13.14.

Figure 13.14: Add sequences for the BLAST database.

Select the sequences or sequence lists you wish to include in your database and click Next.

In the next dialog, shown in figure 13.15, you provide a name and description for the database, and the location to save the files to.

Figure 13.15: Providing a name and description for the database, and the location to save the files to.

Click Finish to create the BLAST database. Once the process is complete, the new database will be available in the Manage BLAST databases dialog, and when running BLAST against local data.

Create BLAST Database creates BLAST+ version 4 (dbV4) databases.

Sequence identifiers and BLAST databases

The restrictions on sequence identifier length, format, and duplicates that are present in makeblastdb, the underlying BLAST+ program for making databases, do not apply when making databases using Create BLAST Database.

Internal handling of sequence names, introduced in version 21.0, allows this level of naming flexibility with newer versions of BLAST+. As a consequence, databases created using Create BLAST Database in CLC Main Workbench, CLC Genomics Workbench, or CLC Genomics Server version 21.0 and later are intended for use only with the BLAST search tools in these software versions.

There should be no obvious effects of this internal handling of sequence names on local BLAST search results, including the names written to BLAST reports.
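
To make the kinds of identifier restrictions mentioned above concrete, the following minimal sketch scans FASTA text for duplicate identifiers and for identifiers longer than a chosen length. It is an illustration only and not part of the Workbench or BLAST+. The function name is hypothetical, and the default limit of 50 characters is an assumed placeholder, not a documented makeblastdb value.

```python
from collections import Counter


def check_identifiers(fasta_text, max_len=50):
    """Return (duplicates, too_long) for the identifiers in FASTA text.

    max_len is an illustrative assumption, not a documented limit.
    """
    # The identifier is taken as the first whitespace-delimited token
    # after '>' on each header line.
    ids = [
        line[1:].split()[0]
        for line in fasta_text.splitlines()
        if line.startswith(">") and line[1:].strip()
    ]
    counts = Counter(ids)
    duplicates = sorted(name for name, n in counts.items() if n > 1)
    too_long = sorted({name for name in ids if len(name) > max_len})
    return duplicates, too_long


example = ">seq1 first\nACGT\n>seq1 second\nGGCC\n>" + "x" * 60 + "\nTTAA\n"
duplicates, too_long = check_identifiers(example)
print(duplicates)
print([len(name) for name in too_long])
```

A check like this could be useful when the same sequences are also to be formatted with makeblastdb outside the Workbench, where such restrictions may apply.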