BLAST at NCBI

When running a BLAST search at the NCBI, the Workbench sends the sequences you select to the NCBI's BLAST servers. When the results are ready, they are automatically downloaded and displayed in the Workbench. When you enter a large number of sequences for searching with BLAST, the Workbench automatically splits the sequences into smaller subsets and sends one subset at a time to the NCBI. This avoids exceeding any internal limits the NCBI places on the number of sequences that can be submitted for BLAST searching. The size of each subset created by the CLC software depends on both the number and the size of the sequences.
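The splitting described above can be pictured with a minimal sketch. The Workbench's actual limits and splitting rules are not documented here, so the function name `split_into_batches` and the thresholds `max_count` and `max_residues` are hypothetical and chosen only for illustration.

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (sequence name, sequence residues)


def split_into_batches(
    records: Iterable[Record],
    max_count: int = 100,        # hypothetical cap on sequences per subset
    max_residues: int = 10_000,  # hypothetical cap on total residues per subset
) -> Iterator[List[Record]]:
    """Yield subsets that respect both a sequence-count and a total-size cap."""
    batch: List[Record] = []
    residues = 0
    for name, seq in records:
        too_many = len(batch) >= max_count
        too_long = residues + len(seq) > max_residues
        # Close the current subset before adding a sequence that would
        # break either cap. A single oversized sequence still gets its
        # own subset rather than being dropped.
        if batch and (too_many or too_long):
            yield batch
            batch, residues = [], 0
        batch.append((name, seq))
        residues += len(seq)
    if batch:
        yield batch
```

With these assumptions, many short sequences are grouped until the count cap is reached, while a few long sequences fill a subset sooner because of the size cap.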

To start a BLAST job to search your sequences against databases held at the NCBI, go to:

        Tools | BLAST | BLAST at NCBI

Alternatively, use the keyboard shortcut: Ctrl+Shift+B on Windows or ⌘+Shift+B on Mac OS.

In the first wizard step, select one or more sequences or sequence lists of the same type, DNA or protein (figure 26.2).

Figure 26.2: Specify one or more query sequences or sequence lists for the BLAST search.

In the next wizard step, specify the type of BLAST search to run and the database to search (figure 26.3). Only databases relevant to the selected search type will be listed.

Figure 26.3: Specify the type of search to run and the database to search.
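The way the program choice narrows the database list can be sketched as a simple lookup. This is an illustration only, not the Workbench's implementation: the query and database types for each program follow the standard BLAST definitions, while the three-entry `DATABASES` catalog and the helper names `programs_for` and `databases_for` are assumptions made for this example.

```python
# Each BLAST program maps to (query type, type of database it searches).
PROGRAMS = {
    "blastn": ("dna", "nucleotide"),
    "blastx": ("dna", "protein"),         # query is translated
    "tblastx": ("dna", "nucleotide"),     # query and database are translated
    "blastp": ("protein", "protein"),
    "tblastn": ("protein", "nucleotide"),  # database is translated
}

# Illustrative catalog only; the real list is supplied by the NCBI.
DATABASES = {"nt": "nucleotide", "nr": "protein", "pdb": "protein"}


def programs_for(query_type: str) -> list:
    """Return the programs that accept the given query type."""
    return [name for name, (query, _) in PROGRAMS.items() if query == query_type]


def databases_for(program: str) -> list:
    """Return the databases whose type matches what the program searches."""
    db_type = PROGRAMS[program][1]
    return sorted(name for name, kind in DATABASES.items() if kind == db_type)
```

For example, `databases_for("blastx")` returns only the protein databases in the catalog, because blastx compares a translated DNA query against protein sequences.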

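BLAST programs for DNA query sequences:

        blastn: DNA query sequence against a DNA database.
        blastx: Translated DNA query sequence against a protein database.
        tblastx: Translated DNA query sequence against a translated DNA database.

BLAST programs for protein query sequences:

        blastp: Protein query sequence against a protein database.
        tblastn: Protein query sequence against a translated DNA database.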

Note: Hits found in the Protein Data Bank proteins (pdb) database can be downloaded and opened in the 3D view.

In the following wizard step, the settings for the search can be refined (figure 26.4).

Figure 26.4: The settings for the BLAST search can be customized.

If blastx is selected as the program to use, an option for specifying the genetic code to use for translating the query sequence will be available. If tblastx is selected, options for the genetic code to use for translating the database and for translating the query sequences will be available.
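To see why the genetic code matters, the following sketch translates the same DNA sequence with two NCBI translation tables: table 1 (standard) and table 2 (vertebrate mitochondrial). The helper names `make_table` and `translate` are hypothetical, and only the first forward reading frame is translated, whereas blastx and tblastx consider all six frames.

```python
from itertools import product

BASES = "TCAG"

# Amino acids for all 64 codons in NCBI order (TTT, TTC, TTA, TTG, TCT, ...).
STANDARD = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
VERT_MITO = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG"


def make_table(amino_acids: str) -> dict:
    """Build a codon -> amino acid lookup from a 64-letter table string."""
    codons = ("".join(c) for c in product(BASES, repeat=3))
    return dict(zip(codons, amino_acids))


def translate(dna: str, table: dict) -> str:
    """Translate reading frame 1; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(table.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3))


query = "ATGATATGAAGA"  # codons: ATG ATA TGA AGA
print(translate(query, make_table(STANDARD)))   # MI*R
print(translate(query, make_table(VERT_MITO)))  # MMW*
```

The two translations differ at three of the four codons, so selecting the wrong genetic code changes the protein sequence that is compared against the database.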

BLAST search parameters are described below. See https://blast.ncbi.nlm.nih.gov/doc/blast-topics/ for further details.

The parameters you choose affect how long BLAST takes to run. Searching a small database and requesting only hits that meet stringent criteria is generally quick. Searching large databases, or allowing for very remote matches, takes longer.
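As an illustration of stringent versus permissive settings, the sketch below builds two search submissions as query strings using parameter names from the NCBI's public BLAST URL API (CMD, PROGRAM, DATABASE, QUERY, EXPECT, HITLIST_SIZE). This section does not state how the Workbench communicates with the NCBI, so this is not a description of its internals. The function name `build_put_request` and the example values are assumptions, and no request is actually sent.

```python
from urllib.parse import parse_qs, urlencode

# Public NCBI BLAST endpoint; shown for context, not contacted by this sketch.
BLAST_URL = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"


def build_put_request(query: str, program: str, database: str,
                      expect: float, hitlist_size: int) -> str:
    """Return the URL-encoded parameters for submitting a search."""
    params = {
        "CMD": "Put",                  # submit a new search
        "PROGRAM": program,
        "DATABASE": database,
        "QUERY": query,
        "EXPECT": expect,              # E-value threshold: lower is stricter
        "HITLIST_SIZE": hitlist_size,  # number of database hits to keep
    }
    return urlencode(params)


# Stringent: few, highly significant hits.
strict = build_put_request("ACGTACGTACGT", "blastn", "nt", 1e-20, 10)
# Permissive: many hits, including weak matches.
loose = build_put_request("ACGTACGTACGT", "blastn", "nt", 10.0, 500)
print(strict)
print(loose)
```

The stringent request asks the server to report far fewer candidate hits, which is the kind of setting that tends to return results sooner.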

Click Finish to start the tool.

BLAST a partial sequence against NCBI

You can search a database using only a part of a sequence directly from the sequence view:

        select the sequence region to send to BLAST | right-click the selection | BLAST Selection Against NCBI

This opens the dialog shown in figure 26.3 directly. The remaining options are the same as when running a BLAST search with a full sequence.
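Conceptually, searching with a selection means that only the selected residues are used as the query. A minimal sketch, assuming 1-based inclusive positions as shown in a sequence view; the function name `select_region` is hypothetical:

```python
def select_region(sequence: str, start: int, end: int) -> str:
    """Return the residues from start to end (1-based, inclusive)."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("selection must lie within the sequence")
    # Python slices are 0-based and end-exclusive, hence start - 1.
    return sequence[start - 1:end]


full = "ACGTACGTAC"
print(select_region(full, 3, 6))  # GTAC
```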