This tool finds protein structures suitable for representing a given protein sequence. From the resulting table, a structure model (homology model) of the sequence can be created with one click, using one of the found protein structures as the template.
To run the Find and Model Structure tool:
Toolbox | Sequence Analysis | Find and Model Structure
Note: Before running the tool, a protein structure sequence database must be downloaded and installed using the 'Download Find Structure Database' tool (see Download Find Structure Database).
In step 1 of the tool wizard, select the amino acid sequence to use as the query from the Navigation Area.
In step 2, specify whether the output table should be opened or saved.
The Find and Model Structure tool carries out the following steps to find and rank available structures representing the query sequence:
Input: Query protein sequence
- BLAST against protein structure sequence database
- Filter away low quality hits
- Rank the available structures
In the output table (figure 17.12), the column named "Available Structures" contains links that open a menu with two options: create a structure model of the query sequence, or just download the structure. This is further described in Create structure model. The remaining columns contain additional information originating from the PDB file or from the BLAST search.
The three steps carried out by the Find and Model Structure tool are described briefly below.
A local BLAST search is carried out for the query sequence against the protein structure sequence database (see Download Find Structure Database).
BLAST hits with E-value > 0.0001 are rejected and a maximum of 2500 BLAST hits are retrieved. Read more about BLAST in Bioinformatics explained: BLAST.
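The E-value cutoff and the hit limit described above can be illustrated with a minimal Python sketch. This is not the tool's actual implementation: the `BlastHit` class, the `select_hits` function, and the hit identifiers are hypothetical names introduced for illustration, and keeping the hits with the smallest E-values when the limit is reached is an assumption, since the text does not say how the 2500 hits are chosen.

```python
from dataclasses import dataclass

E_VALUE_CUTOFF = 0.0001  # hits with an E-value above this are rejected
MAX_HITS = 2500          # maximum number of BLAST hits retrieved


@dataclass
class BlastHit:
    """Hypothetical container for one BLAST hit (illustration only)."""
    hit_id: str
    e_value: float


def select_hits(hits, e_value_cutoff=E_VALUE_CUTOFF, max_hits=MAX_HITS):
    """Reject hits above the E-value cutoff, then cap the number of hits.

    Assumption: when more than max_hits remain, those with the smallest
    E-values are kept. The source does not specify this ordering.
    """
    kept = [hit for hit in hits if hit.e_value <= e_value_cutoff]
    kept.sort(key=lambda hit: hit.e_value)
    return kept[:max_hits]


if __name__ == "__main__":
    example = [
        BlastHit("hit_a", 1e-30),
        BlastHit("hit_b", 0.001),   # rejected: E-value > 0.0001
        BlastHit("hit_c", 0.0001),  # kept: not greater than the cutoff
    ]
    for hit in select_hits(example):
        print(hit.hit_id, hit.e_value)
```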
From the list of BLAST hits, entries are rejected based on the following rules:
- PDB structures with a resolution poorer than 4 Å (that is, a resolution value above 4 Å) are removed since they cannot be expected to represent a trustworthy atomistic model.
- BLAST hits with an identity to the query sequence lower than 20% are removed since they would most likely result in inaccurate models.
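The two rejection rules can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the tool's code: `StructureHit`, `passes_filters`, and `filter_hits` are hypothetical names, resolution is assumed to be given in Å with smaller values meaning better resolution, and hits lying exactly on a threshold (4 Å or 20% identity) are assumed to be kept.

```python
from dataclasses import dataclass

MAX_RESOLUTION_ANGSTROM = 4.0  # structures with a larger value are removed
MIN_IDENTITY_PERCENT = 20.0    # hits with a lower identity are removed


@dataclass
class StructureHit:
    """Hypothetical record combining PDB and BLAST information."""
    hit_id: str
    resolution: float  # in Å; a smaller value means a better resolution
    identity: float    # percent identity to the query sequence


def passes_filters(hit):
    """Return True if the hit survives both rejection rules."""
    good_resolution = hit.resolution <= MAX_RESOLUTION_ANGSTROM
    good_identity = hit.identity >= MIN_IDENTITY_PERCENT
    return good_resolution and good_identity


def filter_hits(hits):
    """Keep only the hits that pass both rules, preserving input order."""
    return [hit for hit in hits if passes_filters(hit)]


if __name__ == "__main__":
    example = [
        StructureHit("hit_a", resolution=1.8, identity=65.0),  # kept
        StructureHit("hit_b", resolution=4.5, identity=90.0),  # poor resolution
        StructureHit("hit_c", resolution=2.0, identity=15.0),  # low identity
    ]
    print([hit.hit_id for hit in filter_hits(example)])
```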
Each structure in the resulting list of available structures is scored based on its homology to the query sequence and on the quality of the structure itself. This Template quality score is used to rank the structures in the table, and the rank of each structure is shown in the "Rank" column (see figure 17.12). Read more about the Template quality score in Evaluating the rank of available structures.
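As a rough illustration of how a rank column can be derived from a score, the sketch below sorts structures by a precomputed score and numbers them from 1. It makes no attempt to compute the Template quality score itself, whose definition is given elsewhere. The function name `rank_structures` and the example identifiers and scores are hypothetical, and treating a higher score as better is an assumption not stated in this section.

```python
def rank_structures(scores):
    """Assign ranks 1, 2, 3, ... to structures by descending score.

    `scores` maps a structure identifier to a precomputed score.
    Assumption: a higher score means a better template.
    Returns a list of (rank, identifier, score) tuples.
    """
    ordered = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [
        (rank, identifier, score)
        for rank, (identifier, score) in enumerate(ordered, start=1)
    ]


if __name__ == "__main__":
    # Made-up identifiers and scores, for illustration only.
    example_scores = {"structure_a": 0.5, "structure_b": 0.9, "structure_c": 0.1}
    for rank, identifier, score in rank_structures(example_scores):
        print(rank, identifier, score)
```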