Bioinformatics explained: BLAST
BLAST (Basic Local Alignment Search Tool) has become the de facto standard in search and alignment tools [Altschul et al., 1990]. The BLAST algorithm is still actively being developed, and the original publication is one of the most cited papers in this field of biology. Many researchers use BLAST for an initial screening of their sequence data from the laboratory, to get an idea of what they are working on. Despite its name, BLAST is far from basic; it is a highly advanced algorithm which has become very popular due to its availability, speed, and accuracy. In short, a BLAST search identifies homologous sequences by comparing the query sequence of interest against one or more databases, usually hosted by NCBI (http://www.ncbi.nlm.nih.gov/) [McGinnis and Madden, 2004].
BLAST is an open source program, and anyone can download and change the program code. This has given rise to a number of BLAST derivatives, of which WU-BLAST is probably the most commonly used [Altschul and Gish, 1996].
BLAST is highly scalable and is available for a number of different computer platforms, which makes it usable on both small desktop computers and large computer clusters.
BLAST can be used for many different purposes. A few of them are mentioned below.
- Looking for species. If you are sequencing DNA from an unknown species, BLAST may help identify the correct species or homologous species.
- Looking for domains. If you BLAST a protein sequence (or a translated nucleotide sequence), BLAST will look for known domains in the query sequence.
- Looking at phylogeny. You can use the BLAST web pages to generate a phylogenetic tree of the BLAST result.
- Mapping DNA to a known chromosome. If you are sequencing a gene from a known species but have no idea of the chromosome location, BLAST can help you. BLAST will show you the position of the query sequence in relation to the hit sequences.
- Annotations. BLAST can also be used to map annotations from one organism to another or look for common genes in two related species.
Searching for homology
Most research projects involving sequencing of either DNA or protein need to obtain biological information about the newly sequenced, and possibly unknown, sequence. If the researchers have no prior information about the sequence and its biological content, valuable information can often be obtained using BLAST. The BLAST algorithm will search for homologous sequences in predefined and annotated databases of the user's choice.
In this way the researcher can quickly and easily gain knowledge of gene or protein function and find evolutionary relations between the newly sequenced DNA and well-established data.
After the BLAST search, the user receives a report listing the homologous sequences found and their local alignments to the query sequence.
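To make the idea of a "local alignment to the query sequence" more concrete, the following minimal sketch scores a query against a tiny in-memory database and ranks the hits. It is an illustration only and not how BLAST itself is implemented: it uses an exhaustive Smith-Waterman style dynamic programming search, whereas BLAST relies on a much faster heuristic. The function names, the scoring values (match, mismatch, gap) and the example sequences are all assumptions made for this sketch.

```python
def local_align(query, subject, match=2, mismatch=-1, gap=-2):
    """Return (score, (q_start, q_end), (s_start, s_end)) for the best
    local alignment, using 0-based, end-exclusive coordinates.

    Illustrative Smith-Waterman scoring with a linear gap penalty;
    the default scoring values are arbitrary assumptions.
    """
    rows, cols = len(query) + 1, len(subject) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the matrix; a cell never drops below 0, which is what
    # makes the alignment local rather than global.
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if query[i - 1] == subject[j - 1] else mismatch
            score[i][j] = max(0,
                              score[i - 1][j - 1] + s,  # align two residues
                              score[i - 1][j] + gap,    # gap in subject
                              score[i][j - 1] + gap)    # gap in query
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until the score reaches 0.
    i, j = best_pos
    q_end, s_end = i, j
    while i > 0 and j > 0 and score[i][j] > 0:
        s = match if query[i - 1] == subject[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + s:
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    return best, (i, q_end), (j, s_end)


def search(query, database):
    """Rank database entries by their best local alignment score."""
    hits = [(name, *local_align(query, seq)) for name, seq in database.items()]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical toy database; real searches use annotated databases.
    toy_db = {"seq_a": "TTACGTTT", "seq_b": "GGGGGG"}
    for name, score, q_span, s_span in search("ACGT", toy_db):
        print(name, score, q_span, s_span)
```

Each reported hit carries a score and the coordinates of the aligned region in both the query and the database sequence, which is, in a much simplified form, the kind of information a BLAST report presents for every hit.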
Subsections
- How does BLAST work?
- Which BLAST program should I use?
- Which BLAST options should I change?
- Explanation of the BLAST output
- Can I BLAST against my own sequence database?
- What you cannot get out of BLAST
- Other useful resources