Amino acid changes
This tool annotates variants with amino acid changes given a track with coding regions and a reference sequence (see Figure 27.62).
Figure 27.62: The amino acid changes annotation tool.
The CDS track is used to determine the reading frame to be used for translation. The mRNA track is used to determine whether the variant is inside or outside the region covered by the transcript.
For each variant in the input track, the following information is added:
- Coding region change. This annotates the relative position at the coding DNA level, using the nomenclature proposed at http://www.hgvs.org/mutnomen/. Variants inside exons and in the untranslated regions of the transcript will also be annotated with the distance to the nearest exon. For example, "c.-4A>C" describes a SNV four bases upstream of the start codon, while "c.*4A>C" describes a SNV four bases downstream of the stop codon.
- Amino acid change. This annotates the change at the protein level. For example, single amino acid changes caused by SNVs are listed as "p.[Gly261Cys]", denoting that in the protein sequence (hence the "p.") the Glycine at position 261 is changed into Cysteine. Frameshifts caused by indels are listed with the suffix "fs", for example "p.[Pro244fs]", denoting a frameshift at position 244, which codes for Proline. For further details of the nomenclature see the "Recommendations for the description of protein sequence variants (v2.0)" at http://www.hgvs.org/mutnomen/.
- Coding region change in longest transcript. When there are many transcript variants for a gene, the coding region changes for all transcripts are listed in the "Coding region change" column. Because the longest transcript is often used for quick reference, a separate column lists the coding region change for the longest transcript only.
- Amino acid change in longest transcript. This is similar to the above, just on the protein level.
- Other variants within codon. If there are other variants within the same codon, this column will have a "Yes". In this case, it should be manually investigated whether the two variants are linked by reads, because the amino acid change annotated by the tool may not be correct.
- Non-synonymous. This column will have a "Yes" if the variant is non-synonymous.
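To make the notation in the list above concrete, the following is a minimal Python sketch of how such annotations can be derived for a SNV. It is not the tool's implementation: the function names (`coding_position`, `annotate_snv`, `share_codon`) and the toy sequence are hypothetical, coordinates are assumed to be 1-based positions on the spliced transcript, and the standard genetic code is assumed. The text does not say how the tool reports synonymous changes at the protein level, so the sketch simply leaves that field empty.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
THREE_LETTER = {
    "A": "Ala", "C": "Cys", "D": "Asp", "E": "Glu", "F": "Phe", "G": "Gly",
    "H": "His", "I": "Ile", "K": "Lys", "L": "Leu", "M": "Met", "N": "Asn",
    "P": "Pro", "Q": "Gln", "R": "Arg", "S": "Ser", "T": "Thr", "V": "Val",
    "W": "Trp", "Y": "Tyr", "*": "Ter",
}


def coding_position(tx_pos, cds_start, cds_end):
    """Convert a 1-based transcript position to a "c." position string.

    cds_start is the A of the start codon; cds_end is the last base of the
    stop codon (both 1-based transcript coordinates, assumed for this sketch).
    """
    if tx_pos < cds_start:
        return f"-{cds_start - tx_pos}"   # upstream of the start codon
    if tx_pos > cds_end:
        return f"*{tx_pos - cds_end}"     # downstream of the stop codon
    return str(tx_pos - cds_start + 1)    # inside the coding sequence


def annotate_snv(cds, pos, alt):
    """Annotate a SNV at 1-based CDS position `pos` with alternative base `alt`."""
    ref = cds[pos - 1]
    codon_index = (pos - 1) // 3          # 0-based codon number
    offset = (pos - 1) % 3                # position of the SNV within the codon
    ref_codon = cds[codon_index * 3:codon_index * 3 + 3]
    alt_codon = ref_codon[:offset] + alt + ref_codon[offset + 1:]
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    coding_change = f"c.{pos}{ref}>{alt}"
    if ref_aa == alt_aa:
        return coding_change, "", "No"    # synonymous: protein field left empty
    protein_change = f"p.[{THREE_LETTER[ref_aa]}{codon_index + 1}{THREE_LETTER[alt_aa]}]"
    return coding_change, protein_change, "Yes"


def share_codon(pos_a, pos_b):
    """Return True if two 1-based CDS positions fall within the same codon."""
    return (pos_a - 1) // 3 == (pos_b - 1) // 3


# Toy coding sequence: Met-Gly-stop.
toy_cds = "ATGGGTTGA"
print(annotate_snv(toy_cds, 4, "T"))   # ('c.4G>T', 'p.[Gly2Cys]', 'Yes')
print(annotate_snv(toy_cds, 6, "C"))   # ('c.6T>C', '', 'No')
print(coding_position(7, 11, 19))      # -4
print(coding_position(23, 11, 19))     # *4
print(share_codon(4, 5))               # True
```

The last call illustrates why the "Other variants within codon" column matters: each SNV is translated against the reference codon on its own, so two linked variants in one codon can produce a different amino acid than either annotation suggests.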
By filtering in the table view of the result track on the column "Non-synonymous" for "Yes", only variants that change the protein product will be retained in the result track.
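The same filter can be reproduced outside the table view. The sketch below assumes the table has been exported to CSV with the column names used in this section; the export format and the rows shown are made up for illustration and are not output from the tool.

```python
import csv
import io

# Made-up rows standing in for an exported result table (assumption: CSV
# export with the column names described in this section).
EXPORTED_TABLE = """Coding region change,Amino acid change,Non-synonymous
c.781G>T,p.[Gly261Cys],Yes
c.831C>T,,No
"""


def non_synonymous_rows(csv_text):
    """Keep only rows whose "Non-synonymous" column is "Yes"."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if row["Non-synonymous"] == "Yes"]


for row in non_synonymous_rows(EXPORTED_TABLE):
    print(row["Coding region change"], row["Amino acid change"])
```

Only the first row is kept, mirroring what the table filter retains in the result track.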
Figure 27.63: The resulting amino acid changes in track and table views.
An example of the output is given in Figure 27.63. The top track view displays the variant track, sequence track, gene annotation and CDS track. The lower table view is filtered for non-synonymous variants.