Find Prokaryotic Genes (beta)
The Find Prokaryotic Genes (beta) tool allows you to annotate a DNA sequence with CDS information. The tool is in beta because it is currently tailored only for use with near-complete single genomes, not for metagenome data.
The tool creates a gene prediction model from the input sequence. The model captures GC content, conserved sequences corresponding to ribosomal binding sites, and start and stop codon usage, and it includes a statistical model (an Interpolated Markov Model) that estimates the probability that a sequence is part of a gene rather than background. The model is then used to predict coding sequences in the input sequence. Note that this tool is inspired by Glimmer 3 (see http://ccb.jhu.edu/papers/glimmer3.pdf) and currently consolidates both build-icm and glimmer3 in a single tool. The standard version of Glimmer is available at https://ccb.jhu.edu/software/glimmer/.
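To make the idea of scoring "gene versus background" concrete, the following is a minimal Python sketch. It is not the tool's implementation: it uses a simple fixed-order Markov model instead of an Interpolated Markov Model, and the function names (gc_content, train_markov, log_odds) as well as the order and pseudocount values are assumptions chosen for illustration.

```python
import math
from collections import Counter


def gc_content(seq):
    """Return the fraction of G and C bases in seq (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def train_markov(seqs, order=2, pseudocount=1.0):
    """Train a fixed-order Markov model; return a function P(base | context).

    Simplification: a real Interpolated Markov Model combines several
    context lengths, which this sketch does not attempt.
    """
    context_counts = Counter()
    pair_counts = Counter()
    for s in seqs:
        s = s.upper()
        for i in range(order, len(s)):
            context = s[i - order:i]
            context_counts[context] += 1
            pair_counts[(context, s[i])] += 1

    def prob(context, base):
        # Pseudocounts keep unseen (context, base) pairs from having probability 0.
        return (pair_counts[(context, base)] + pseudocount) / (
            context_counts[context] + 4 * pseudocount
        )

    return prob


def log_odds(seq, coding_prob, background_prob, order=2):
    """Sum of log(P_coding / P_background) over seq; positive favors coding."""
    seq = seq.upper()
    score = 0.0
    for i in range(order, len(seq)):
        context, base = seq[i - order:i], seq[i]
        score += math.log(coding_prob(context, base) / background_prob(context, base))
    return score


# Toy training data, invented for illustration only.
coding = train_markov(["ATGGCGGCGGCG" * 5])
background = train_markov(["ATATATATATAT" * 5])
```

A candidate region with a positive log_odds score resembles the coding training data more than the background, which is the kind of comparison the paragraph above describes.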
To start the analysis, go to:
Metagenomics | Functional Analysis | Find Prokaryotic Genes (beta)
In the first dialog, select the input sequences. The input should consist of one or a few contigs from the same species. If several sequences are provided as input and "Assembly_ID" annotations are present, the tool will build a separate model for each assembly. The tool can also be run in batch mode.
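The per-assembly behavior can be pictured as grouping contigs by their assembly identifier before a model is built for each group. The Python sketch below only illustrates that grouping: the dictionary layout, the "assembly_id" key and the function name group_by_assembly are hypothetical, and placing contigs without an identifier in a single None group is an assumption, not documented tool behavior.

```python
from collections import defaultdict


def group_by_assembly(contigs):
    """Group contig names by assembly identifier.

    Each contig is a dict with a "name" key and an optional "assembly_id"
    key (hypothetical layout). Contigs without an identifier end up under
    the key None, which is an assumption made for this sketch.
    """
    groups = defaultdict(list)
    for contig in contigs:
        groups[contig.get("assembly_id")].append(contig["name"])
    return dict(groups)


# Invented example input: two assemblies, three contigs.
contigs = [
    {"name": "contig_1", "assembly_id": "asm_A"},
    {"name": "contig_2", "assembly_id": "asm_A"},
    {"name": "contig_3", "assembly_id": "asm_B"},
]
groups = group_by_assembly(contigs)
```

In this picture, one gene prediction model would be trained per key of groups.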
In the second dialog, it is possible to set the following parameters:
- Genetic Code: The genetic code to use (defaults to bacterial). This genetic code is used to determine which stop codons should be used and to compute a background distribution for amino-acid usage.
- Delete Existing CDS and Gene Annotations: This is selected by default to avoid creating many duplicate annotations. Unchecking it is useful if you want to compare the results with other annotations.
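To illustrate how the choice of genetic code affects which regions can be reported as coding sequences, here is a minimal Python sketch of an open reading frame scan on the forward strand. It is not the tool's algorithm. The stop codons are those of the bacterial genetic code (NCBI translation table 11); restricting start codons to ATG, GTG and TTG, the min_len value and the function name find_orfs are assumptions made for this example.

```python
# Stop codons of the bacterial genetic code (NCBI translation table 11).
STOP_CODONS = {"TAA", "TAG", "TGA"}
# Assumption: only the three most commonly used start codons are considered.
START_CODONS = {"ATG", "GTG", "TTG"}


def find_orfs(seq, min_len=9):
    """Return forward-strand ORFs as (start, end) pairs.

    Coordinates are 0-based with an exclusive end, and the stop codon is
    included in the ORF. Only the first start codon after the previous
    stop in each frame is used.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon in START_CODONS:
                start = i
            elif start is not None and codon in STOP_CODONS:
                if i + 3 - start >= min_len:
                    orfs.append((start, i + 3))
                start = None
    return sorted(orfs)


# Toy sequence, invented for illustration: ATG AAA TTT TAA in frame 2.
orfs = find_orfs("CCATGAAATTTTAACC")
```

A different genetic code would change the contents of STOP_CODONS, and therefore where candidate reading frames end.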
The tool will output a copy of the input sequence with CDS and Gene annotations.