Find Prokaryotic Genes

The Find Prokaryotic Genes tool allows you to annotate a DNA sequence with CDS information. The tool is currently in beta: it is tailored only for use with near-complete single genomes, and not for metagenome data.

The tool creates a gene prediction model from the input sequence. The model comprises estimates of GC content, conserved sequences corresponding to ribosomal binding sites, and start and stop codon usage, together with a statistical model (namely, an Interpolated Markov Model) that estimates the probability that a sequence is part of a gene rather than background. The model is then used to predict coding sequences in the input sequence. Note that this tool is inspired by Glimmer 3 (see http://ccb.jhu.edu/papers/glimmer3.pdf) and currently consolidates the functionality of both build-icm and glimmer3 in a single tool.
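The idea of scoring a sequence against the background can be illustrated with a small Python sketch. This is not the tool's implementation: it swaps in a plain fixed-order Markov model (not an interpolated one), uses add-one smoothing, and takes a GC-based background, all of which are assumptions made for illustration. The function names gc_content, train_markov and log_odds are hypothetical.

```python
import math
from collections import Counter


def gc_content(seq):
    """Fraction of G and C bases in the sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def train_markov(seqs, order=2):
    """Train a fixed-order Markov model on example coding sequences.

    Returns a function prob(context, base) giving the smoothed probability
    of `base` following `context`. This is a simplification of an
    Interpolated Markov Model, which would combine several context lengths.
    """
    pair_counts = Counter()
    context_counts = Counter()
    for s in seqs:
        s = s.upper()
        for i in range(order, len(s)):
            context, base = s[i - order:i], s[i]
            pair_counts[(context, base)] += 1
            context_counts[context] += 1

    def prob(context, base):
        # Add-one smoothing over the four bases (illustrative choice).
        return (pair_counts[(context, base)] + 1) / (context_counts[context] + 4)

    return prob


def log_odds(seq, prob, gc, order=2):
    """Log-odds of `seq` under the coding model versus a GC-based background."""
    gc = min(max(gc, 0.01), 0.99)  # keep background probabilities non-zero
    background = {"G": gc / 2, "C": gc / 2, "A": (1 - gc) / 2, "T": (1 - gc) / 2}
    seq = seq.upper()
    score = 0.0
    for i in range(order, len(seq)):
        base = seq[i]
        if base not in background:
            continue  # skip ambiguous bases such as N
        score += math.log(prob(seq[i - order:i], base) / background[base])
    return score


# Toy usage: a positive score means the window resembles the training sequences
# more than the background does.
model = train_markov(["ATGATGATGATG"])
score = log_odds("ATGATG", model, gc=0.5)
```

A positive log-odds score indicates that the window looks more like the training sequences than like background; a real gene finder would combine such scores with start codon, stop codon and ribosomal binding site evidence.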

To start the analysis, go to:

        Metagenomics | Functional Analysis | Find Prokaryotic Genes

In the first dialog, select the input sequences. The input should consist of one or a few contigs from the same species. If several sequences are provided as input and "Assembly_ID" annotations are present, the tool will build a separate model for each assembly. The tool can also be run in batch mode.
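The per-assembly behaviour can be pictured as grouping the input contigs by their assembly identifier before any model is built. The following Python sketch only illustrates that grouping step; the record layout (name, assembly ID, sequence) and the function name group_by_assembly are hypothetical and do not reflect the tool's internal data structures.

```python
from collections import defaultdict


def group_by_assembly(records):
    """Group contig sequences by assembly ID.

    `records` is an iterable of (name, assembly_id, sequence) tuples.
    This tuple layout is an assumption made for illustration only.
    """
    groups = defaultdict(list)
    for _name, assembly_id, sequence in records:
        groups[assembly_id].append(sequence)
    return dict(groups)


contigs = [
    ("contig_1", "assembly_A", "ATGAAATAA"),
    ("contig_2", "assembly_A", "ATGCCCTGA"),
    ("contig_3", "assembly_B", "ATGGGGTAG"),
]

# One group per assembly; a separate model would then be trained on each group.
grouped = group_by_assembly(contigs)
```

Here the two contigs of assembly_A end up in one group and the single contig of assembly_B in another, so two models would be built.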

In the second dialog, it is possible to set the following parameters:

The tool will output a copy of the input sequence with CDS and Gene annotations.