Motif
Motifs are short, recurring patterns in nucleotide or peptide sequences that often indicate biological function or structural properties. Motifs are used to locate specific patterns within a target sequence, either by direct sequence comparison or by applying regular expressions for more flexible matching.
Motifs can also be used in sequence analysis for purposes beyond locating patterns (see Using motifs in sequence analysis).
Motifs are stored in a motif list, which can be created using one of the following approaches:
- Create Motif List: Manually define the motifs.
- Create Motif List from Sequences: Automatically generate motifs from existing sequences or their annotations.
There are two ways to search for motifs:
- Interactive motif search: Search for and visualize common or custom motifs directly within a sequence view.
- Motif Search: Perform a systematic search for motifs to produce a results table and optionally add annotations to the sequences.
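To make the idea of a systematic search concrete, the following Python sketch scans a sequence for a motif and collects one table row per hit. It is only an illustration, not the tool's implementation: the function name `find_motif`, the row layout, and the choice to report overlapping hits with 1-based inclusive positions are all assumptions made for this example.

```python
import re

def find_motif(sequence, pattern):
    """Return one row per hit as (start, end, matched text).

    Positions are 1-based and inclusive (an assumption of this sketch).
    """
    rows = []
    # Wrapping the pattern in a lookahead lets overlapping hits be reported.
    for match in re.finditer(f"(?=({pattern}))", sequence):
        start = match.start(1) + 1  # convert 0-based index to 1-based position
        end = match.end(1)          # exclusive 0-based end equals inclusive 1-based end
        rows.append((start, end, match.group(1)))
    return rows

# Hypothetical example sequence containing the literal motif twice.
for row in find_motif("ATGATGCCATGATG", "ATGATG"):
    print(row)
```

Each row corresponds to what a results table entry might hold: where the hit starts, where it ends, and the text that matched.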
Motif types
Three motif types are supported:
- Simple motif: A literal sequence, such as ATGATGNNATG. Ambiguous nucleotides and amino acids are supported using standard IUPAC codes.
- Java regular expression: A pattern defined using Java regular expression syntax.
- Prosite regular expression: A protein motif based on the PROSITE database, which contains a wide range of curated protein patterns and is frequently used to identify related proteins.
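The relationship between the three motif types can be sketched by translating a simple motif and a PROSITE pattern into ordinary regular expressions. The Python example below is a minimal sketch under stated assumptions, not the tool's own conversion: the function names `simple_motif_to_regex` and `prosite_to_regex` are hypothetical, only nucleotide IUPAC codes are covered, and only the basic PROSITE elements are handled (`x`, `[...]`, `{...}`, `(n)` or `(n,m)` repeats, and `<`/`>` anchors outside brackets). The regular expression constructs it produces (character classes, `.`, and `{n}` repeats) are written the same way in Python and in Java regular expressions.

```python
import re

# Standard IUPAC nucleotide ambiguity codes mapped to regex character classes.
IUPAC_DNA = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGT]",
}

def simple_motif_to_regex(motif):
    """Expand each IUPAC nucleotide code in a simple motif to a character class."""
    return "".join(IUPAC_DNA[base] for base in motif.upper())

def prosite_to_regex(pattern):
    """Translate a basic PROSITE pattern into a regular expression."""
    regex = pattern.rstrip(".")                         # drop the optional trailing period
    regex = regex.replace("-", "")                      # element separators carry no meaning
    regex = regex.replace("{", "[^").replace("}", "]")  # {P} means "any residue except P"
    regex = regex.replace("(", "{").replace(")", "}")   # x(4) means "repeat 4 times"
    regex = regex.replace("x", ".")                     # x means "any residue"
    regex = regex.replace("<", "^").replace(">", "$")   # N- and C-terminal anchors
    return regex

# Simple motif from the text above.
print(simple_motif_to_regex("ATGATGNNATG"))

# PROSITE PS00017 (ATP/GTP-binding site motif A, the P-loop).
p_loop = prosite_to_regex("[AG]-x(4)-G-K-[ST]")
print(p_loop)
print(re.search(p_loop, "MTEYKLVVVGAGGVGKSALT").group())
```

Seen this way, a simple motif and a PROSITE pattern are both restricted notations that can be expressed as a regular expression, while the regular expression type exposes the full pattern syntax directly.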
Subsections
- Create Motif List
- Create Motif List from Sequences
- Interactive motif search
- Motif Search
- Using motifs in sequence analysis
- Java regular expressions
