Bioinformatics explained: Dot plots

Dot plots are two-dimensional plots in which the x-axis and the y-axis each represent a sequence, and the plot itself shows a comparison of the two sequences based on a score calculated for each pair of positions. If a window of fixed size on one sequence (one axis) matches the other sequence, a dot is drawn on the plot. Dot plots are one of the oldest methods for comparing two sequences [Maizel and Lenk, 1981].

The scores drawn on the plot are affected by several factors.

Scoring matrix for distance correction
Scoring matrices (BLOSUM and PAM) contain substitution scores for every combination of two amino acids. Thus, these matrices can only be used for dot plots of protein sequences.
Window size
Comparing single residues (a residue-by-residue comparison, window size = 1) will inevitably produce a noisy background in the plot. With only four possible residues, as in nucleotide sequences, many positions match purely by chance. You can therefore set a window size, which smooths the dot plot: instead of comparing single residues, subsequences of the length given by the window size are compared, and the score is calculated from aligning these subsequences.
Threshold
The dot plot displays the calculated scores using colored thresholds, which makes it easier to recognize the most important similarities.
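
The effect of window size and threshold can be illustrated with a minimal Python sketch. This is not the Workbench's implementation: the function names `dot_plot` and `render` are hypothetical, and the score is assumed to be a simple count of identical residues within the window rather than a score taken from a substitution matrix such as BLOSUM or PAM.

```python
def dot_plot(seq_x, seq_y, window=1, threshold=None):
    """Return the set of (i, j) window start positions that receive a dot.

    Assumption: the score of a window pair is the number of identical
    residues, and a dot is drawn when the score reaches the threshold.
    """
    if threshold is None:
        threshold = window  # by default require a perfect match in the window
    dots = set()
    for i in range(len(seq_x) - window + 1):
        for j in range(len(seq_y) - window + 1):
            sub_x = seq_x[i:i + window]
            sub_y = seq_y[j:j + window]
            score = sum(a == b for a, b in zip(sub_x, sub_y))
            if score >= threshold:
                dots.add((i, j))
    return dots


def render(seq_x, seq_y, dots):
    """Draw the plot as text: seq_x along the x-axis, seq_y along the y-axis."""
    rows = []
    for j in range(len(seq_y)):
        rows.append("".join("*" if (i, j) in dots else "."
                            for i in range(len(seq_x))))
    return "\n".join(rows)


if __name__ == "__main__":
    seq = "ACGTACGT"
    noisy = dot_plot(seq, seq, window=1)      # every single-residue match
    smoothed = dot_plot(seq, seq, window=4)   # only 4-residue matches
    print(render(seq, seq, noisy))
    print()
    print(render(seq, seq, smoothed))
```

With a window size of 1, every chance identity between the two sequences produces a dot. Increasing the window to 4 keeps the main diagonal and the repeated `ACGT` segment while removing most of the isolated dots. Lowering `threshold` below `window` would tolerate mismatches inside the window.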

Subsections

Examples and interpretations of dot plots