Alignment formats
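| File type        | Suffix      | Import | Export | Description                                  |
|------------------|-------------|--------|--------|----------------------------------------------|
| Aligned fasta    | .fa         | X      | X      | Simple fasta-based format with "-" for gaps  |
| CLC              | .clc        | X      | X      | Rich format including all information        |
| ClustalW         | .aln        | X      | X      |                                              |
| GCG Alignment    | .msf        | X      | X      |                                              |
| Nexus            | .nxs/.nexus | X      | X      |                                              |
| Phylip Alignment | .phy        | X      | X      |                                              |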
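
As an illustration of the aligned fasta format described above, the following minimal Python sketch reads fasta-style text in which "-" marks a gap. It is not part of the Workbench; the function names `read_aligned_fasta` and `ungapped` and the two example sequences are hypothetical, and the check that all rows have equal length is an assumption about what a well-formed alignment file should contain.

```python
from typing import Dict, Iterable


def read_aligned_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse aligned fasta text into {sequence name: aligned sequence}.

    Hypothetical helper for illustration only. Gaps are kept as '-'.
    """
    records: Dict[str, str] = {}
    name = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Header line: everything after '>' is taken as the name.
            name = line[1:].strip()
            if name in records:
                raise ValueError(f"duplicate sequence name: {name!r}")
            records[name] = ""
        elif name is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            # Sequence data may be wrapped over several lines.
            records[name] += line
    # Assumption: every row of an alignment has the same length.
    lengths = {len(seq) for seq in records.values()}
    if len(lengths) > 1:
        raise ValueError("rows differ in length, so this is not an alignment")
    return records


def ungapped(seq: str) -> str:
    """Return the sequence with all '-' gap characters removed."""
    return seq.replace("-", "")


# Made-up example data: two aligned rows of equal length.
example = """>seq1
ACG-TTGA
>seq2
ACGATT-A
""".splitlines()

alignment = read_aligned_fasta(example)
```

Calling `ungapped(alignment["seq1"])` recovers the original, unaligned sequence, which is the main thing lost when such a file is treated as ordinary fasta.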