Restriction sites
See Restriction sites in the Side Panel.