Adding primer binding annotation
You can add an annotation to the template sequence that specifies the binding site of the primer: right-click the primer in the table and select Mark primer annotation on sequence.