Alignment formats
File type         Suffix        Import  Export  Description
Aligned fasta     .fa           X       X       Simple fasta-based format with "-" for gaps
CLC               .clc          X       X       Rich format including all information
ClustalW          .aln          X       X
GCG Alignment     .msf          X       X
Nexus             .nxs/.nexus   X       X
Phylip Alignment  .phy          X       X
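
As an illustration of the aligned fasta format listed above, the following minimal Python sketch reads fasta-style text in which "-" marks a gap. It is not part of the Workbench. The function name parse_aligned_fasta, the sample sequences, and the equal-length check are assumptions made for this example only.

```python
from typing import Dict, Optional


def parse_aligned_fasta(text: str) -> Dict[str, str]:
    """Parse aligned fasta text into {name: aligned sequence}, keeping '-' gaps.

    Hypothetical helper for illustration; not a Workbench API.
    """
    records: Dict[str, str] = {}
    name: Optional[str] = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Header line: everything after '>' is used as the sequence name.
            name = line[1:].strip()
            records[name] = ""
        elif name is not None:
            # Sequence line: append to the current record (may span many lines).
            records[name] += line
    # Assumption: rows of an alignment share one length once gaps are included.
    lengths = {len(seq) for seq in records.values()}
    if len(lengths) > 1:
        raise ValueError("aligned sequences must all have the same length")
    return records


# Made-up sample data, used only to show the gap character.
example = """>seq1
ATG-CGTA
>seq2
ATGACG-A
"""

alignment = parse_aligned_fasta(example)
# Removing the gap characters recovers the unaligned sequences.
ungapped = {name: seq.replace("-", "") for name, seq in alignment.items()}
```

Keeping the gap characters in the parsed strings preserves the column positions of the alignment; stripping them is a separate, explicit step.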