Mapping output
Subsections
Mapping output options
Mapped reads coloring
Reads track output from a read mapping