SAM Specification
The workbench aims to import and export SAM and BAM files in accordance with version v1.4-r962 of the SAM specification. This appendix describes how the workbench exports SAM and BAM files and lists the known limitations.
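As a rough way to check which format version a SAM file declares, the header can be inspected directly. The sketch below assumes only what the SAM specification defines: header lines start with `@`, and the optional `@HD` line carries the format version in a tab-separated `VN` tag. The function name `sam_header_version` and the sample lines are hypothetical, and this section does not state whether the workbench writes an `@HD` line on export.

```python
def sam_header_version(lines):
    """Return the VN value from the @HD header line, or None if absent."""
    for line in lines:
        if not line.startswith("@"):
            break  # header lines come first; stop at the first alignment record
        if line.startswith("@HD"):
            # Fields after the record type are TAG:VALUE pairs separated by tabs.
            for field in line.rstrip("\n").split("\t")[1:]:
                tag, _, value = field.partition(":")
                if tag == "VN":
                    return value
    return None


# Hypothetical header lines, made up for illustration only.
example = ["@HD\tVN:1.4\tSO:coordinate\n", "@SQ\tSN:chr1\tLN:1000\n"]
print(sam_header_version(example))
```

For a file on disk, the same function can be applied to an open text file handle, for example `sam_header_version(open(path))`, where `path` points to an uncompressed SAM file. BAM files are binary and would need a dedicated reader instead.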