Generic expression data table format

The CLC Cancer Research Workbench will import a tab-, semicolon- or comma-separated .txt or .csv file as expression array samples if the following requirements are met:

- The first non-empty line of the file contains text. All entries, except the first, will be used as sample names.
- The following (non-empty) lines contain the same number of entries as the first non-empty line. In each of these lines, the first entry must be a string (this will be used as the feature ID) and the remaining entries must be numbers (these will be used as expression values, one per sample). Empty entries are not allowed, but NaN values are allowed.
- The file contains at least two samples.

An example of this format is shown below:

FeatureID;sample1;sample2;sample3
gene1;200;300;23
gene2;210;30;238
gene3;230;50;23
gene4;50;100;235
gene5;200;300;23
gene6;210;30;238
gene7;230;50;23
gene8;50;100;235

This will be imported as three samples with eight genes in each sample.
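
The requirements listed above can be checked before import with a short script. The following is a minimal sketch in Python, not the Workbench's own parser: the function name `parse_expression_table` is hypothetical, the delimiter is passed in explicitly rather than detected, and Python's `float()` is assumed to be a reasonable stand-in for what the Workbench accepts as a number.

```python
def parse_expression_table(text, delimiter=";"):
    """Check the generic expression table rules and return
    (sample_names, {feature_id: [expression values]}).

    Raises ValueError when a requirement is not met. This is an
    illustrative check only; the Workbench importer may be stricter
    or more lenient than this sketch.
    """
    # Only non-empty lines count, both for the header and for the data.
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        raise ValueError("file contains no non-empty lines")

    header = lines[0].split(delimiter)
    sample_names = header[1:]  # every header entry except the first
    if len(sample_names) < 2:
        raise ValueError("file must contain at least two samples")

    features = {}
    for line in lines[1:]:
        entries = line.split(delimiter)
        if len(entries) != len(header):
            raise ValueError(f"wrong number of entries in line: {line!r}")
        if any(not entry.strip() for entry in entries):
            raise ValueError(f"empty entry in line: {line!r}")
        feature_id, raw_values = entries[0], entries[1:]
        # float() accepts "NaN", which the format explicitly allows;
        # it raises ValueError for anything that is not a number.
        features[feature_id] = [float(value) for value in raw_values]

    return sample_names, features
```

Calling `parse_expression_table` on the example above with `delimiter=";"` gives three sample names and eight feature IDs, matching the import result described. For a tab- or comma-separated file, pass `"\t"` or `","` instead.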

Download this example as a file here:
http://www.clcbio.com/madata/CustomExpressionData.txt