Different scoring matrices

PAM
The first PAM (Point Accepted Mutation) matrix was published in 1978 by Dayhoff et al. The PAM matrix was built through a global alignment of related sequences, all having sequence similarity above 85% [Dayhoff and Schwartz, 1978]. A PAM matrix shows the probability that any given amino acid will mutate into another in a given time interval. For example, PAM1 corresponds to one amino acid out of 100 mutating in a given time interval. At the other end of the scale, a PAM256 matrix gives the probability of 256 mutations in 100 amino acids (see figure 15.15).
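
In the Dayhoff model, higher-numbered PAM matrices are obtained by multiplying the PAM1 mutation probability matrix by itself. The following Python sketch illustrates that idea on a made-up three-letter alphabet. The matrix TOY_PAM1 and the helper names mat_mul and mat_pow are hypothetical and chosen only for illustration; the values are not real Dayhoff data.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(m, n):
    """Raise a square matrix to a non-negative integer power by repeated squaring."""
    size = len(m)
    # Start from the identity matrix.
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    base = [row[:] for row in m]
    while n > 0:
        if n & 1:
            result = mat_mul(result, base)
        base = mat_mul(base, base)
        n >>= 1
    return result


# Hypothetical PAM1-like matrix for a three-letter alphabet (assumption, not real data).
# Each row sums to 1: a residue stays unchanged with probability 0.99
# and changes to each of the other two letters with probability 0.005.
TOY_PAM1 = [
    [0.990, 0.005, 0.005],
    [0.005, 0.990, 0.005],
    [0.005, 0.005, 0.990],
]

# Extrapolate to a larger evolutionary distance, analogous to PAM250.
toy_pam250 = mat_pow(TOY_PAM1, 250)
for row in toy_pam250:
    print(["%.4f" % p for p in row])
```

Each row of the result is still a probability distribution, but the chance of a residue remaining unchanged is much lower than in the PAM1-like starting matrix.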

There are some limitations to the PAM matrices, which make the BLOSUM matrices somewhat more attractive. The dataset on which the initial PAM matrices were built is very old by now, and the PAM matrices assume that all amino acids mutate at the same rate, which is not a correct assumption.

BLOSUM
In 1992, 14 years after the PAM matrices were published, the BLOSUM matrices (BLOcks SUbstitution Matrix) were developed and published [Henikoff and Henikoff, 1992].

Henikoff et al. wanted to model more divergent proteins, so they used locally aligned sequences where none of the aligned sequences share less than 62% identity. This resulted in a scoring matrix called BLOSUM62. In contrast to the PAM matrices, the BLOSUM matrices are calculated from alignments without gaps, taken from the BLOCKS database (http://blocks.fhcrc.org/).

Sean Eddy wrote a paper reviewing the BLOSUM62 substitution matrix and how its scores are calculated [Eddy, 2004].
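
As a rough illustration of the score calculation mentioned above, BLOSUM-style scores are log-odds values: the observed frequency of an aligned residue pair is compared with the frequency expected by chance, and the base-2 logarithm of that ratio is scaled to half-bit units and rounded. The Python sketch below assumes this general form. The function name log_odds_score and the example frequencies are hypothetical and are not taken from the BLOCKS database.

```python
import math


def log_odds_score(pair_freq, freq_a, freq_b, scale=2.0):
    """Return a rounded log-odds score for one residue pair.

    pair_freq: observed frequency of the aligned pair (a, b)
    freq_a, freq_b: background frequencies of residues a and b
    scale: 2.0 gives half-bit units, as used for BLOSUM62
    """
    expected = freq_a * freq_b
    return round(scale * math.log2(pair_freq / expected))


# Made-up frequencies for illustration only (assumption, not real data).
# A pair seen twice as often as expected by chance gets a positive score.
print(log_odds_score(0.02, 0.1, 0.1))
# A pair seen four times less often than expected gets a negative score.
print(log_odds_score(0.0025, 0.1, 0.1))
```

Positive scores therefore mark substitutions that occur more often than chance in the aligned blocks, and negative scores mark substitutions that are rarer than chance.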