Read mapping formats

| File type | Suffix | Import | Export | Description |
|---|---|---|---|---|
| ACE | .ace | X | X | No chromatogram or quality score |
| AGP | .agp/.fa | | X | Exports scaffolded contigs (see below) |
| BAM | .bam | X | X | Compressed version of SAM. See SAM import |
| CLC | .clc | X | X | Rich format including all information |
| CLC Assembly File | .cas | X | | Output from the CLC Assembly Cell |
| SAM | .sam | X | X | Sequence Alignment/Map. See Sequence assembly files |
| Mapping coverage | .tsv | | X | Detailed per-base info on coverage (see below) |

Special note about AGP format

Both sequence lists and contigs with mapped reads can be used as input. Based on annotations of type Scaffold, which are added automatically when de novo assembly is run with the scaffold option, the contigs are broken up before being exported as FASTA. The resulting AGP file describes how the contigs relate to each other.
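
To illustrate the kind of information an AGP file carries, here is a minimal Python sketch that groups AGP lines by scaffold. It assumes the export follows the nine-column, tab-separated layout of the public AGP specification, in which component types N and U denote gaps; whether the exported file matches that layout in every detail is an assumption. The function name `parse_agp`, the dictionary keys, and the sample scaffold and contig names are invented for illustration.

```python
from typing import Dict, List


def parse_agp(text: str) -> Dict[str, List[dict]]:
    """Group AGP lines by object (scaffold) name, in file order."""
    scaffolds: Dict[str, List[dict]] = {}
    for line in text.splitlines():
        # Skip blank lines and comment/header lines, which start with '#'.
        if not line.strip() or line.startswith("#"):
            continue
        f = line.split("\t")
        entry = {"start": int(f[1]), "end": int(f[2]), "part": int(f[3])}
        if f[4] in ("N", "U"):
            # Gap line: columns 6-8 hold gap length, gap type and linkage.
            entry.update(kind="gap", length=int(f[5]), gap_type=f[6], linkage=f[7])
        else:
            # Component line: columns 6-9 hold the contig id, its range and orientation.
            entry.update(kind="component", id=f[5], comp_start=int(f[6]),
                         comp_end=int(f[7]), orientation=f[8])
        scaffolds.setdefault(f[0], []).append(entry)
    return scaffolds


# Invented example: two contigs joined by a 100 bp gap.
SAMPLE_AGP = (
    "scaffold_1\t1\t500\t1\tW\tcontig_1\t1\t500\t+\n"
    "scaffold_1\t501\t600\t2\tN\t100\tscaffold\tyes\tpaired-ends\n"
    "scaffold_1\t601\t1000\t3\tW\tcontig_2\t1\t400\t+\n"
)
layout = parse_agp(SAMPLE_AGP)
contig_order = [e["id"] for e in layout["scaffold_1"] if e["kind"] == "component"]
```

In this sketch, `contig_order` lists the contigs of a scaffold in the order they appear, which is the relationship the accompanying FASTA file alone cannot express.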

Export of coverage information from sequence alignments

Coverage information from read mappings can be exported in a tabular format using the Mapping coverage export. The output contains information on the number of nucleotides aligned to positions in reference sequences. Insertions are also reported as described below, while deletions are reported as reference regions without read coverage. Both stand-alone read mappings and read tracks can be used as input.

The exported file contains the following columns:

| Column | Description |
|---|---|
| 1 | Reference name |
| 2 | Reference position |
| 3 | Reference sub-position (insertion) |
| 4 | Reference symbol |
| 5 | Number of A's |
| 6 | Number of C's |
| 7 | Number of G's |
| 8 | Number of T's |
| 9 | Number of N's |
| 10 | Number of Gaps |
| 11 | Total number of reads covering the position |
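
As a rough sketch of how the eleven columns could be read in a script, the Python below turns each data row into a dictionary. It assumes the file is tab-separated (suggested by the .tsv suffix) and skips any row whose second field is not a number, so that a header line, if one is present, is ignored. The short column keys, the function name `read_coverage`, and the sample rows are invented for illustration.

```python
import csv
import io
from typing import Dict, Iterator

# Short keys for the eleven columns, in file order (names are hypothetical).
COLUMNS = [
    "reference", "position", "sub_position", "symbol",
    "A", "C", "G", "T", "N", "gaps", "total",
]


def read_coverage(text: str) -> Iterator[Dict[str, object]]:
    """Yield one dict per data row, keyed by the names in COLUMNS."""
    for fields in csv.reader(io.StringIO(text), delimiter="\t"):
        # Skip blank lines and a possible header row (non-numeric position).
        if len(fields) != len(COLUMNS) or not fields[1].isdigit():
            continue
        row: Dict[str, object] = dict(zip(COLUMNS, fields))
        row["position"] = int(fields[1])
        # Columns 5-11 are counts; convert them to integers.
        for i in range(4, len(COLUMNS)):
            row[COLUMNS[i]] = int(fields[i])
        yield row


# Invented sample rows, not output from a real export.
SAMPLE = (
    "chr1\t1\t-\tA\t5\t0\t0\t0\t0\t0\t5\n"
    "chr1\t2\t-\tC\t0\t4\t0\t0\t0\t1\t5\n"
)
rows = list(read_coverage(SAMPLE))
low_coverage = [r["position"] for r in rows if r["total"] < 10]
```

The sub-position and reference symbol are kept as plain strings here, so their interpretation is left to the caller.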

When a row corresponds to a position defined in the reference, the Reference sub-position column is empty (indicated by a - symbol). For an insertion, this column instead contains an index into the insertion (a number between 1 and the length of the insertion), the Reference symbol column is empty, and the Reference position column contains the position of the last reference base before the insertion.
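
The sub-position rule can be sketched in Python as follows. The example assumes that the empty marker appears as a literal "-" in the file and that rows have already been reduced to (reference name, reference position, sub-position) tuples; the function names and sample values are invented for illustration.

```python
from typing import Dict, Iterable, Optional, Tuple


def parse_sub_position(value: str) -> Optional[int]:
    """Return the 1-based insertion index, or None for a reference position."""
    return None if value.strip() == "-" else int(value)


def insertion_lengths(
    rows: Iterable[Tuple[str, int, str]]
) -> Dict[Tuple[str, int], int]:
    """Map (reference name, anchor position) to the largest insertion index seen."""
    lengths: Dict[Tuple[str, int], int] = {}
    for reference, position, sub_position in rows:
        index = parse_sub_position(sub_position)
        if index is None:
            continue  # ordinary reference position, not part of an insertion
        key = (reference, position)
        lengths[key] = max(lengths.get(key, 0), index)
    return lengths


# Invented rows: a two-base insertion reported at reference position 10.
example = [
    ("chr1", 10, "-"),
    ("chr1", 10, "1"),
    ("chr1", 10, "2"),
    ("chr1", 11, "-"),
]
print(insertion_lengths(example))  # {('chr1', 10): 2}
```

Because the sub-position runs from 1 to the length of the insertion, the largest index seen at an anchor position gives the length of the insertion reported there.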