Read mapping formats
File type | Suffix | Import | Export | Description |
---|---|---|---|---|
ACE | .ace | X | X | No chromatogram or quality score |
AGP | .agp/.fa | | X | Exports scaffolded contigs (see below) |
BAM | .bam | X | X | Compressed version of SAM. See SAM import |
CLC | .clc | X | X | Rich format including all information |
CLC Assembly File | .cas | X | | Output from the CLC Assembly Cell |
SAM | .sam | X | X | Sequence Alignment/Map. See Sequence assembly files |
Mapping coverage | .tsv | | X | Detailed per-base info on coverage (see below) |
Note about BAM export
Index files can be created as part of BAM exports.
Note about AGP format
Both sequence lists and contigs with reads mapped can be used as input. The contigs are broken up at annotations of type Scaffold (which are added automatically when running the de novo assembly with the scaffold option) before being exported as FASTA. The AGP file produced describes how the contigs relate to each other.
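As an illustration of how the contig relationships in the exported AGP file could be read back, here is a minimal sketch. It assumes the file follows the tab-separated column layout of the NCBI AGP specification; this page does not state which AGP version is written, so that is an assumption to check against an actual export. The function name `parse_agp` is hypothetical.

```python
from typing import Dict, Iterable, Iterator, Union

Record = Dict[str, Union[str, int]]


def parse_agp(lines: Iterable[str]) -> Iterator[Record]:
    """Yield one dict per AGP line (assumed NCBI AGP column layout)."""
    for line in lines:
        line = line.rstrip("\r\n")
        # Skip blank lines and comment/header lines such as "##agp-version".
        if not line or line.startswith("#"):
            continue
        f = line.split("\t")
        record: Record = {
            "object": f[0],            # scaffold name
            "object_beg": int(f[1]),   # 1-based start on the scaffold
            "object_end": int(f[2]),   # 1-based end on the scaffold
            "part_number": int(f[3]),
            "component_type": f[4],
        }
        if f[4] in ("N", "U"):
            # Gap line: columns 6-8 describe the gap between contigs.
            record["gap_length"] = int(f[5])
            record["gap_type"] = f[6]
            record["linkage"] = f[7]
        else:
            # Component line: columns 6-9 point into a contig in the FASTA file.
            record["component_id"] = f[5]
            record["component_beg"] = int(f[6])
            record["component_end"] = int(f[7])
            record["orientation"] = f[8]
        yield record
```

Grouping the yielded records by `object` and ordering them by `part_number` gives the contig and gap order within each scaffold.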
Export of coverage information from sequence alignments
Coverage information from read mappings can be exported in a tabular format using the Mapping coverage export. The output contains information on the number of nucleotides aligned to positions in reference sequences. Insertions are also reported, as described below, while deletions are reported as reference regions without read coverage. Both stand-alone read mappings and reads tracks can be used as input.
The exported file contains the following columns:
Column | Description |
1 | Reference name |
2 | Reference position |
3 | Reference sub-position (insertion) |
4 | Reference symbol |
5 | Number of A's |
6 | Number of C's |
7 | Number of G's |
8 | Number of T's |
9 | Number of N's |
10 | Number of Gaps |
11 | Total number of reads covering the position |
The Reference sub-position column is empty (indicated by a - symbol) when the row describes a position that is defined in the reference. For an insertion, this column contains an index into the insertion (a number between 1 and the length of the insertion), the Reference symbol column is empty, and the Reference position column contains the last reference position before the insertion.
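The column layout above can be read with a short script. The following is a sketch under stated assumptions, not a description of the exporter itself: it assumes the file is tab-separated with the eleven columns in the order listed, that an empty cell may appear either as `-` or as an empty string, and that a header row may or may not be present (controlled by the hypothetical `has_header` parameter). The names `CoverageRow` and `parse_coverage` are illustrative only.

```python
import csv
import io
from typing import Dict, Iterator, NamedTuple, Optional, TextIO


class CoverageRow(NamedTuple):
    reference: str
    position: int
    sub_position: Optional[int]  # None when the reference is defined here
    symbol: Optional[str]        # None for rows inside an insertion
    counts: Dict[str, int]       # keys: A, C, G, T, N, gap
    total: int                   # reads covering the position


def _empty(cell: str) -> bool:
    # Assumption: empty cells are written as "-" or left blank.
    return cell.strip() in ("", "-")


def parse_coverage(handle: TextIO, has_header: bool = False) -> Iterator[CoverageRow]:
    reader = csv.reader(handle, delimiter="\t")
    if has_header:
        next(reader, None)  # discard the header row if one is present
    for fields in reader:
        if len(fields) < 11:
            continue  # skip blank or truncated lines
        ref, pos, sub, sym = fields[0:4]
        a, c, g, t, n, gap, total = (int(x) for x in fields[4:11])
        yield CoverageRow(
            reference=ref,
            position=int(pos),
            sub_position=None if _empty(sub) else int(sub),
            symbol=None if _empty(sym) else sym,
            counts={"A": a, "C": c, "G": g, "T": t, "N": n, "gap": gap},
            total=total,
        )


def is_insertion(row: CoverageRow) -> bool:
    # Insertion rows carry a sub-position index and no reference symbol.
    return row.sub_position is not None
```

Rows for which `is_insertion` returns true share the Reference position of the preceding reference base, so they can be filtered out before computing per-reference-position coverage.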