- Import and export of SAM/BAM/CRAM format files
- AGP export
- Export of coverage information from read mappings
Read mapping formats
Import and export of SAM/BAM/CRAM format files
Import of SAM, BAM and CRAM format files is described in Importing SAM, BAM and CRAM mapping files.
The format specification when exporting to SAM, BAM or CRAM format is described in SAM/BAM/CRAM export format specification. Index files can be created as part of BAM and CRAM exports.
AGP export
Sequence lists and read mappings generated by de novo assembly can be exported using the AGP exporter. On export, contigs are split up based on annotations of type Scaffold. These annotations are added when the "Perform scaffolding" option is enabled when assembling paired reads. Contig sequences are exported to a single FASTA format file, with the accompanying AGP format file containing information about how the contigs relate to one another.
AGP export is described further in AGP export.
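To illustrate how an AGP file relates contigs to scaffolds, the following is a minimal Python sketch that reads AGP records. It assumes the nine tab-separated columns of the public NCBI AGP 2.x specification; whether the exporter writes exactly this version, and the scaffold and contig names in the sample, are assumptions made for illustration only. The names `parse_agp`, `Component` and `Gap` are hypothetical and not part of the Workbench.

```python
from typing import List, NamedTuple, Union


class Component(NamedTuple):
    """A sequence line: a piece of a contig placed within a scaffold."""
    obj: str
    obj_beg: int
    obj_end: int
    part_number: int
    component_type: str
    component_id: str
    component_beg: int
    component_end: int
    orientation: str


class Gap(NamedTuple):
    """A gap line (component type N or U) between two contigs."""
    obj: str
    obj_beg: int
    obj_end: int
    part_number: int
    component_type: str
    gap_length: int
    gap_type: str
    linkage: str
    linkage_evidence: str


def parse_agp(text: str) -> List[Union[Component, Gap]]:
    """Parse AGP text into Component and Gap records, skipping comments."""
    records: List[Union[Component, Gap]] = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        f = line.split("\t")
        if len(f) != 9:
            raise ValueError(f"expected 9 columns, got {len(f)}: {line!r}")
        head = (f[0], int(f[1]), int(f[2]), int(f[3]), f[4])
        if f[4] in ("N", "U"):
            records.append(Gap(*head, int(f[5]), f[6], f[7], f[8]))
        else:
            records.append(Component(*head, f[5], int(f[6]), int(f[7]), f[8]))
    return records


# Made-up example: two contigs joined by a 100 bp scaffolding gap.
SAMPLE_AGP = (
    "##agp-version 2.1\n"
    "scaffold_1\t1\t1000\t1\tW\tcontig_1\t1\t1000\t+\n"
    "scaffold_1\t1001\t1100\t2\tN\t100\tscaffold\tyes\tpaired-ends\n"
    "scaffold_1\t1101\t1600\t3\tW\tcontig_2\t1\t500\t+\n"
)

records = parse_agp(SAMPLE_AGP)
```

In this sketch, each `Component` record points at a sequence in the accompanying FASTA file through `component_id`, while each `Gap` record describes the spacing between two neighbouring contigs in the same scaffold.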
Export of coverage information from read mappings
Coverage information from read mappings can be exported in a tabular format using Mapping Coverage export. For each position of the reference sequences, the output reports how many of each nucleotide are aligned to that position. Insertions are also reported, as described below, while deletions are reported as reference regions without read coverage. Both stand-alone read mappings and reads tracks can be used as input.
The exported file contains the following columns by default:
Column | Description |
1 | Reference name |
2 | Reference position |
3 | Reference sub-position (insertion) |
4 | Reference symbol |
5 | Number of As |
6 | Number of Cs |
7 | Number of Gs |
8 | Number of Ts |
9 | Number of Ns |
10 | Number of Gaps |
11 | Total number of reads covering the position |
The Reference sub-position column is empty (indicated by a - symbol) when the reference is defined at a given position. For an insertion, this column contains an index into the insertion (a number between 1 and the length of the insertion), the Reference symbol column is empty, and the Reference position column contains the position of the last reference base before the insertion.
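The column layout above can be read with a short script. The following Python sketch assumes the table was exported as tab-separated text with the eleven default columns in the order listed, and that an empty Reference symbol is also written as a - symbol; both are assumptions, as are the function names and the sample values, which are made up for illustration.

```python
import csv
import io
from typing import Dict, List, NamedTuple, Optional, Tuple


class CoverageRow(NamedTuple):
    reference: str
    position: int
    sub_position: Optional[int]  # None when the reference is defined here
    symbol: str
    counts: Dict[str, int]       # counts of A, C, G, T and N
    gaps: int
    total: int


def parse_coverage(text: str, delimiter: str = "\t") -> List[CoverageRow]:
    """Parse exported coverage rows, skipping header or malformed lines."""
    rows: List[CoverageRow] = []
    for fields in csv.reader(io.StringIO(text), delimiter=delimiter):
        if len(fields) < 11 or not fields[1].strip().isdigit():
            continue
        sub = fields[2].strip()
        rows.append(CoverageRow(
            reference=fields[0],
            position=int(fields[1]),
            sub_position=None if sub in ("", "-") else int(sub),
            symbol=fields[3].strip(),
            counts=dict(zip("ACGTN", map(int, fields[4:9]))),
            gaps=int(fields[9]),
            total=int(fields[10]),
        ))
    return rows


def insertion_lengths(rows: List[CoverageRow]) -> Dict[Tuple[str, int], int]:
    """Map (reference, position) to the length of the insertion after it."""
    lengths: Dict[Tuple[str, int], int] = {}
    for row in rows:
        if row.sub_position is not None:
            key = (row.reference, row.position)
            lengths[key] = max(lengths.get(key, 0), row.sub_position)
    return lengths


# Made-up sample: a two-base insertion after position 10 of "chr1".
SAMPLE = (
    "chr1\t10\t-\tA\t5\t0\t0\t0\t0\t0\t5\n"
    "chr1\t10\t1\t-\t0\t2\t0\t0\t0\t3\t5\n"
    "chr1\t10\t2\t-\t0\t0\t2\t0\t0\t3\t5\n"
    "chr1\t11\t-\tC\t0\t5\t0\t0\t0\t0\t5\n"
)

rows = parse_coverage(SAMPLE)
insertions = insertion_lengths(rows)
```

Rows with a sub-position share the Reference position of the preceding reference base, so grouping on reference name and position is enough to recover each insertion. If a different separator was chosen on export, pass it as `delimiter`.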
See Export of tables for detailed information about exporting tabular data from the CLC Genomics Workbench.