Sequence data formats
Sequence data formats that can be imported using Standard Import are listed in the table below.
Note: Sequence data types not listed in the table below have dedicated import tools. These include high-throughput sequencing data from Illumina, Ion Torrent and other vendors, as well as high-throughput FASTA and trace files. See Import high-throughput sequencing data.
File type | Suffix | Import | Export | Description |
AB1 | .ab1 | X | | Including chromatograms |
ABI | .abi | X | | Including chromatograms |
CLC | .clc | X | X | Rich format including all information |
Clone manager | .cm5 | X | | Clone manager sequence format |
DNAstrider | .str/.strider | X | X | |
DS Gene | .bsml | X | | |
EMBL | .emb/.embl | X | X | Rich information incl. annotations (nucs only) |
FASTA | .fa/.fsa/.fasta | X | X | Simple format, name & description |
FASTQ | .fastq | X | X | Simple format, name & description |
GenBank | .gbk/.gb/.gp/.gbff | X | X | Rich information incl. annotations |
Gene Construction Kit | .gcc | X | | |
Lasergene | .pro/.seq | X | | |
Nexus | .nxs/.nexus | X | X | |
Phred | .phd | X | | Including chromatograms |
PIR (NBRF) | .pir | X | X | Simple format, name & description |
Raw sequence | any | X | | Only sequence (no name) |
SCF2 | .scf | X | | Including chromatograms |
SCF3 | .scf | X | X | Including chromatograms |
Sequence Comma separated values | .csv | X | X | Simple format. One seq per line: name, description (optional), sequence |
Staden | .sdn | X | | |
Swiss-Prot | .swp | X | X | Rich information incl. annotations (only peptides) |
Tab delimited text | .txt | | X | Annotations in tab delimited text format |
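The Sequence Comma separated values layout (one sequence per line: name, optional description, sequence) can be illustrated with a short Python sketch. This is not part of the Workbench and does not reflect its importer. The function name read_sequence_csv is hypothetical, and the sketch assumes that a two-field line holds name and sequence while a three-field line holds name, description and sequence.

```python
import csv
import io


def read_sequence_csv(text):
    """Parse lines of the form: name, description (optional), sequence.

    Assumption for illustration: 2 fields = name, sequence;
    3 fields = name, description, sequence.
    """
    records = []
    for row in csv.reader(io.StringIO(text)):
        fields = [field.strip() for field in row]
        if not any(fields):
            continue  # skip blank lines
        if len(fields) == 2:
            name, sequence = fields
            description = ""
        elif len(fields) == 3:
            name, description, sequence = fields
        else:
            raise ValueError(f"expected 2 or 3 fields, got {len(fields)}: {row!r}")
        records.append(
            {"name": name, "description": description, "sequence": sequence}
        )
    return records


example = "seq1,first example,ACGT\nseq2,TTGA\n"
records = read_sequence_csv(example)
```

Checking a file this way before import can help catch lines with missing or extra fields.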
Additional notes about working with trace data
- When trace data is imported, both the called bases in the file and the chromatogram information associated with those bases are imported. If the base calls in the file have already been trimmed, the part of the chromatogram not associated with base calls is not imported.
- The Trim Sequences tool, described in Trim sequences, adds annotations to trimmed regions. When exporting to FASTA format, there is an option to remove sequence ends covered by Trim annotations.
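The effect of removing sequence ends covered by Trim annotations can be sketched in Python. This is only an illustration of the idea, not the Workbench's implementation. The function name strip_trimmed_ends and the representation of annotations as 0-based, half-open (start, end) pairs are assumptions made for the example, as is the choice to leave annotated regions that do not touch either end in place.

```python
def strip_trimmed_ends(sequence, trim_regions):
    """Return the sequence without ends covered by trim regions.

    trim_regions is a list of (start, end) pairs, 0-based and half-open.
    This representation is hypothetical and used only for illustration.
    Regions that do not reach either end of the sequence are ignored.
    """
    left, right = 0, len(sequence)

    # Advance the left boundary past any region covering the current start.
    for start, end in sorted(trim_regions):
        if start <= left < end:
            left = end

    # Pull the right boundary back past any region covering the current end.
    for start, end in sorted(trim_regions, key=lambda r: r[1], reverse=True):
        if start < right <= end:
            right = start

    if left >= right:
        return ""  # the whole sequence is covered by trim regions
    return sequence[left:right]


trimmed = strip_trimmed_ends("NNACGTACGTNN", [(0, 2), (10, 12)])
```

Here the two regions cover the first and last two positions, so only the central part of the sequence remains.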