Sequence data formats

The table below lists the sequence data formats supported by Standard Import and indicates which formats are also available for export.

Note: Sequence data types not listed in the table below have dedicated import tools. These include high-throughput sequencing data from Illumina, Ion Torrent and other vendors, as well as high-throughput FASTA and trace files. See Import high-throughput sequencing data.

| File type | Suffix | Import | Export | Description |
|---|---|---|---|---|
| AB1 | .ab1 | X | | Including chromatograms |
| ABI | .abi | X | | Including chromatograms |
| CLC | .clc | X | X | Rich format including all information |
| Clone manager | .cm5 | X | | Clone manager sequence format |
| DNAstrider | .str/.strider | X | X | |
| DS Gene | .bsml | X | | |
| EMBL | .emb/.embl | X | X | Rich information incl. annotations (nucs only) |
| FASTA | .fa/.fsa/.fasta | X | X | Simple format, name & description |
| FASTQ | .fastq | X | X | Simple format, name & description |
| GenBank | .gbk/.gb/.gp/.gbff | X | X | Rich information incl. annotations |
| Gene Construction Kit | .gcc | X | | |
| Lasergene | .pro/.seq | X | | |
| Nexus | .nxs/.nexus | X | X | |
| Phred | .phd | X | | Including chromatograms |
| PIR (NBRF) | .pir | X | X | Simple format, name & description |
| Raw sequence | any | X | | Only sequence (no name) |
| SCF2 | .scf | X | | Including chromatograms |
| SCF3 | .scf | X | X | Including chromatograms |
| Sequence Comma separated values | .csv | X | X | Simple format. One seq per line: name, description (optional), sequence |
| Staden | .sdn | X | | |
| Swiss-Prot | .swp | X | X | Rich information incl. annotations (only peptides) |
| Tab delimited text | .txt | | X | Annotations in tab delimited text format |
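
The "Sequence Comma separated values" entry in the table describes a one-sequence-per-line layout: name, optional description, sequence. The following Python sketch shows one way such a file could be prepared and read back outside the Workbench. It is an illustration only: the function names are hypothetical, and it assumes that there is no header row, that the delimiter is a plain comma, and that a missing description is written by omitting the column. None of these details are confirmed by the table above, so check a file exported from the Workbench before relying on them.

```python
import csv
import io


def write_sequence_csv(records, handle):
    """Write (name, description, sequence) tuples, one sequence per line.

    Assumption: an empty description is written by leaving the column out,
    giving two-column rows (name, sequence).
    """
    writer = csv.writer(handle, lineterminator="\n")
    for name, description, sequence in records:
        if description:
            writer.writerow([name, description, sequence])
        else:
            writer.writerow([name, sequence])


def read_sequence_csv(handle):
    """Read rows back into (name, description, sequence) tuples."""
    records = []
    for row in csv.reader(handle):
        if not row:
            continue  # skip blank lines
        if len(row) == 2:
            name, sequence = row
            description = ""
        elif len(row) == 3:
            name, description, sequence = row
        else:
            raise ValueError(f"expected 2 or 3 columns, got {len(row)}")
        records.append((name, description, sequence))
    return records


if __name__ == "__main__":
    example = [("seq1", "example entry", "ACGTACGT"), ("seq2", "", "GGCCTTAA")]
    buffer = io.StringIO()
    write_sequence_csv(example, buffer)
    print(buffer.getvalue(), end="")
```

Using the standard `csv` module rather than manual string joining means that a description containing a comma is quoted automatically instead of being split into extra columns.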

Additional notes about working with trace data