Sequence data formats

Note that high-throughput sequencing data from Illumina, SOLiD, IonTorrent and 454, as well as high-throughput FASTA and trace files, are imported with a dedicated importer, as described in Import high-throughput sequencing data. These data can also be exported in FASTQ format (using NCBI/Sanger Phred quality scores).
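
As a brief illustration of the NCBI/Sanger quality convention mentioned above: in that convention each base quality is stored as a single ASCII character whose code is the Phred score plus 33. The following is a minimal sketch in Python; the function names and the example record are hypothetical and are not part of the Workbench.

```python
def phred33_to_scores(quality):
    """Decode a Sanger/NCBI (Phred+33) quality string into integer scores."""
    return [ord(ch) - 33 for ch in quality]


def scores_to_phred33(scores):
    """Encode integer Phred scores as a Sanger/NCBI (Phred+33) quality string."""
    return "".join(chr(score + 33) for score in scores)


# A hypothetical four-line FASTQ record: header, sequence, separator, qualities.
record = "@read1\nACGT\n+\nII5!\n"
header, sequence, _, quality = record.strip().split("\n")

scores = phred33_to_scores(quality)
print(header[1:], sequence, scores)  # read1 ACGT [40, 40, 20, 0]
```

The quality string always has the same length as the sequence, one character per base.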

File type                        Suffix           Import  Export  Description
AB1                              .ab1             X       -       Including chromatograms
ABI                              .abi             X       -       Including chromatograms
CLC                              .clc             X       X       Rich format including all information
Clone Manager                    .cm5             X       -
CSV export                       .csv             -       X       Annotations in CSV format
DNAstrider                       .str/.strider    X       X
DS Gene                          .bsml            X       -
Embl                             .embl            X       X       Rich information incl. annotations (nucleotides only)
FASTA                            .fsa/.fasta      X       X       Simple format, name & description
GCG sequence                     .gcg             X       -       Rich information incl. annotations
GenBank                          .gbk/.gb/.gp     X       X       Rich information incl. annotations
Gene Construction Kit            .gck             X       -
Lasergene                        .pro/.seq        X       -
Nexus                            .nxs/.nexus      X       X
Phred                            .phd             X       -       Including chromatograms
PIR (NBRF)                       .pir             X       -       Simple format, name & description
Raw sequence                     any              X       -       Only sequence (no name)
SCF2                             .scf             X       -       Including chromatograms
SCF3                             .scf             X       X       Including chromatograms
Sequence Comma separated values  .csv             X       X       Simple format. One sequence per line: name, description (optional), sequence
Staden                           .sdn             X       -
Swiss-Prot                       .swp             X       X       Rich information incl. annotations (peptides only)
Tab delimited text               .txt             -       X       Annotations in tab delimited text format
Vector NTI archives              .ma4/.pa4/.oa4   X       -       Archives in rich format
Vector NTI Database              -                X       -       Special import of a full database
Zip export                       .zip             -       X       Selected files in CLC format
Zip import                       .zip/.gzip/.tar  X       -       Contained files/folder structure
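
To make the "Sequence Comma separated values" layout listed above more concrete (one sequence per line: name, optional description, sequence), the following Python sketch reads such a file into simple tuples. It is an illustration only: the function name and sample data are hypothetical, and it assumes that a two-field line means the description was omitted. How the Workbench importer itself treats quoting, header lines or malformed rows is not covered here.

```python
import csv
import io


def parse_sequence_csv(text):
    """Return a list of (name, description, sequence) tuples.

    Assumption: a line with two fields is name,sequence and a line with
    three fields is name,description,sequence.
    """
    records = []
    for row in csv.reader(io.StringIO(text)):
        fields = [field.strip() for field in row]
        if not any(fields):
            continue  # skip blank lines
        if len(fields) == 2:
            name, sequence = fields
            description = ""
        elif len(fields) == 3:
            name, description, sequence = fields
        else:
            raise ValueError("expected 2 or 3 fields, got %d" % len(fields))
        records.append((name, description, sequence))
    return records


# Hypothetical file content with and without a description.
example = "seq1,first test sequence,ACGTACGT\nseq2,GGCCTTAA\n"
for name, description, sequence in parse_sequence_csv(example):
    print(name, repr(description), sequence)
```

A description that itself contains a comma would need to be quoted for this sketch to read it as a single field.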