Appendix A - Sequence file naming pattern
RNA-seq Analysis Portal supports analysis of:
- FASTQ files generated by Illumina, Element Biosciences, and MGI sequencers
- BAM files from Thermo Fisher Ion Torrent sequencers
To ensure that FASTQ files are correctly grouped into samples during upload, file names must follow one of the supported naming conventions described below.
If you have FASTQ files from a Thermo Fisher Ion Torrent sequencer, you can manually rename them to match the supported FASTQ naming patterns.
File name requirements
The following rules apply only to the <sample-name> part of the file name. Other parts of the file name (such as lane number, read number, or file extension) are not affected.
For the <sample-name> part of the file name:
- Special characters are not supported, including:
", #, %, <, >, ?, [, ], \, ^, {, }, |
- Use of underscores (_) is not recommended.
- Hyphens (-) are allowed.
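The rules above can be checked before upload with a short script. The following Python sketch is an illustration only, not part of RNA-seq Analysis Portal: the function name `check_sample_name` is hypothetical, and because the list of unsupported characters is introduced with "including", the check covers only the characters named here.

```python
# Characters listed above as unsupported in <sample-name>.
# The source list may not be exhaustive.
UNSUPPORTED_CHARS = set('"#%<>?[]\\^{}|')


def check_sample_name(sample_name):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    bad = sorted(set(sample_name) & UNSUPPORTED_CHARS)
    if bad:
        problems.append("unsupported characters: " + " ".join(bad))
    if "_" in sample_name:
        # Underscores are allowed but not recommended.
        problems.append("underscore present (not recommended)")
    return problems


print(check_sample_name("TC-35-A"))  # []
print(check_sample_name("TC_35#A"))
# ['unsupported characters: #', 'underscore present (not recommended)']
```

Pass only the <sample-name> part to the check, because the other parts of a file name legitimately contain underscores.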
Illumina and Element Biosciences (legacy naming)
Naming convention:
<sample-name>_S<sample_number>_L<lane-number (zero-padded to 3 digits)>_R<read-number>_001.fastq(.gz)
Single-lane, paired-end data
Two FASTQ files grouped into one sample (TC-35-A_S1):
TC-35-A_S1_L001_R1_001.fastq.gz
TC-35-A_S1_L001_R2_001.fastq.gz
Four-lane, paired-end data
Eight FASTQ files grouped into one sample (501-708-SEQC-77-1_S1):
501-708-SEQC-77-1_S1_L001_R1_001.fastq.gz
501-708-SEQC-77-1_S1_L001_R2_001.fastq.gz
501-708-SEQC-77-1_S1_L002_R1_001.fastq.gz
501-708-SEQC-77-1_S1_L002_R2_001.fastq.gz
501-708-SEQC-77-1_S1_L003_R1_001.fastq.gz
501-708-SEQC-77-1_S1_L003_R2_001.fastq.gz
501-708-SEQC-77-1_S1_L004_R1_001.fastq.gz
501-708-SEQC-77-1_S1_L004_R2_001.fastq.gz
Single-lane, single-read data
Each FASTQ file is imported as a separate sample:
QL4_S4_L001_R1_001.fastq
QL6_S6_L001_R1_001.fastq
Imported samples: QL4_S4 and QL6_S6
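The grouping shown in the legacy naming examples can be sketched in Python. This is an illustration only, not the portal's implementation: the regular expression, the function name `group_legacy_fastq`, and the use of `<sample-name>_S<sample_number>` as the grouping key are assumptions inferred from the examples.

```python
import re
from collections import defaultdict

# Assumed pattern for the legacy convention:
# <sample-name>_S<sample_number>_L<lane, 3 digits>_R<read-number>_001.fastq(.gz)
LEGACY_PATTERN = re.compile(
    r"^(?P<sample>.+)_S(?P<number>\d+)_L(?P<lane>\d{3})"
    r"_R(?P<read>\d+)_001\.fastq(?:\.gz)?$"
)


def group_legacy_fastq(file_names):
    """Group file names by '<sample-name>_S<sample_number>' (hypothetical helper)."""
    groups = defaultdict(list)
    for name in file_names:
        match = LEGACY_PATTERN.match(name)
        if match is None:
            raise ValueError(f"not a legacy-style FASTQ name: {name}")
        groups[f"{match['sample']}_S{match['number']}"].append(name)
    return dict(groups)


files = [
    "TC-35-A_S1_L001_R1_001.fastq.gz",
    "TC-35-A_S1_L001_R2_001.fastq.gz",
    "QL4_S4_L001_R1_001.fastq",
    "QL6_S6_L001_R1_001.fastq",
]
for sample, members in group_legacy_fastq(files).items():
    print(sample, len(members))
# TC-35-A_S1 2
# QL4_S4 1
# QL6_S6 1
```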
Element Biosciences (default naming)
Naming convention:
<sample-name>_R<read-number>.fastq(.gz)
The "R" before the read number may be included or omitted.
Single-lane, paired-end data
Two FASTQ files grouped into one sample (Sample28):
Sample28_R1.fastq.gz
Sample28_R2.fastq.gz
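As a rough illustration, the default Element Biosciences convention can be parsed with a regular expression. The sketch below is not the portal's implementation. The pattern and the function name `parse_element_name` are assumptions, and the optional "R" reflects one reading of the note above (that the read number may appear as `_R1` or `_1`).

```python
import re

# Assumed pattern: <sample-name>_R<read-number>.fastq(.gz), with "R" optional.
ELEMENT_PATTERN = re.compile(
    r"^(?P<sample>.+)_R?(?P<read>\d+)\.fastq(?:\.gz)?$"
)


def parse_element_name(file_name):
    """Return (sample_name, read_number), or None if the name does not match."""
    match = ELEMENT_PATTERN.match(file_name)
    if match is None:
        return None
    return match["sample"], int(match["read"])


print(parse_element_name("Sample28_R1.fastq.gz"))  # ('Sample28', 1)
print(parse_element_name("Sample28_2.fastq.gz"))   # ('Sample28', 2)
print(parse_element_name("Sample28.fastq.gz"))     # None
```

This pattern is deliberately loose: a legacy-style name such as `TC-35-A_S1_L001_R1_001.fastq.gz` also matches it, with the trailing `001` read as the read number. A script that handles several conventions would need to test the more specific patterns first.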
MGI
Naming convention:
<sample-name>_L<lane-number>_<barcode ID>_R<read-number>_001.fastq(.gz)
The "R" before the read number may be included or omitted.
Single-lane, paired-end data
Two FASTQ files grouped into one sample (Sample49_11):
Sample49_L01_11_R1_001.fastq.gz
Sample49_L01_11_R2_001.fastq.gz
Two-lane, paired-end data
Four FASTQ files grouped into one sample (Sample50_12):
Sample50_L01_12_R1_001.fastq.gz
Sample50_L01_12_R2_001.fastq.gz
Sample50_L02_12_R1_001.fastq.gz
Sample50_L02_12_R2_001.fastq.gz
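In the MGI examples, the sample label combines the sample name with the barcode ID (Sample49_11, Sample50_12). The Python sketch below reproduces that grouping for illustration only. The regular expression, the function name `group_mgi_fastq`, the optional "R", and the assumption that a barcode ID contains no underscore are not taken from the portal.

```python
import re
from collections import defaultdict

# Assumed pattern:
# <sample-name>_L<lane-number>_<barcode ID>_R<read-number>_001.fastq(.gz)
MGI_PATTERN = re.compile(
    r"^(?P<sample>.+)_L(?P<lane>\d+)_(?P<barcode>[^_]+)"
    r"_R?(?P<read>\d+)_001\.fastq(?:\.gz)?$"
)


def group_mgi_fastq(file_names):
    """Map '<sample-name>_<barcode ID>' to sorted (lane, read) pairs."""
    groups = defaultdict(list)
    for name in file_names:
        match = MGI_PATTERN.match(name)
        if match is None:
            raise ValueError(f"not an MGI-style FASTQ name: {name}")
        key = f"{match['sample']}_{match['barcode']}"
        groups[key].append((int(match["lane"]), int(match["read"])))
    return {key: sorted(pairs) for key, pairs in groups.items()}


mgi_files = [
    "Sample49_L01_11_R1_001.fastq.gz",
    "Sample49_L01_11_R2_001.fastq.gz",
    "Sample50_L01_12_R1_001.fastq.gz",
    "Sample50_L01_12_R2_001.fastq.gz",
    "Sample50_L02_12_R1_001.fastq.gz",
    "Sample50_L02_12_R2_001.fastq.gz",
]
for sample, pairs in group_mgi_fastq(mgi_files).items():
    print(sample, pairs)
# Sample49_11 [(1, 1), (1, 2)]
# Sample50_12 [(1, 1), (1, 2), (2, 1), (2, 2)]
```

Listing the (lane, read) pairs per sample makes it easy to spot a missing mate or lane before upload. For example, Sample50_12 should show both reads for lanes L01 and L02.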
