The CLC Assembly Cell programs assume the single-file form for paired data by default. For paired data with separate files for the first and second members of each pair, both files must be given as input, and the two file names must be preceded by the `-i' option (for interleave). The order of the files on the command line matters: the first file should contain the first member of each pair, and the second file should contain the second member.
To further illustrate this, consider a situation where we have two fasta files like this (first.fasta):
>pair_1/1
ACTGTCTAGCTACTGCATTGACTGCGAC
>pair_2/1
TAGCGACGATGCTACTACTCTACTCGAC
>pair_3/1
GATCTCTAGGACTACGCTACGAGCCTCA
and this (second.fasta):
>pair_1/2
GGATCATCTACGTCATCGACTAGTACAC
>pair_2/2
AAGCGACACCTACTCATCGATCATCAGA
>pair_3/2
TATCGACTCAGACACTCTATACTACCAT
where pair_1/1 and pair_1/2 belong together, pair_2/1 and pair_2/2 belong together, etc. The programs expect to see these sequences as one fasta file like this (joint.fasta):
>pair_1/1
ACTGTCTAGCTACTGCATTGACTGCGAC
>pair_1/2
GGATCATCTACGTCATCGACTAGTACAC
>pair_2/1
TAGCGACGATGCTACTACTCTACTCGAC
>pair_2/2
AAGCGACACCTACTCATCGATCATCAGA
>pair_3/1
GATCTCTAGGACTACGCTACGAGCCTCA
>pair_3/2
TATCGACTCAGACACTCTATACTACCAT
This is accomplished using the `-i' option like this:
clc_mapper -o assembly.cas -d human.gb -q -p fb ss 180 250 -i first.fasta second.fasta
This is identical to:
clc_mapper -o assembly.cas -d human.gb -q -p fb ss 180 250 joint.fasta
Note that the `-i' option has to immediately precede the two input files.
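The relationship between the two separate files and the joint file can be sketched outside the CLC programs. The Python snippet below is only an illustration of the interleaved layout, not how the CLC programs implement `-i'. The function names read_fasta and interleave are hypothetical and chosen for this example, and the sketch assumes both inputs list the pairs in the same order.

```python
from itertools import zip_longest


def read_fasta(lines):
    """Yield (header, sequence) records from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line, []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def interleave(first_lines, second_lines):
    """Return FASTA lines alternating first and second pair members."""
    out = []
    pairs = zip_longest(read_fasta(first_lines), read_fasta(second_lines))
    for first_rec, second_rec in pairs:
        # Both files must hold the same number of reads to form pairs.
        if first_rec is None or second_rec is None:
            raise ValueError("input files contain different numbers of reads")
        out.extend([first_rec[0], first_rec[1], second_rec[0], second_rec[1]])
    return out


# Contents of first.fasta and second.fasta from the example above.
first = [
    ">pair_1/1", "ACTGTCTAGCTACTGCATTGACTGCGAC",
    ">pair_2/1", "TAGCGACGATGCTACTACTCTACTCGAC",
    ">pair_3/1", "GATCTCTAGGACTACGCTACGAGCCTCA",
]
second = [
    ">pair_1/2", "GGATCATCTACGTCATCGACTAGTACAC",
    ">pair_2/2", "AAGCGACACCTACTCATCGATCATCAGA",
    ">pair_3/2", "TATCGACTCAGACACTCTATACTACCAT",
]

# The result has the same layout as joint.fasta.
joint = interleave(first, second)
print("\n".join(joint))
```

With real files, the same functions can be applied to open file handles in place of the lists, since both accept any iterable of lines.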