Generic expression data table format

The CLC Genomics Workbench will import a tab-, semicolon- or comma-separated .txt or .csv file as expression array samples if the following requirements are met:

  1. The first non-empty line of the file contains text. All entries except the first are used as sample names.
  2. Each following non-empty line contains the same number of entries as the first non-empty line. The first entry must be a string, which is used as the feature ID. The remaining entries must be numbers, which are used as expression values (one per sample). Empty entries are not allowed, but NaN values are allowed.
  3. The file contains at least two samples.
An example of this format is shown below:
FeatureID;sample1;sample2;sample3
gene1;200;300;23
gene2;210;30;238
gene3;230;50;23
gene4;50;100;235
gene5;200;300;23
gene6;210;30;238
gene7;230;50;23
gene8;50;100;235
This will be imported as three samples with eight genes in each sample.
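
The requirements above can be checked before import with a short script. The following Python sketch is not part of the CLC Genomics Workbench; the function name check_expression_table is hypothetical, and the delimiter guess (whichever of tab, semicolon or comma occurs most often in the first line) is an assumption, since the Workbench's own detection and number parsing may differ.

```python
import csv


def check_expression_table(text):
    """Return (sample_names, feature_ids, problems) for a delimited table.

    Hypothetical helper: it mirrors the three documented requirements,
    not the Workbench's actual import logic.
    """
    # Only non-empty lines count, as in the format description.
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return [], [], ["file contains no non-empty lines"]

    # Assumption: the delimiter is the candidate seen most often in the header.
    delimiter = max("\t;,", key=lines[0].count)
    rows = list(csv.reader(lines, delimiter=delimiter))

    header = rows[0]
    sample_names = header[1:]  # every header entry except the first
    problems = []
    if len(sample_names) < 2:
        problems.append("fewer than two samples")

    feature_ids = []
    for number, row in enumerate(rows[1:], start=2):
        if len(row) != len(header):
            problems.append(
                f"non-empty line {number}: expected {len(header)} entries, "
                f"found {len(row)}"
            )
            continue
        if any(entry.strip() == "" for entry in row):
            problems.append(f"non-empty line {number}: empty entry")
            continue
        try:
            # float() accepts "NaN", which the format allows.
            [float(value) for value in row[1:]]
        except ValueError:
            problems.append(f"non-empty line {number}: non-numeric value")
            continue
        feature_ids.append(row[0])

    return sample_names, feature_ids, problems


example = """FeatureID;sample1;sample2;sample3
gene1;200;300;23
gene2;210;30;238
gene3;230;NaN;23
"""
samples, features, problems = check_expression_table(example)
print(samples, features, problems)
```

An empty problems list means the text satisfies the three requirements as this sketch interprets them. Note that Python's float() also accepts values such as "inf", which the format description does not mention.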

Download this example as a file here:
http://www.clcbio.com/madata/CustomExpressionData.txt