Import Expression Data

Import Expression Data enables import of individual expression tracks from an expression data matrix. The data matrix needs to conform to the following formatting:

Image count_matrix_rpkm
Figure 3.1: RPKM count matrix using Ensembl gene names and representing 4 samples in a Tumor Normal design.

To launch the Import Expression Data tool, go to:

Toolbox | Ingenuity Pathway Analysis | Import Expression Data

Figure 3.2 shows the Import Expression Data dialog.

Image count_matrix_wizard
Figure 3.2: Parameters available in the Import Expression Data tool. Select the Table file containing the expression matrix and select the type of data matching the values in the file (in this case it contains count data). Add references to import against appropriate gene or transcript annotations. Select how to handle unmatched genes or transcripts.

In the Expression Data section of the dialog that opens, first select the data matrix by using the Browse button.

Select the expression value type that matches the values in the data matrix. All values, regardless of type, must be non-negative.
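The non-negativity requirement can be checked before launching the import. The following is a minimal sketch, not part of the tool itself. It assumes a tab-separated matrix with one header row of sample names and feature identifiers in the first column. This layout and the function name find_invalid_values are assumptions made for illustration.

```python
import csv
import io


def find_invalid_values(matrix_text, delimiter="\t"):
    """Return (feature_id, sample, raw_value) for every negative or non-numeric cell.

    Assumed layout (hypothetical): the first row holds sample names and the
    first column holds gene or transcript identifiers.
    """
    reader = csv.reader(io.StringIO(matrix_text), delimiter=delimiter)
    header = next(reader)
    samples = header[1:]
    problems = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        feature_id, raw_values = row[0], row[1:]
        for sample, raw in zip(samples, raw_values):
            try:
                value = float(raw)
            except ValueError:
                problems.append((feature_id, sample, raw))
                continue
            # value != value is true only for NaN
            if value < 0 or value != value:
                problems.append((feature_id, sample, raw))
    return problems
```

An empty result means that every cell parsed as a non-negative number. Any returned entries identify the cells to correct before importing.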

When selecting TPM or RPKM, the expected minimum count must be specified. This is the smallest count value that was present in the expression matrix when the TPM or RPKM values were calculated. For unfiltered data, this value is typically 1 (the default).
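One plausible reading of why the minimum count is needed: both TPM and RPKM are proportional to count divided by feature length, so a normalized value multiplied by the length is proportional to the original count, and a single known count fixes the scale. The sketch below illustrates that reasoning only. It is an assumption about the arithmetic, not a description of how the tool performs the translation, and the function name estimate_counts is hypothetical.

```python
def estimate_counts(values, lengths, min_count=1.0):
    """Estimate counts from TPM or RPKM values for one sample.

    values:  feature id -> TPM or RPKM value
    lengths: feature id -> feature length in bases
    Assumption: the smallest non-zero value * length corresponds to min_count.
    """
    # value * length is proportional to the original count for both TPM and RPKM
    weights = {f: values[f] * lengths[f] for f in values}
    positive = [w for w in weights.values() if w > 0]
    if not positive:
        return {f: 0.0 for f in values}
    scale = min_count / min(positive)
    return {f: w * scale for f, w in weights.items()}
```

Under this assumption, specifying a minimum count that is too low or too high would scale all estimated counts by the same factor.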

Under References, specify how the expression values were generated. This defines whether the matrix was generated at the gene or transcript level, and how the TPM or RPKM values were calculated.

The key is to specify the Gene and mRNA tracks that were used to generate the expression values. When selecting Genes with accompanying transcripts as the parameter, you can choose to calculate expression for genes without transcripts. For each such gene, a transcript is generated that is expected to have the length of the full gene. Enabling this option allows calculation of TPM and RPKM when counts have been supplied.
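The role of feature lengths in this step can be illustrated with the standard definitions of RPKM and TPM. The sketch below is not the tool's implementation. The function names and the length fallback are assumptions for illustration. In particular, lengths_with_fallback simplifies matters by taking a single transcript length per gene and using the full gene length where no transcript is annotated.

```python
def lengths_with_fallback(transcript_lengths, gene_lengths):
    """Use the transcript length where available, else the full gene length."""
    return {g: transcript_lengths.get(g, gene_lengths[g]) for g in gene_lengths}


def rpkm(counts, lengths):
    """Reads per kilobase of feature per million mapped reads."""
    total = sum(counts.values())
    if total == 0:
        return {f: 0.0 for f in counts}
    return {f: counts[f] * 1e9 / (lengths[f] * total) for f in counts}


def tpm(counts, lengths):
    """Transcripts per million: length-normalized rates rescaled to sum to 1e6."""
    rates = {f: counts[f] / lengths[f] for f in counts}
    total = sum(rates.values())
    if total == 0:
        return {f: 0.0 for f in counts}
    return {f: rate / total * 1e6 for f, rate in rates.items()}
```

Both measures divide by a length, so a gene with no annotated transcript has no defined TPM or RPKM unless some length is assigned to it.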

At the bottom of the dialog, specify how unmatched genes or transcripts should be handled. An unmatched gene or transcript is one that is either not found or ambiguous in the provided track. Unmatched genes or transcripts can be ignored or cause the import to fail. When importing raw counts, they can also be included. However, when importing TPM or RPKM, a match in the track is needed for translating the expression to counts.
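The three ways of handling unmatched features described above can be sketched as follows. This is an illustration of the described behavior, not the tool's code. The function name select_features and the policy labels "ignore", "fail" and "include" are hypothetical, and a feature is treated as ambiguous when its name occurs more than once in the track.

```python
from collections import Counter


def select_features(matrix_ids, track_names, policy="ignore", value_type="counts"):
    """Return (kept_ids, unmatched_ids) for the chosen handling policy."""
    if policy not in ("ignore", "fail", "include"):
        raise ValueError(f"unknown policy: {policy}")
    if policy == "include" and value_type != "counts":
        # TPM and RPKM need a match in the track to be translated to counts
        raise ValueError("unmatched features can only be included for raw counts")

    occurrences = Counter(track_names)
    # matched: found exactly once; unmatched: not found (0) or ambiguous (>1)
    matched = [f for f in matrix_ids if occurrences[f] == 1]
    unmatched = [f for f in matrix_ids if occurrences[f] != 1]

    if unmatched and policy == "fail":
        raise ValueError(f"unmatched features: {unmatched}")
    kept = list(matrix_ids) if policy == "include" else matched
    return kept, unmatched
```

For example, with a track containing the names A, B, B and C, a matrix listing A, B and D keeps only A under "ignore", because B is ambiguous and D is not found.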

The Import Expression Data tool outputs one expression track per sample.


