To start the clustering of features:
        Toolbox | Transcriptomics Analysis | Feature Clustering | Hierarchical Clustering of Features
Select at least two samples or an experiment.
Note! If your data contains many features, the clustering will take a very long time and could make your computer unresponsive. It is recommended to perform this analysis on a subset of the data, which also makes it easier to make sense of the clustering. Typically, you will want to filter away the features that are thought to represent only noise, e.g. those with mostly low values, or with little difference between the samples. See how to create a sub-experiment in Creating sub-experiment from selection.
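The kind of filtering described in the note can be sketched outside the Workbench as follows. This is only an illustration: the function name `filter_features`, the thresholds `min_mean` and `min_sd`, and the example values are hypothetical and are not part of the Workbench, where the subset is created as a sub-experiment.

```python
from statistics import mean, stdev


def filter_features(expression, min_mean=1.0, min_sd=0.5):
    """Keep features that are neither mostly low nor nearly constant.

    expression maps a feature name to its values across samples.
    min_mean and min_sd are hypothetical thresholds chosen for this example.
    """
    return {
        name: values
        for name, values in expression.items()
        if mean(values) >= min_mean and stdev(values) >= min_sd
    }


# Made-up example data: three features measured in four samples.
expression = {
    "geneA": [0.1, 0.0, 0.2, 0.1],  # mostly low values
    "geneB": [5.0, 5.1, 4.9, 5.0],  # little difference between the samples
    "geneC": [2.0, 8.0, 1.5, 9.0],  # varies clearly between the samples
}

kept = filter_features(expression)
print(sorted(kept))
```

Only the features that pass both thresholds would then be clustered, which reduces the number of pairwise distances that have to be computed.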
Clicking Next will display a dialog as shown in figure 28.93. The hierarchical clustering algorithm requires that you specify a distance measure and a cluster linkage. The distance measure is used to specify how distances between two features should be calculated. The cluster linkage specifies how you want the distance between two clusters, each consisting of a number of features, to be calculated.
     
    Figure 28.93: Parameters for hierarchical clustering of features. 
At the top, you can choose between three kinds of Distance measures:
Euclidean distance. If $u = (u_1, u_2, \ldots, u_n)$ and $v = (v_1, v_2, \ldots, v_n)$, then the Euclidean distance between $u$ and $v$ is
$$|u - v| = \sqrt{\sum_{i=1}^{n} (u_i - v_i)^2}.$$
Pearson correlation. The Pearson correlation coefficient between two elements $x = (x_1, x_2, \ldots, x_n)$ and $y = (y_1, y_2, \ldots, y_n)$ is defined as
$$r = \frac{1}{n-1} \sum_{i=1}^{n} \left(\frac{x_i - \bar{x}}{s_x}\right) \left(\frac{y_i - \bar{y}}{s_y}\right)$$
where $\bar{x}$ ($\bar{y}$) is the average of values in $x$ ($y$) and $s_x$ ($s_y$) is the sample standard deviation of these values.
It takes a value $\in [-1,1]$. Highly correlated elements have a high absolute value of the Pearson correlation, and elements whose values are un-informative about each other have Pearson correlation 0. Using $1 - |r|$ as distance measure means that elements that are highly correlated will have a short distance between them, and elements that have low correlation will be more distant from each other.
Manhattan distance. If $u = (u_1, u_2, \ldots, u_n)$ and $v = (v_1, v_2, \ldots, v_n)$, then the Manhattan distance between $u$ and $v$ is
$$|u - v| = \sum_{i=1}^{n} |u_i - v_i|.$$
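As a rough illustration of the three distance measures, the sketch below computes them for plain Python lists. It is not the Workbench's implementation. The function names are chosen for this example, and the Pearson-based distance is assumed to be one minus the absolute value of the correlation, which is what the description of short distances for highly correlated elements suggests.

```python
import math
from statistics import mean, stdev


def euclidean_distance(u, v):
    """Square root of the summed squared coordinate differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def manhattan_distance(u, v):
    """Sum of the absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(u, v))


def pearson_correlation(x, y):
    """Pearson correlation using sample standard deviations (n - 1)."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    s_x, s_y = stdev(x), stdev(y)
    total = sum(((a - x_bar) / s_x) * ((b - y_bar) / s_y) for a, b in zip(x, y))
    return total / (n - 1)


def pearson_distance(x, y):
    """Assumed form of the Pearson-based distance: 1 - |r|."""
    return 1.0 - abs(pearson_correlation(x, y))


print(euclidean_distance([0.0, 0.0], [3.0, 4.0]))  # 5.0
print(manhattan_distance([0.0, 0.0], [3.0, 4.0]))  # 7.0
```

Note that under the assumed Pearson-based distance, two features with opposite but perfectly correlated profiles are as close to each other as two features with identical profiles, whereas the Euclidean and Manhattan distances would place them far apart.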
Next, you can select different ways to calculate distances between clusters. The available cluster linkages are:
Average linkage. The distance between two clusters is computed as the average of the distances $d(x, y)$ over all pairs $(x, y)$, where $x$ is an object from the first cluster and $y$ is an object from the second cluster.
Complete linkage. The distance between two clusters is computed as the maximal object-to-object distance $d(x_i, y_j)$, where $x_i$ comes from the first cluster and $y_j$ comes from the second cluster. In other words, the distance between two clusters is computed as the distance between the two farthest objects in the two clusters.
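To show how a distance measure and a cluster linkage work together, here is a minimal sketch of a naive agglomerative procedure. It is an assumption-laden illustration, not the algorithm used by the Workbench. The function names, the 1-dimensional example features, and the choice of Euclidean distance are all made up for this example. The two linkages shown are the average over all cross-cluster pairs and the distance between the two farthest objects.

```python
import math


def euclidean_distance(u, v):
    """Distance measure between two features (Euclidean, for this example)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def average_linkage(cluster_a, cluster_b, dist):
    """Average distance over all pairs (x, y), x from cluster_a, y from cluster_b."""
    distances = [dist(x, y) for x in cluster_a for y in cluster_b]
    return sum(distances) / len(distances)


def complete_linkage(cluster_a, cluster_b, dist):
    """Distance between the two farthest objects of the two clusters."""
    return max(dist(x, y) for x in cluster_a for y in cluster_b)


def hierarchical_clustering(features, dist, linkage):
    """Repeatedly merge the two closest clusters; return the merges in order.

    Each merge is (indices of first cluster, indices of second cluster, distance).
    """
    clusters = [[i] for i in range(len(features))]  # every feature starts alone
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = linkage(
                    [features[i] for i in clusters[a]],
                    [features[i] for i in clusters[b]],
                    dist,
                )
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], d))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return merges


# Made-up features with a single value each, forming two obvious groups.
features = [(0.0,), (1.0,), (5.0,), (6.0,)]
complete_merges = hierarchical_clustering(features, euclidean_distance, complete_linkage)
average_merges = hierarchical_clustering(features, euclidean_distance, average_linkage)
for first, second, d in complete_merges:
    print(first, second, d)
```

In this example both linkages merge the same features in the same order, but the final merge happens at a larger distance with complete linkage than with average linkage. This is why the choice of linkage affects the heights in the resulting tree.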
At the bottom, you can select which values to cluster (see Selecting transformed and normalized values for analysis). Click Next if you wish to adjust how to handle the results. If not, click Finish.