Alpha Diversity
Alpha diversity is the diversity within a particular area or ecosystem, usually expressed as the number of species (i.e., species richness) in that ecosystem. Alpha diversity estimates are calculated from a series of rarefaction analyses and therefore depend on sampling depth.
The Alpha Diversity tool takes abundance tables as input. Abundance tables can be generated in the workbench by three tools: OTU clustering, Build Functional Profile and Taxonomic Profiling. The first two tools generate count-based abundance tables, and Alpha diversity measures calculated from such tables give an absolute number of species. With an abundance table generated by the Taxonomic Profiling tool, however, Alpha diversity results do not give an absolute number of species, but rather estimates that are useful for comparative studies, e.g., to assess the depth of sequencing or to compare different communities.
To run the tool, go to:
Microbial Genomics Module | Metagenomics | Abundance Analysis | Alpha Diversity
Choose an abundance table to use as input. The next wizard window lets you set up the analysis parameters (figure 7.1). For example, you can select which diversity measures to calculate (see Alpha diversity measures) and configure the rarefaction analysis. If you are working with OTU abundance tables, you can specify an appropriate phylogenetic tree for computing phylogenetic diversity. In that case, you must align the OTUs and construct a phylogeny before running the Alpha Diversity tool.
Figure 7.1: Set up parameters for the Alpha Diversity tool.
The rarefaction analyses are done differently depending on the type of abundance table used as input. For OTU and functional abundance tables, where abundances are counts, rarefaction is calculated by sub-sampling the abundances in the different samples at different depths. For whole metagenome taxonomic profiling abundance tables, where abundances are coverage estimates, sub-sampling is not possible, so diversity is estimated using a probabilistic model corresponding to our qualification criteria instead (see section 5.1).
The rarefaction analysis parameters will define the granularity of the alpha diversity curve.
- Minimum depth to sample: set to 1 by default.
- Maximum depth to sample: if this option is not checked, the maximum depth is set to the total number of reads (in the case of one sample) or to the total number of reads of the sample with the most reads.
- Number of points: the number of different depths to be sampled. For example, if you choose to sample 5 depths between 1000 and 5000, the algorithm will sub-sample each sample at 1000, 2000, 3000, 4000, and 5000 reads.
- Replicates at each depth (for count-based abundance tables only): how many times the algorithm sub-samples the data at each depth.
- Sample with replacement: whether the sampling should be performed with or without replacement.
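To make the effect of these parameters concrete, the following is a minimal Python sketch of a rarefaction analysis on a single count-based sample, using species richness as the diversity measure. It is an illustration only and not the workbench's implementation: the function names (`sampling_depths`, `subsample_richness`, `rarefaction_curve`) are hypothetical, averaging the replicates with a plain mean is an assumption, and so is capping the depth at the sample's read count when sampling without replacement.

```python
import random


def sampling_depths(min_depth, max_depth, n_points):
    """Return n_points evenly spaced depths from min_depth to max_depth, inclusive."""
    if n_points < 2:
        return [max_depth]
    step = (max_depth - min_depth) / (n_points - 1)
    return [round(min_depth + i * step) for i in range(n_points)]


def subsample_richness(counts, depth, with_replacement, rng):
    """Draw `depth` reads from one sample and count the distinct features observed."""
    # Expand the count vector into one entry per read, labeled by feature index.
    pool = [feature for feature, n in enumerate(counts) for _ in range(n)]
    if with_replacement:
        drawn = rng.choices(pool, k=depth)
    else:
        # Assumption: a depth larger than the sample is capped at the sample size.
        drawn = rng.sample(pool, k=min(depth, len(pool)))
    return len(set(drawn))


def rarefaction_curve(counts, depths, replicates=10, with_replacement=False, seed=0):
    """Return (depth, mean richness over replicates) pairs for one sample."""
    rng = random.Random(seed)
    curve = []
    for depth in depths:
        values = [
            subsample_richness(counts, depth, with_replacement, rng)
            for _ in range(replicates)
        ]
        # Assumption: replicates are summarized by their arithmetic mean.
        curve.append((depth, sum(values) / replicates))
    return curve


# Depth grid from the example above: 5 points between 1000 and 5000.
depths = sampling_depths(1000, 5000, 5)

# Hypothetical sample with 5 features and 100 reads in total.
counts = [50, 30, 15, 4, 1]
curve = rarefaction_curve(counts, sampling_depths(1, 100, 3), replicates=20)
```

Increasing the number of points gives a finer-grained curve, while increasing the number of replicates reduces the noise of each point at the cost of more sub-sampling.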
The tool generates a graph for each selected Alpha diversity measure (figure 7.2). Using the Lines and dots editor in the right-hand side panel, you can color samples according to groups defined by associated metadata.
Figure 7.2: An example of an Alpha Diversity graph based on phylogenetic diversity.
Note that the option "Show derived legend info" is enabled by default (figure 7.3). With this setting, when metadata categories happen to coincide for all items in a legend, the legend displays the dependencies between those categories. In this example, the "Location" category determines Dot Type, and the "Antibiotic" category determines Line Color. For this particular data set, all samples with a specific location have the same antibiotic resistance. The "Show derived legend info" option lets the legends show such implicit dependencies in the data. If this visualization is not wanted, the option can be disabled, and the legend will show only the metadata category values that were explicitly selected in the right-hand side panel.
Figure 7.3: Example of the difference between having the "Show derived legend info" option enabled and disabled. When enabled, the legend helps visualize that "Location" and "Antibiotic" are dependent for this particular data set.