To compute the number of reads in a sample that map to regions annotated with Pfam domains or GO terms, run the Build Functional Profile tool by going to:
Functional Analysis | Build Functional Profile
In the first wizard step (figure 14.5), select the read mapping for which you want to build the functional profile.
The parameters that can be set are shown in figure 14.6:
- Reference. A reference set of contigs annotated with Pfam domains and/or BLAST hits. If the read mapping contains an annotated genome, this parameter is optional.
- GO database. The GO database, used to map between Pfam domains and GO terms. The GO database can be downloaded using the Download GO Database tool (see section "Download GO database").
- GO subset. A subset of the GO database. Since many GO terms are too general or too specific, several meaningful subsets of GO terms are provided. See http://geneontology.org/docs/download-ontology/.
- Propagate GO mapping. When selected, Pfam annotations are mapped to the corresponding GO terms and to all their more general terms. For example, the Pfam domain "CutC" maps to the GO term "0005507 // copper ion binding". If Propagate GO mapping is enabled, the tool also maps to more general GO terms such as "0055070 // copper ion homeostasis", "0055076 // transition metal ion homeostasis", and "0065007 // biological regulation".
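The effect of Propagate GO mapping can be illustrated with a minimal Python sketch. It is not the tool's implementation: the parent links below are a toy hierarchy that simply mirrors the CutC example above rather than the actual GO graph, and the names `TOY_PARENTS`, `PFAM_TO_GO`, and `propagate` are hypothetical.

```python
from collections import deque

# Toy "child -> more general terms" links mirroring the example in the text.
# Assumption: illustrative only, not taken from the real GO database.
TOY_PARENTS = {
    "GO:0005507": ["GO:0055070"],  # copper ion binding -> copper ion homeostasis
    "GO:0055070": ["GO:0055076"],  # -> transition metal ion homeostasis
    "GO:0055076": ["GO:0065007"],  # -> biological regulation
}

# Hypothetical Pfam-to-GO mapping for a single domain.
PFAM_TO_GO = {"CutC": ["GO:0005507"]}


def propagate(terms, parents):
    """Return the given terms plus every more general term reachable from them."""
    seen = set(terms)
    queue = deque(terms)
    while queue:
        term = queue.popleft()
        for parent in parents.get(term, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


# Without propagation: only the directly mapped term.
direct = set(PFAM_TO_GO["CutC"])

# With propagation: the direct term and all of its more general terms.
propagated = propagate(PFAM_TO_GO["CutC"], TOY_PARENTS)
```

In this sketch, a read overlapping a CutC domain would contribute to one GO term without propagation and to four with it.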
You can then select which output elements should be generated (figure 14.7):
- Create Pfam functional profile. Abundance table obtained by counting reads overlapping Pfam domains.
- Create GO functional profile. Abundance table obtained by counting reads overlapping GO terms. Note that a GO database must be specified in order to build a GO functional profile, as preexisting GO annotations on Pfam domains are ignored by the tool.
- Create BLAST hit functional profile. Abundance table obtained by counting reads overlapping BLAST hits.
- Create DIAMOND hit functional profile. Abundance table obtained by counting reads overlapping DIAMOND hits.
- Create Report. A report with statistics about the input reference contigs and read mapping, as well as the number of matches to each feature.
The resulting functional abundance tables store the number of reads corresponding to each Pfam domain, GO term, best BLAST hit or best DIAMOND hit.
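The idea of an abundance table obtained by counting overlapping reads can be sketched in Python as follows. This is only an illustration under stated assumptions: the overlap criterion (at least one shared position, half-open coordinates), the rule that a read counts once per feature label, the tuple layouts, and the names `overlaps` and `build_profile` are all hypothetical and may differ from what the tool actually does.

```python
from collections import Counter


def overlaps(a_start, a_end, b_start, b_end):
    """True if half-open intervals [a_start, a_end) and [b_start, b_end) share a position."""
    return a_start < b_end and b_start < a_end


def build_profile(reads, annotations):
    """Count, for each feature label, the reads overlapping at least one such feature.

    reads:       iterable of (contig, start, end)
    annotations: iterable of (contig, start, end, label)
    """
    counts = Counter()
    for contig, r_start, r_end in reads:
        # Collect labels in a set so a read counts once per label (an assumption).
        hit_labels = {
            label
            for a_contig, a_start, a_end, label in annotations
            if a_contig == contig and overlaps(r_start, r_end, a_start, a_end)
        }
        counts.update(hit_labels)
    return counts


# Made-up example data; "OtherDomain" is a placeholder label.
annotations = [
    ("contig1", 100, 200, "CutC"),
    ("contig1", 150, 400, "OtherDomain"),
    ("contig2", 0, 50, "CutC"),
]
reads = [
    ("contig1", 90, 140),   # overlaps CutC only
    ("contig1", 180, 230),  # overlaps CutC and OtherDomain
    ("contig1", 500, 550),  # overlaps nothing
    ("contig2", 10, 60),    # overlaps CutC on contig2
]

profile = build_profile(reads, annotations)
```

Here `profile` plays the role of a one-sample abundance table, mapping each feature label to its read count.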