Annotate tag experiment

Combining the tag counts (Image count_tags) from the experimental data (see Extract and count tags) with the virtual tag list (Image create_tag_library) (see Create virtual tag list) makes it possible to put gene or transcript names on the tag counts. The Workbench simply compares the tags in the experimental data with the virtual tags and transfers the annotations from the virtual tag list to the experimental data.
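The comparison described above can be pictured as a lookup of each counted tag in the virtual tag list. The following is only a minimal sketch under stated assumptions: the dictionaries, the gene names and the function `annotate_counts` are hypothetical illustrations, not the Workbench's internal data format or API, and only exact matches are considered.

```python
# Hypothetical data: virtual tag list mapping tag sequence -> list of origins.
virtual_tags = {
    "CGTATCAATCGATTAC": [{"origin": "internal", "gene": "GENE_A"}],
    "TTGACCAGTAGGCATA": [{"origin": "3' external", "gene": "GENE_B"}],
}

# Hypothetical data: tag counts from one experimental sample.
tag_counts = {
    "CGTATCAATCGATTAC": 12,
    "TTGACCAGTAGGCATA": 3,
    "AAAAAAAAAAAAAAAA": 1,  # not present in the virtual tag list
}


def annotate_counts(tag_counts, virtual_tags):
    """Attach the annotations of the identical virtual tag to each counted tag."""
    annotated = {}
    for tag, count in tag_counts.items():
        annotated[tag] = {
            "count": count,
            # Tags without a counterpart in the virtual tag list stay unannotated.
            "annotations": virtual_tags.get(tag, []),
        }
    return annotated


annotated = annotate_counts(tag_counts, virtual_tags)
for tag, row in annotated.items():
    genes = [a["gene"] for a in row["annotations"]] or ["(no annotation)"]
    print(tag, row["count"], ", ".join(genes))
```

Tags that are absent from the virtual tag list keep their counts but receive no annotation in this sketch.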

This is done on an experiment level (experiments are collections of samples with defined groupings, see Experimental design):

        Toolbox | Transcriptomics Analysis (Image expressionfolder) | Expression Profiling by Tags (Image tag_folder) | Annotate Tag Experiment (Image annotate_tag_sample)

You can also access this functionality at the bottom of the Experiment table (Image experiment) as shown in figure 28.43.

Image annotate_tag_sample_button
Figure 28.43: You can annotate an experiment directly from the experiment table.

This will open a dialog where you select a virtual tag list (Image create_tag_library) and an experiment (Image experiment) of tag-based samples. Click Next when the elements are listed in the right-hand side of the dialog.

The next dialog lets you choose how to annotate your experiment (see figure 28.44).

Image annotate_tag_sample_step2
Figure 28.44: Defining the annotation method.

If a tag in the virtual tag list has more than one origin (as shown in the example in figure 28.42), you can decide how you want your experimental data to be annotated. There are two options:

Annotate all
This will transfer all annotations from the virtual tag. The type of origin is still preserved so that you can see if it is a 3' external, 5' external or internal tag.
Only annotate highest priority
This will look for the highest priority annotation and only add this to the experiment. This means that if you have a virtual tag with a 3' external and an internal tag, only the 3' external tag will be annotated (using the default prioritization). You can define the prioritization yourself in the table below: simply select a type and press the up (Image arrow_up) and down (Image arrow_down) arrows to move it up and down in the list. Note that the priority table is only active when you have selected Only annotate highest priority.
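The difference between the two options can be illustrated with a short sketch. It rests on assumptions: the function `select_annotations` and the record layout are hypothetical, and the position of 5' external in the priority list is a guess (the text above only states that 3' external ranks above internal by default).

```python
# Assumed default order; only "3' external before internal" is stated in the text.
DEFAULT_PRIORITY = ["3' external", "5' external", "internal"]


def select_annotations(origins, mode="all", priority=DEFAULT_PRIORITY):
    """Return the origins to transfer for one virtual tag.

    mode="all"     -> keep every origin (Annotate all)
    mode="highest" -> keep only the origin whose type ranks first in `priority`
    """
    if mode == "all":
        return list(origins)
    if mode == "highest":
        rank = {origin_type: i for i, origin_type in enumerate(priority)}
        ranked = [o for o in origins if o["type"] in rank]
        if not ranked:
            return []
        return [min(ranked, key=lambda o: rank[o["type"]])]
    raise ValueError(f"unknown mode: {mode!r}")


# A virtual tag with two origins (hypothetical gene names).
origins = [
    {"type": "internal", "gene": "GENE_A"},
    {"type": "3' external", "gene": "GENE_B"},
]

print(select_annotations(origins, mode="all"))
print(select_annotations(origins, mode="highest"))
# Moving "internal" to the top of the list changes which origin is kept.
print(select_annotations(origins, mode="highest",
                         priority=["internal", "3' external", "5' external"]))
```

Reordering the priority list in the sketch corresponds to moving a type up or down with the arrows in the dialog.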

Click Next to choose how you want the tags to be aligned (see figure 28.45).

Image annotate_tag_sample_step3
Figure 28.45: Settings for aligning the tags.

When the tags from the virtual tag list are compared to your experiment, the tags are matched using one of the following options:

Require perfect match
The tags need to be identical to be matched.
Allow single substitutions
If there is up to one mismatch in the alignment, the tags will still be matched. If there is a perfect match, single substitutions will not be considered.
Allow single substitutions or indels
Similar to the previous option, but now single-base insertions and deletions are also allowed. Perfect matches are preferred to single-base substitutions, which are preferred to insertions, which are again preferred to deletions (see footnote 28.7).
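The three matching options and their preference order can be sketched as follows. This is an illustration under assumptions, not the Workbench's alignment code: the function names and mode labels are hypothetical, an "insertion" is taken to mean one extra base in the experimental tag and a "deletion" one missing base, and indels are modelled simply as a length difference of one.

```python
# Preference order taken from the text: perfect > substitution > insertion > deletion.
PREFERENCE = ["perfect", "substitution", "insertion", "deletion"]

# How many entries of PREFERENCE each (hypothetically named) mode allows.
ALLOWED = {"perfect": 1, "substitutions": 2, "indels": 4}


def classify_match(tag, virtual):
    """Classify how an experimental tag relates to a virtual tag, or return None."""
    if tag == virtual:
        return "perfect"
    if len(tag) == len(virtual):
        mismatches = sum(a != b for a, b in zip(tag, virtual))
        return "substitution" if mismatches == 1 else None
    if abs(len(tag) - len(virtual)) != 1:
        return None
    tag_is_longer = len(tag) > len(virtual)
    longer, shorter = (tag, virtual) if tag_is_longer else (virtual, tag)
    # Removing exactly one base from the longer sequence must give the shorter one.
    for i in range(len(longer)):
        if longer[:i] + longer[i + 1:] == shorter:
            return "insertion" if tag_is_longer else "deletion"
    return None


def best_match(tag, virtual_tags, mode):
    """Return (virtual_tag, kind) for the most preferred allowed match, else None."""
    allowed = PREFERENCE[:ALLOWED[mode]]
    best = None
    for virtual in virtual_tags:
        kind = classify_match(tag, virtual)
        if kind in allowed:
            rank = allowed.index(kind)
            if best is None or rank < best[0]:
                best = (rank, virtual, kind)
    return (best[1], best[2]) if best else None


library = ["CCTATCAATCGATTAC"]
print(best_match("CGTATCAATCGATTAC", library, "perfect"))
print(best_match("CGTATCAATCGATTAC", library, "substitutions"))
```

Because candidates are ranked by the preference list, a perfect match always wins over a single substitution when both are present, as described for the second option.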
If you select either of the two options that allow mismatches (substitutions only, or substitutions and indels), you can also choose to Prefer high priority mutant. This option is only available if you chose Only annotate highest priority in the previous step (see figure 28.44). The option is best explained through an example:
        Tag from experiment: CGTATCAATCGATTAC
        ...
        ...from virtual tag list (3' external): CCTATCAATCGATTAC
In this case, you have a tag that matches perfectly to an internal tag from the virtual tag list. Imagine that in this example, you have prioritized the annotation so that 3' external tags are of higher priority than internal tags. The question is then whether you want to accept the perfect match (of a low-priority virtual tag) or the high-priority virtual tag with one mismatch. If you check Prefer high priority mutant, the 3' external tag in the example above will be used rather than the perfect match.
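One way to think about this choice is as a change in sort order: without the option, match quality is compared first; with it, origin priority is compared first. The sketch below is an interpretation of the example, not the Workbench's implementation; the function `choose_annotation` and the candidate tuples are hypothetical.

```python
MATCH_ORDER = ["perfect", "substitution", "insertion", "deletion"]


def choose_annotation(candidates, priority, prefer_high_priority_mutant=False):
    """Pick one candidate from a list of (origin_type, match_kind) tuples.

    Without the option, the best match kind wins and priority breaks ties.
    With the option, the highest-priority origin wins and match kind breaks ties.
    """
    def match_rank(candidate):
        return MATCH_ORDER.index(candidate[1])

    def priority_rank(candidate):
        return priority.index(candidate[0])

    if prefer_high_priority_mutant:
        key = lambda c: (priority_rank(c), match_rank(c))
    else:
        key = lambda c: (match_rank(c), priority_rank(c))
    return min(candidates, key=key)


# The situation from the example: a perfect internal match and a
# 3' external match with one substitution.
candidates = [("internal", "perfect"), ("3' external", "substitution")]
priority = ["3' external", "internal"]

print(choose_annotation(candidates, priority))
print(choose_annotation(candidates, priority, prefer_high_priority_mutant=True))
```

In this sketch the first call keeps the perfect internal match, while the second call selects the 3' external tag despite its mismatch.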

Click Next if you wish to adjust how to handle the results. If not, click Finish. This will add extra annotation columns to the experiment. The extra columns correspond to the columns found in your virtual tag list. If you have chosen to annotate only the highest priority, there will only be information from one origin column for each tag, as shown in figure 28.46.

Image counttable-annotated-priority
Figure 28.46: An experiment annotated with prioritized tags.



Footnotes

28.7: Note that if you use color space data, only color errors are allowed when choosing anything but perfect match.