Update Sequence Attributes

Update Sequence Attributes adds or updates information in attribute columns of input sequences and outputs new sequence elements containing the modified attribute information.

Note that attributes relating to characteristics of the sequences, such as length or the start of the sequences, cannot be updated using this tool.

To launch the Update Sequence Attributes tool, go to:

        Tools | Utility Tools | Sequences | Update Sequence Attributes

The tool takes as input one sequence, sequence list, alignment, or phylogenetic tree, and is recommended when updating information for several attributes or sequences. Individual attributes can also be updated directly in the Table view of these elements.

The following options can be configured in the Options dialog (figure 27.33):

Figure 27.33: Information in the "Attributes.tsv" file will be matched with the relevant sequences based on content of the Name column in the file and in the input sequences. Six columns containing relevant attribute information have been selected.
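The matching described above happens inside the tool, but the idea can be illustrated with a minimal Python sketch. It assumes sequences are represented as plain dictionaries and that the attribute file is tab-separated with a Name column. The function name `update_attributes`, the `PROTECTED` set, and the record layout are hypothetical and are not part of the software.

```python
import csv
import io

# Hypothetical set of attributes derived from the sequence itself.
# The tool does not allow such attributes (e.g. length) to be updated.
PROTECTED = {"Length", "Start of sequence"}


def update_attributes(sequences, tsv_text, match_column="Name", columns=None):
    """Return new sequence records with attributes merged in from TSV text.

    sequences: list of dicts, each holding the attributes of one sequence.
    tsv_text: contents of a tab-separated attribute file with a header row.
    columns: optional collection of column names to use; None means all.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    # Index the attribute rows by the value in the matching column.
    rows = {row[match_column]: row for row in reader}

    updated = []
    for seq in sequences:
        new_seq = dict(seq)  # copy, so the input records stay unchanged
        row = rows.get(seq.get(match_column))
        if row is not None:
            for col, value in row.items():
                if col == match_column or col in PROTECTED:
                    continue  # never overwrite the key or derived attributes
                if columns is not None and col not in columns:
                    continue  # column was not selected for updating
                new_seq[col] = value
        updated.append(new_seq)
    return updated
```

In this sketch, sequences without a matching row are returned unchanged, and the input records are never modified, mirroring the way the tool outputs new elements rather than editing its inputs.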

The results of the choices made in the Options wizard step are reflected in the Preview step (figure 27.34). The upper pane lists the attribute types to be updated or added, as well as the attribute used to match sequences with the relevant information. The "Content handling" column indicates how particular columns will be handled, including whether validation will be applied. The columns subject to validation checks are described later in this section.

Shown in the lower pane is a small subset of the incoming information from the attribute file, based on the choices made in the Options wizard step. Click on the "Previous" button to go back to that step if anything needs to be adjusted.

Figure 27.34: The Preview wizard step shows information about how columns from the attribute file will be handled, and whether any problems were detected. For columns where validation checks are carried out, a yellow exclamation mark is shown in the bottom pane if any check fails. Here, all entries pass. The "Other" column is not subject to validation checks. Only one sequence in the list is being updated in this example.

Column headings and value validation

Certain column names are recognized by the software and validation rules are applied to these. When the contents pass the validation checks, entries in those columns may be further processed.

In most cases, this further processing involves adding hyperlinks to online data resources. However, the contents of columns named Gene ID trigger different handling:

Other columns where contents are validated are those with the headings listed below. If a value in such a column cannot be validated, it is neither added nor used to update attributes.

If you wish to add information of this type but do not want this level of validation applied, use a heading other than the ones listed below.
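The effect of heading-based validation can be sketched in Python as follows. This is an illustration only: the heading "Example ID" and its digits-only rule are invented for the example and do not correspond to the headings or rules the software actually applies.

```python
# Hypothetical validators keyed by column heading. The real recognized
# headings and their validation rules are defined by the software.
VALIDATORS = {
    "Example ID": str.isdigit,  # invented rule: value must be all digits
}


def filter_validated(row):
    """Return only the values from row that would be added or updated.

    Values under a recognized heading are kept only if they pass that
    heading's check. Values under any other heading are kept as-is.
    """
    accepted = {}
    for heading, value in row.items():
        check = VALIDATORS.get(heading)
        if check is None or check(value):
            accepted[heading] = value
    return accepted
```

Under these assumptions, a value of "abc" in a column headed "Example ID" would be dropped, while the same value in a column with any unrecognized heading would be kept without checks.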