Extracting a subset of a database

After download, you can select a subset of the bacterial genomes and save the reduced list in a separate file. Working with a smaller list can significantly reduce the runtime of subsequent analyses.

For example, from a collection of bacterial genomes that includes multiple representatives of each genus, you can extract a genus-specific subset of sequences to a new list:

  1. Open the downloaded bacterial genomes database.
  2. Switch to tabular element mode (Image table).
  3. Filter for the desired genus (figure 18.8).

    Image subset_ncbi
    Figure 18.8: The downloaded NCBI bacterial genomes database was filtered for Salmonella data. A subset of 44 out of 2,253 sequences matched this search criterion.

  4. Select all remaining rows.
  5. Click the Create New Sequence List button.
  6. Save the subset reference list.
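
The steps above are performed in the graphical interface. For readers who prefer to script the same idea outside the application, the following is a minimal Python sketch, assuming the sequences are available as a FASTA file and that the genus name appears in each header line. The function names (parse_fasta, filter_by_genus, write_fasta), the sample headers, and the case-insensitive substring match are illustrative assumptions, not part of the software described here, and its table filter may match rows differently.

```python
from typing import Iterable, Iterator, List, Tuple

# A record is (header without the leading '>', sequence).
Record = Tuple[str, str]


def parse_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_by_genus(records: Iterable[Record], genus: str) -> List[Record]:
    """Keep records whose header contains the genus name (case-insensitive).

    Assumption: a plain substring match stands in for the table filter.
    """
    needle = genus.lower()
    return [(h, s) for h, s in records if needle in h.lower()]


def write_fasta(records: Iterable[Record], path: str) -> None:
    """Write the records to a new FASTA file, one sequence per line."""
    with open(path, "w") as handle:
        for header, seq in records:
            handle.write(f">{header}\n{seq}\n")


# Hypothetical in-memory sample; the identifiers and sequences are made up.
sample = [
    ">seq1 Salmonella enterica",
    "ACGT",
    "ACGT",
    ">seq2 Escherichia coli",
    "TTGACA",
    ">seq3 Salmonella bongori",
    "GGCC",
]

records = list(parse_fasta(sample))
subset = filter_by_genus(records, "Salmonella")
```

To apply the sketch to a real file, pass an open file handle to parse_fasta instead of the sample list, then call write_fasta(subset, ...) with an output path of your choice.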