Simplifying filter cascades
Prior to version 26.0 of Biomedical Genomics Analysis, DNA workflows performing variant detection typically relied on long filter cascades to remove likely false positive variants. These cascades consisted of multiple filter units, each containing:
- One or more instances of Filter on Custom Criteria to identify likely false positive variants, followed by
- Filter against Known Variants to remove these variants.
After the filter units, the filter cascade typically also contained:
- A Remove Marginal Variants element, which was retired in version 26.0 and replaced by Filter on Custom Criteria.
- A Filter Homozygous Reference Variants element.
From version 26.0 of CLC Genomics Workbench, Filter on Custom Criteria supports complex filtering strategies, allowing workflows to be streamlined so that filtering can be performed in a single step.
This section uses a copy of the Identify QIAseq DNA Somatic Variants template workflow provided by version 25.0 of Biomedical Genomics Analysis to illustrate how such workflows can be updated.
Before making any changes, we recommend saving a copy of the workflow so that you can verify that the updated workflow produces the same results as the original.
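One way to carry out this check outside the Workbench is to compare the variants reported by the two workflows. The following is a minimal sketch, assuming the final variant tracks from both workflows have been exported as VCF files; the function names and the in-memory example data are hypothetical and only illustrate the idea. It compares which variants are present, not their annotations.

```python
def variant_keys(vcf_lines):
    """Return the set of (chrom, pos, ref, alt) keys found in VCF text lines."""
    keys = set()
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and VCF header lines
        fields = line.split("\t")
        chrom, pos, ref, alts = fields[0], int(fields[1]), fields[3], fields[4]
        # A VCF record may list several alternative alleles separated by commas.
        for alt in alts.split(","):
            keys.add((chrom, pos, ref, alt))
    return keys


def compare_variant_sets(original_lines, updated_lines):
    """Return the variants found only in the original and only in the updated output."""
    original = variant_keys(original_lines)
    updated = variant_keys(updated_lines)
    return sorted(original - updated), sorted(updated - original)


# Hypothetical in-memory data standing in for the two exported files.
header = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
]
original_vcf = header + [
    "chr1\t100\t.\tA\tG\t50\t.\t.\n",
    "chr2\t200\t.\tC\tT,G\t60\t.\t.\n",
]
updated_vcf = header + [
    "chr1\t100\t.\tA\tG\t50\t.\t.\n",
    "chr2\t200\t.\tC\tT\t60\t.\t.\n",
]

only_original, only_updated = compare_variant_sets(original_vcf, updated_vcf)
print("Only in original:", only_original)
print("Only in updated:", only_updated)
```

If both lists are empty, the two workflows reported the same variants for that sample. With real exports, the lists of lines would be read from the two files, for example with `open(path).readlines()`.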
Identify filter units
Locate the start of the filter cascade and each of the filter units within it (figure 12.6). It can be useful to create a workflow group for each unit.
Figure 12.6: The filter cascade from the "Identify QIAseq DNA Somatic Variants (Illumina)" template workflow distributed with version 25.0. Each filter unit is placed in a collapsed workflow group. The units are followed by a "Filter on Custom Criteria" element and a "Filter Homozygous Reference Variants" element.
Add to the workflow a Filter on Custom Criteria element, referred to here as the consolidated filter step. This element will ultimately replace all filter units and will contain one filter group for each unit.
Incorporate filter units
For each filter unit, start with the Filter on Custom Criteria elements in the unit that use Match all, or that use Match any with only one criterion:
- For each element:
  - Select all criteria, right-click and choose Copy, or use the keyboard shortcut Ctrl + C (Cmd + C on Mac).
  - In the consolidated filter step, select the bottom empty filter criterion, right-click and choose Paste, or use the keyboard shortcut Ctrl + V (Cmd + V on Mac).
  - See Saving and reusing filter sets for details about copying and pasting criteria. Note that the options located under the More... button are not suitable for this task.
- Select all newly added criteria in the consolidated filter step corresponding to this unit, right-click, and choose Group Selected Criteria. Ensure Match all is selected in the resulting group. We recommend adding a description annotation using the right-click menu (figure 12.7). See Grouping criteria for more details.
Figure 12.7: The consolidated filter step at the bottom contains one expanded group that is equivalent to the "Minimum QUAL" filter unit at the top, using only "Match all". The annotation description is based on the name of the "Filter against Known Variants" element from the same unit.
If the unit also contains Filter on Custom Criteria elements using Match any with more than one criterion:
- Apply the steps described above to elements using Match any. An annotation description may be omitted.
- Select Match any for this group.
- Drag and drop this group into the previously created group so that it becomes a subgroup (figure 12.8).
Figure 12.8: The consolidated filter step at the bottom contains two collapsed groups corresponding to earlier units, and one expanded group that is equivalent to the "Low count" filter unit at the top, using both "Match all" and "Match any" with more than one criterion. The annotation description is based on the name of the "Filter against Known Variants" element from the same unit.
When all Filter on Custom Criteria elements in the unit have been incorporated, we recommend collapsing the group.
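The logic of a group with a subgroup can be sketched as a small tree of criteria. This is only an illustration of how Match all and Match any combine, not how the tool is implemented; the `Group` class, the field names (`count`, `frequency`, `qual`) and the thresholds are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Union

Variant = Dict[str, float]
Criterion = Callable[[Variant], bool]


@dataclass
class Group:
    """A filter group: match='all' mirrors Match all, match='any' mirrors Match any."""
    match: str
    members: List[Union["Group", Criterion]] = field(default_factory=list)
    description: str = ""

    def matches(self, variant: Variant) -> bool:
        # Evaluate each member; a member is either a subgroup or a single criterion.
        results = (
            m.matches(variant) if isinstance(m, Group) else m(variant)
            for m in self.members
        )
        return all(results) if self.match == "all" else any(results)


# Hypothetical unit: one Match all criterion plus a Match any subgroup.
low_count = Group(
    match="all",
    description="Low count",
    members=[
        lambda v: v["count"] < 10,
        Group(match="any", members=[
            lambda v: v["frequency"] < 5.0,
            lambda v: v["qual"] < 30.0,
        ]),
    ],
)

print(low_count.matches({"count": 4, "frequency": 12.0, "qual": 20.0}))   # True
print(low_count.matches({"count": 4, "frequency": 12.0, "qual": 45.0}))   # False
print(low_count.matches({"count": 25, "frequency": 1.0, "qual": 10.0}))   # False
```

A variant matches the outer group only if it satisfies every Match all criterion and at least one criterion in the Match any subgroup, which is what the original unit expressed with several chained elements.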
Finalize the consolidated filter step
When all filter units have been incorporated:
- In the consolidated filter step (figure 12.9):
  - Incorporate the last Filter on Custom Criteria element, which is not part of any unit, using the steps described above. Ensure that Match all or Match any is selected to match the original element.
  - Select Match any in the top-level group.
  - Delete the final empty criterion.
  - Select Remove.
  - Consider selecting Annotate instead of Filter. See Filter on Custom Criteria for details on how this impacts results. If using Filter, also select it in Filter Homozygous Reference Variants.
- Drag and drop the connection incoming to the first Filter on Custom Criteria element in the first unit to the input channel of the consolidated filter step.
- Drag and drop the connection outgoing from the last element before Filter Homozygous Reference Variants to the output channel of the consolidated filter step.
- Delete the now disconnected elements of the original filter cascade, keeping Filter Homozygous Reference Variants. Optionally, adjust the workflow layout.
Figure 12.9: The consolidated filter step has the entire filter cascade from figure 12.6 incorporated. The last group is expanded and corresponds to the last "Filter on Custom Criteria" element, which was not part of any unit and used "Match any".
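The reason a single step with Match any at the top level and Remove selected can stand in for the whole cascade can be sketched as follows. This is a simplified model, assuming each unit decides on a variant using only that variant's own values; the units, field names and thresholds are hypothetical.

```python
def run_cascade(variants, units):
    """Apply each unit in turn, removing the variants it flags."""
    remaining = list(variants)
    for unit in units:
        remaining = [v for v in remaining if not unit(v)]
    return remaining


def run_consolidated(variants, units):
    """Remove every variant flagged by at least one unit (top-level Match any)."""
    return [v for v in variants if not any(unit(v) for unit in units)]


# Hypothetical units: each returns True for a likely false positive.
units = [
    lambda v: v["qual"] < 30,                           # one Match all criterion
    lambda v: v["count"] < 10 and v["frequency"] < 5,   # Match all with two criteria
]

variants = [
    {"id": "v1", "qual": 50, "count": 40, "frequency": 35.0},
    {"id": "v2", "qual": 12, "count": 40, "frequency": 35.0},
    {"id": "v3", "qual": 45, "count": 3, "frequency": 2.0},
    {"id": "v4", "qual": 45, "count": 3, "frequency": 20.0},
]

cascade_ids = [v["id"] for v in run_cascade(variants, units)]
consolidated_ids = [v["id"] for v in run_consolidated(variants, units)]
print(cascade_ids)       # ['v1', 'v4']
print(consolidated_ids)  # ['v1', 'v4']
```

A variant survives the cascade only if no unit flags it, which is the same condition as not matching any group in the consolidated step. Comparing the outputs of the saved original workflow and the updated workflow remains the practical way to confirm this for a real workflow.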
