Reference Data Sets and defining Custom Sets

Reference Data Sets are particularly useful with workflows that use multiple reference data elements. Reference Data Sets are connected to workflow inputs through "workflow roles". A role is assigned to each element in a Reference Data Set. Workflow inputs requiring reference data can then be configured with a workflow role instead of, or as well as, a specific data element. Workflow roles are described, in the context of configuring Input elements, in the section Configuring input and output elements. That information also applies to configuring input channels requiring reference data.

When launching workflows where workflow roles have been configured for reference data inputs, the reference data to use can be specified by selecting a Reference Data Set, instead of selecting each data element individually as you step through the launch wizard (figure 11.10).

Image choose_custom_refset_on_wf_launch
Figure 11.10: Reference Data Sets containing all the workflow roles specified in a workflow are available for selection in the launch wizard.
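
The rule stated in the caption of figure 11.10 can be pictured as a coverage check: a set is offered only if it defines every role the workflow asks for. The Python sketch below is a conceptual model only, not a Workbench API. The function selectable_sets, the set names, the role names and the element names are all hypothetical, and it assumes a role counts as present whenever it appears in the set.

```python
# Conceptual model only: the Workbench does this in the launch wizard.
# Every name below (sets, roles, elements, function) is hypothetical.

def selectable_sets(workflow_roles, reference_sets):
    """Return the names of sets that define every role the workflow uses."""
    required = set(workflow_roles)
    return [
        name
        for name, role_to_element in reference_sets.items()
        if required <= set(role_to_element)  # all required roles are present
    ]


# Each Reference Data Set maps a workflow role to one data element.
reference_sets = {
    "set_a": {"sequence": "genome_seq", "genes": "gene_track", "cds": "cds_track"},
    "set_b": {"sequence": "genome_seq"},
}

# Roles configured on the reference data inputs of a workflow.
workflow_roles = ["sequence", "genes"]

print(selectable_sets(workflow_roles, reference_sets))  # ['set_a']
```

In this model, set_b is not offered because it lacks the genes role, even though it supplies the sequence role.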

Reference Data Sets containing some commonly used reference data are available for download under the QIAGEN Sets tab of the Reference Data Manager (see QIAGEN Sets). It is easy to create new sets, known as Custom Sets, that refer to the data of your choice.

Creating Custom Sets

To create a Custom Set, you can:

Base it on an existing Reference Data Set
To do this, select a Reference Data Set and click on the Create Custom Set... button above the listing of data elements, on the right. This opens the "Create Custom Data Set" dialog, populated with the roles defined in the selected Reference Data Set, and any specified data elements (figure 11.11). This new set can then be customized.
Build it from scratch
To start from scratch, click on the Custom Sets tab at the top of the Reference Data Manager and then click on the Create button on the right. This opens the "Create Custom Data Set" dialog without any roles or elements predefined (figure 11.12).
Base it on reference data used in a specific workflow
To do this, open the "Create Custom Data Set" dialog using one of the methods described above, and then click on the Add to Match Workflow... button. You can specify an installed workflow from a drop-down list, or select a workflow from the Navigation Area using the "Workflow design" field (figure 11.13). If buttons are disabled, it usually means the selected workflow does not contain inputs defined with workflow roles.

Image create_custom_set_from_qiagen_set
Figure 11.11: After selecting a QIAGEN Set, click on the Create Custom Set button on the right-hand side to open the Create Custom Data Set dialog populated with the roles and elements of that reference set.

Image create_custom_set_from_scratch
Figure 11.12: Under the Custom Sets tab, click on the Create button, on the right, to open the Create Custom Data Set dialog without any roles or elements predefined.

Image create_custom_set_from_wf_defined
Figure 11.13: Click on the Add to Match Workflow button in the Create Custom Data Set dialog to populate the dialog with the roles and elements defined in a workflow.

When basing a Custom Set on an existing Reference Data Set or on the references defined in a workflow, any predefined data elements will be listed in the Item(s) column of the relevant roles. Data elements can be selected or updated by double-clicking on the cells in that column.

You can define new roles in Custom Sets, or assign roles already in use in existing Reference Data Sets (figure 11.14). Note that workflow role names cannot contain spaces. If a workflow role is used in template workflows, there may be restrictions on the type of elements that can be selected. For example, when browsing for an element to associate with the 1000_genomes_project role, tracks of other types, like genes tracks or sequence tracks, cannot be selected.

Image create_custom_set_add_role
Figure 11.14: The Create Custom Data Set dialog showing a newly created role, and the drop-down menu of already existing roles.
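
The naming rule for roles can be illustrated with a small sketch. As before, this is a conceptual model, not a Workbench API: the function add_role and the role and element names are hypothetical. It checks only the one constraint stated above, that a role name cannot contain spaces.

```python
# Conceptual model only: roles are added through the dialog, not through code.
# add_role and the names used here are hypothetical.

def add_role(custom_set, role, element=None):
    """Add a role to a Custom Set, optionally with a data element."""
    if " " in role:
        raise ValueError(f"Role name cannot contain spaces: {role!r}")
    custom_set[role] = element  # element may be filled in later
    return custom_set


custom_set = {}
add_role(custom_set, "my_variants", "variant_track_1")
add_role(custom_set, "my_genes")  # role defined, element not yet selected

try:
    add_role(custom_set, "my variants")
except ValueError as err:
    print(err)
```

Using underscores in place of spaces, as in 1000_genomes_project, satisfies the rule.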

Once saved, the new Custom Set will be listed under the Custom Sets tab of the Reference Data Manager. These sets will also be available to select via the launch wizards of workflows that have relevant roles defined for reference inputs.

Searching for data available under the Custom Sets tab

Use the search field under the top toolbar to search for terms in Custom Sets. To search for an exact term, enclose it in quotes.

Hover the cursor over a hit to see what aspect of the result matched the search term (figure 11.15). Double-click on a search result to open it.

Image custom_sets_search_mouseover
Figure 11.15: Terms entered in the search field when the Custom Sets tab is selected are searched for in the sets available under that tab. Hovering the cursor over a hit opens a tooltip with information about the match.


