Custom Sets
References Management offers a Custom Sets tab where you can create and store your own sets of reference data (figure 8.8).
A Custom Set is a collection of chosen reference elements. It can be used as reference data when running workflows, provided that:
- the workflow inputs have been configured with workflow roles (see http://resources.qiagenbioinformatics.com/manuals/clcgenomicsworkbench/current/index.php?manual=Configuring_workflow_inputs.html), and
- the reference elements have been assigned matching workflow roles.
The Workflow Role of a Reference Data Element creates the link between the workflow input and the reference data element in the custom set.
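The role-based link described above can be pictured with a small Python sketch. This is only a conceptual model, not the Workbench's API: the `ReferenceElement` class, the `resolve_inputs` function, and the role and element names are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReferenceElement:
    """Hypothetical stand-in for a reference data element in a custom set."""
    name: str
    role: str


def resolve_inputs(workflow_input_roles, custom_set):
    """Pair each workflow input role with the element carrying the same role.

    Returns the matched elements keyed by role, plus the roles that no
    element in the custom set provides.
    """
    # If several elements share a role, the last one listed wins in this sketch.
    by_role = {element.role: element for element in custom_set}
    resolved = {r: by_role[r] for r in workflow_input_roles if r in by_role}
    missing = [r for r in workflow_input_roles if r not in by_role]
    return resolved, missing


# Illustrative data only: the role and element names are assumptions.
custom_set = [
    ReferenceElement("Example genome sequence", "sequence"),
    ReferenceElement("Example gene annotations", "genes"),
]
resolved, missing = resolve_inputs(["sequence", "genes", "cds"], custom_set)
```

In this sketch, a workflow input whose role has no counterpart in the custom set ends up in `missing`, which mirrors why both conditions listed above must hold before a custom set can supply reference data to a workflow.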
Figure 8.8: The Custom Sets wizard.
The principle behind creating a custom Reference Data Set is to first specify the roles that are needed, and then to specify the data element that fits each of those roles.
One way to create a custom reference data set is to click on Create. After naming the data set and providing a description for it, you can:
- Select the roles needed by the workflow, one after the other, using the drop down menu.
- Create new roles by typing a new role name in the field, as seen in figure 8.9. Custom roles are especially useful when working with complex workflows that include many different inputs.
Figure 8.9: The Create Custom Sets dialog showing a newly created role, and the drop down menu of already existing roles.
- Use a pre-existing workflow to list roles needed: click on Create and then Add to match workflow ... (figure 8.10). You can then select an installed workflow (from the drop down list) or a workflow from the Navigation Area (to be specified in the "Custom workflow" field) for which roles have been assigned during configuration. The workflow roles specified in the workflow will then be listed here. The first line "Workflow role" can be checked/unchecked as a way to select/deselect all items in the list. Once you have picked the items you wish to include in your custom data set, click OK to proceed.
Figure 8.10: Select the workflow for which you need a custom Reference Data Set.
All the needed reference roles are now listed in the main window (figure 8.11). For each role, either specify the relevant reference element by clicking on the Browse icon to the right of the field, or delete the role from the Reference Data Set by clicking on the cross icon.
Figure 8.11: Select the elements for each role included in your custom Reference Data Set.
There are two ways to find a particular element: from the Navigation Area tab or from the Reference Data tab (figure 8.12). In the Navigation Area tab, the reference elements are sorted by role, in organism-specific subfolders of the CLC_References folder (local or on a server). In the Reference Data tab, the elements are sorted by role in folders specific to each Data Set, including custom data sets.
Figure 8.12: Find the relevant reference element, either from the Navigation Area tab or from the Reference Data tab.
Note that it is only possible to select an element whose role is the one defined by the list: for example, when browsing for 1000_genomes_project files, it will not be possible to specify a genes track. However, it is possible to create new roles as needed: just type in the name of the custom role in the field and click Add element. There is no restriction on the type of file that can be selected for custom roles.
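The selection rule in the note above can be summarized in a few lines of Python. This is a hedged sketch of the rule as described, not the Workbench's implementation: the `can_select` function and the `PREDEFINED_ROLES` set are hypothetical, and the set lists only the two roles mentioned in the example.

```python
# Hypothetical list of roles that already exist in the drop down menu.
PREDEFINED_ROLES = {"1000_genomes_project", "genes"}


def can_select(element_role, target_role, predefined_roles=PREDEFINED_ROLES):
    """Return True if an element may be chosen for the given role.

    Existing roles only accept elements carrying the same role.
    Custom roles (any name not in the predefined list) accept any element.
    """
    if target_role not in predefined_roles:
        return True  # custom role: no restriction on the element type
    return element_role == target_role
```

For instance, under these assumptions `can_select("genes", "1000_genomes_project")` is rejected, while the same genes track is accepted for a custom role such as `"my_custom_role"`.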
Once you are ready to save your custom data set, give it a name, and indicate in the description the workflow it is suitable for before clicking on OK. The custom set is now listed on the left side of the wizard. Saving a custom data set on a server is the most efficient way to share it with all other server users. Server admins can lock reference data sets on the server location so that they cannot be deleted or modified, and can unlock them again.