Definition of sample completion
Before the connector can submit a sample to a workflow, it must know that the sample is complete, that is, that all files belonging to the sample have been fully written to the sequencer's output folder. The connector supports two methods for determining whether a sample is complete:
- Copy-complete detection
- Files-per-sample detection
Copy-complete detection
Copy-complete detection is the preferred sample completion method. It requires that the sequencer writes a file to the output folder of the sequencing run to signal that it has finished writing all sample files to disk. This copy-complete file must be located in a folder belonging to the same folder tree as the sample files (see example in figure 4.1).
The benefits of copy-complete detection are:
- The connector does not have to check each individual file to see whether it has been completely written to disk. That check requires the operating system to be able to determine whether a file is currently being written to, which is not supported in certain scenarios.
- It is not necessary to set a specific number of files per sample produced by the sequencer. The sequencer can thus be reconfigured to produce a different number of files per sample without also having to update the workflow's automation configuration.
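The idea behind copy-complete detection can be illustrated with a minimal Python sketch. It is not the connector's implementation: the function name `run_is_copy_complete` and the default marker file name `CopyComplete.txt` are assumptions made for the example, and the actual file name depends on the sequencer. The sketch only checks whether a marker file exists somewhere in the folder tree of the run.

```python
from pathlib import Path


def run_is_copy_complete(run_folder: Path, marker_name: str = "CopyComplete.txt") -> bool:
    """Return True if a file named marker_name exists anywhere under run_folder.

    Both the function and the default marker name are hypothetical; use the
    file name that your sequencer actually writes when it is done.
    """
    # rglob searches run_folder itself and all of its subfolders.
    return any(p.is_file() for p in run_folder.rglob(marker_name))
```

Note that no individual sample file is inspected here, and the number of files per sample does not appear anywhere, which mirrors the two benefits listed above.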
Files-per-sample detection
If the sequencer does not support copy-complete detection, the fallback method is to define how many files the sequencer produces per sample. Use this method with caution: if the sequencer is reconfigured to produce a different number of files per sample, the workflow's automation configuration has to be updated accordingly before a new sequencing run is started.
If files-per-sample detection is used, the operating system must be able to determine whether a file is currently being written to. See System requirements for information about platform-specific system requirements.
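The logic of files-per-sample detection can be sketched in Python as follows. This is an illustration under stated assumptions, not the connector's actual code: the helper names are hypothetical, files are assumed to be named `<sample>_<rest>` (for example `S1_R1.fastq.gz`), and the check for whether a file is still being written is passed in as a callable because it depends on the operating system.

```python
from collections import defaultdict
from pathlib import Path
from typing import Callable, Iterable, List


def sample_name(path: Path) -> str:
    # Assumed naming convention: everything before the first underscore
    # is the sample name, e.g. "S1_R1.fastq.gz" belongs to sample "S1".
    return path.name.split("_", 1)[0]


def complete_samples(
    files: Iterable[Path],
    files_per_sample: int,
    is_being_written: Callable[[Path], bool],
) -> List[str]:
    """Return the sorted names of samples considered complete.

    A sample counts as complete when it has exactly files_per_sample files
    and none of them is reported as still being written.
    """
    groups = defaultdict(list)
    for f in files:
        groups[sample_name(f)].append(f)
    return sorted(
        name
        for name, members in groups.items()
        if len(members) == files_per_sample
        and not any(is_being_written(m) for m in members)
    )
```

In this sketch, a sample whose file count does not equal the configured number is never reported as complete. This illustrates why the configured number must be updated before the next run if the sequencer is reconfigured.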