Remove weak edges
The de Bruijn graph is expected to contain artifacts caused by errors in the data. The number of reads that agree on an error is likely to be low, especially compared with the number of error-free reads covering the same region. When this relative difference is large enough, the low-coverage variant can be treated as an error.

In the remove weak edges phase, we consider each node and count the edges connected to it and the number of times a read passes through these edges. The average number of reads per edge is calculated, and the calculation is then repeated using only those edges whose read count is greater than or equal to this average. From the number of edges that meet this requirement and the number of reads passing through them, a second average is obtained. This second average is used to calculate a limit, and each edge connected to the node whose read count is less than or equal to the limit is removed in this phase.
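
The per-node procedure can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the actual implementation: the function name `weak_edges`, the dictionary representation of a node's edges, and the `limit_from_average` parameter are all hypothetical. In particular, the rule that turns the second average into a limit is not spelled out here, so the sketch takes it as a caller-supplied function, and the factor 0.1 in the usage line is purely illustrative.

```python
from typing import Callable, Dict, Hashable, List


def weak_edges(
    edge_reads: Dict[Hashable, int],
    limit_from_average: Callable[[float], float],
) -> List[Hashable]:
    """Return the edges of one node whose read count is at or below the limit.

    edge_reads maps each edge connected to the node to the number of reads
    passing through it. limit_from_average is an ASSUMED, caller-supplied
    rule that converts the second average into the removal limit.
    """
    if not edge_reads:
        return []

    # First average: reads per edge over all edges connected to the node.
    first_avg = sum(edge_reads.values()) / len(edge_reads)

    # Keep only the edges with at least as many reads as the first average.
    # This list is never empty, because the maximum is always >= the average.
    strong = [reads for reads in edge_reads.values() if reads >= first_avg]

    # Second average: reads per edge over the retained edges only.
    second_avg = sum(strong) / len(strong)

    # Edges at or below the limit are the candidates for removal.
    limit = limit_from_average(second_avg)
    return [edge for edge, reads in edge_reads.items() if reads <= limit]


# Illustrative call: two well-supported edges and one poorly supported edge.
print(weak_edges({"a": 40, "b": 38, "c": 2}, lambda avg: 0.1 * avg))
```

Computing the second average over only the better-supported edges keeps the erroneous, low-coverage edges from pulling the reference value down, which is presumably why the average is taken twice.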