paddleaudio.metric.mcd

paddleaudio.metric.mcd.mcd_distance(xs: numpy.ndarray, ys: numpy.ndarray, cost_fn: Callable = mcd.metrics_fast.logSpecDbDist) -> float

Mel cepstral distortion (MCD), computed as a dynamic time warping (DTW) distance.

Uses dynamic programming to fill a cumulative cost table wps, where:

wps[i, j] = cost_fn(xs[i], ys[j]) + min(
                wps[i-1, j  ],  // vertical   / insertion / expansion
                wps[i  , j-1],  // horizontal / deletion  / compression
                wps[i-1, j-1])  // diagonal   / match

dtw = sqrt(wps[-1, -1])
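
The recurrence above can be sketched in plain NumPy as follows. This is a minimal illustration and not the library's implementation: the names dtw_sketch and log_spec_db_dist are hypothetical, the padded table with infinite sentinels is an assumed boundary condition that the docstring does not spell out, and the final square root simply mirrors the formula as written above.

```python
import math

import numpy as np

# Same constant and cost as documented under "Cost Function".
LOG_SPEC_DB_CONST = 10.0 / math.log(10.0) * math.sqrt(2.0)


def log_spec_db_dist(x, y):
    """Scaled Euclidean distance between two frames (hypothetical helper)."""
    diff = x - y
    return LOG_SPEC_DB_CONST * math.sqrt(np.inner(diff, diff))


def dtw_sketch(xs, ys, cost_fn=log_spec_db_dist):
    """Fill the cumulative cost table wps and return sqrt(wps[-1, -1])."""
    n, m = len(xs), len(ys)
    # Assumed boundary handling: row 0 and column 0 are sentinels set to
    # infinity, except wps[0, 0] = 0, so the first cell has a valid predecessor.
    wps = np.full((n + 1, m + 1), np.inf)
    wps[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            wps[i, j] = cost_fn(xs[i - 1], ys[j - 1]) + min(
                wps[i - 1, j],      # vertical   / insertion / expansion
                wps[i, j - 1],      # horizontal / deletion  / compression
                wps[i - 1, j - 1],  # diagonal   / match
            )
    return math.sqrt(wps[n, m])


# Two sequences that differ only by a repeated frame.
ref = np.array([[0.0], [1.0]])
hyp = np.array([[0.0], [0.0], [1.0]])
distance = dtw_sketch(ref, hyp)
```

In this sketch the two sequences have different lengths, and the repeated frame in hyp is absorbed by a horizontal step, so the alignment adds no cost for it.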

Cost Function:

import math
import numpy as np

logSpecDbConst = 10.0 / math.log(10.0) * math.sqrt(2.0)

def logSpecDbDist(x, y):
    # Euclidean distance between two frames, scaled by logSpecDbConst.
    diff = x - y
    return logSpecDbConst * math.sqrt(np.inner(diff, diff))

Parameters
  • xs (np.ndarray) -- reference sequence, shape [T, D]

  • ys (np.ndarray) -- hypothesis sequence, shape [T, D]

  • cost_fn (Callable, optional) -- Cost function applied to a pair of frames. Defaults to mcd.metrics_fast.logSpecDbDist.

Returns

DTW distance between xs and ys

Return type

float