I think there is a small mistake, or did I miss something? Instead of Z[i] = Z[i-1] * W[i] + B[i], shouldn't it be Z[i] = Z[i-1] * W[i-1] + B[i-1]?
I mean, when i = k, what would W[k] be? In the code comment you said W[i] are the weights connecting layer i to layer i + 1, so W[k] would connect layer k to layer k + 1. But layer k is the output layer, so there are no weights beyond it.
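To make the indexing concern concrete, here is a minimal sketch of the convention as I understand it. The layer sizes, the constant weight values, and the helper name `affine` are my own assumptions for illustration, not taken from the article's code, and activation functions are left out to keep the focus on the indices.

```python
# Hypothetical network with layers 0..k; here k = 2 (input, hidden, output).
layer_sizes = [3, 4, 2]
k = len(layer_sizes) - 1

# Assumed convention: W[i] connects layer i to layer i + 1,
# so only W[0] .. W[k-1] exist (k matrices for k + 1 layers).
W = [[[0.1] * layer_sizes[i + 1] for _ in range(layer_sizes[i])]
     for i in range(k)]
B = [[0.0] * layer_sizes[i + 1] for i in range(k)]


def affine(z_prev, w, b):
    """Return z_prev * w + b for a row vector z_prev and a matrix w."""
    return [sum(z_prev[r] * w[r][c] for r in range(len(z_prev))) + b[c]
            for c in range(len(b))]


Z = [[1.0, 2.0, 3.0]]  # Z[0] is the input layer
for i in range(1, k + 1):
    # Under this convention, layer i is computed from W[i-1] and B[i-1].
    Z.append(affine(Z[i - 1], W[i - 1], B[i - 1]))

print(len(W), len(Z))  # number of weight matrices, number of layers
```

With this convention, W[k] does not exist, so writing Z[i] = Z[i-1] * W[i] + B[i] would run past the end of the list at i = k. If the article instead means W[i] to be the weights feeding into layer i, the original formula would be fine, but then the code comment would need to change.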
I'd like to add myself to this question, and even widen it: the example code in lines 8 to 12 seems to hint at the answer, but it is not clear enough yet.
# standardize the data to make sure each feature contributes equally
# to the distance