After Equation 2.67, the new paragraph says that because the same matrix D is used to decode all the points, we can no longer consider the points in isolation. The author then uses the Frobenius norm of the error matrix instead of the L2 norm used earlier. Why is that?

Also, what are the limits of summation in the norm, and how is the transformation defined on a scalar, i.e. r(x^{(i)})_j?
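
To make the notation concrete, here is a small NumPy sketch of how I currently read it. It assumes the reconstruction is r(x) = D D^T x, that the points x^{(i)} are stacked as the rows of a matrix X with m rows (points) and n columns (dimensions), and that r(x^{(i)})_j means the j-th component of the reconstructed vector rather than r applied to a scalar. The names `X`, `D`, `R`, `E` and the sizes `m`, `n`, `l` are my own choices for illustration, not the book's.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, l = 5, 4, 2  # m points, n dimensions, l code dimensions (illustrative sizes)

X = rng.normal(size=(m, n))  # row i is the point x^{(i)}

# Random n x l matrix with orthonormal columns, so that D^T D = I_l.
D, _ = np.linalg.qr(rng.normal(size=(n, l)))

R = X @ D @ D.T  # row i is the reconstruction r(x^{(i)}) = D D^T x^{(i)}
E = X - R        # error matrix; E[i, j] = x^{(i)}_j - r(x^{(i)})_j

# Squared Frobenius norm of the whole error matrix.
frob_sq = np.linalg.norm(E, ord="fro") ** 2

# Sum over points of the squared L2 norm of each point's error vector.
per_point_sq = sum(np.linalg.norm(E[i]) ** 2 for i in range(m))

# Explicit double sum over i = 1..m (points) and j = 1..n (dimensions).
elementwise_sq = sum(E[i, j] ** 2 for i in range(m) for j in range(n))

print(np.isclose(frob_sq, per_point_sq), np.isclose(frob_sq, elementwise_sq))
```

If this reading is right, the squared Frobenius norm is just the per-point squared L2 errors added up over all points, with i running over the m points and j over the n dimensions. I would still like confirmation that this is what the author intends.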