Step 3 of the gradient descent pseudocode updates all the weights at once. Suppose we instead update a single weight, recompute the gradient, and only then update the next weight. Might that reach a much smaller cost? Am I correct?
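
To make the question concrete, here is a minimal sketch of the two update schemes being compared. Everything in it is an assumption for illustration: the toy data, the two-weight linear model, the learning rate, and the function names are hypothetical and not taken from the pseudocode. Updating one weight at a time like this is usually called coordinate descent.

```python
# Hypothetical toy data: y = 1 + 2*x exactly, so the minimum cost is 0.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]


def cost(w):
    """Halved mean squared error of the model w[0] + w[1] * x."""
    n = len(xs)
    return sum((w[0] + w[1] * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * n)


def grad(w):
    """Partial derivatives of the cost with respect to w[0] and w[1]."""
    n = len(xs)
    errs = [w[0] + w[1] * x - y for x, y in zip(xs, ys)]
    return [sum(errs) / n, sum(e * x for e, x in zip(errs, xs)) / n]


def simultaneous_update(lr=0.1, steps=1000):
    """All weights move together, using one gradient per step."""
    w = [0.0, 0.0]
    for _ in range(steps):
        g = grad(w)
        w = [wj - lr * gj for wj, gj in zip(w, g)]
    return w


def one_weight_at_a_time(lr=0.1, steps=1000):
    """Each weight moves alone; the gradient is recomputed before the next."""
    w = [0.0, 0.0]
    for _ in range(steps):
        for j in range(len(w)):
            w[j] -= lr * grad(w)[j]
    return w


cost_sim = cost(simultaneous_update())
cost_seq = cost(one_weight_at_a_time())
print("simultaneous:", cost_sim)
print("one at a time:", cost_seq)
```

Both loops minimize the same cost function, so on a convex cost like this one they head toward the same minimum. What can differ is how quickly each gets there and how many gradient evaluations it spends along the way.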