For question 8 I wrote the following code:
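import numpy as np

def check(data):
    # Start with an all-False mask, one entry per row.
    answ = np.zeros(data.shape[0], dtype=bool)
    # OR in a zero test for each column from index 1 up to data.shape[1] - 2.
    for x in range(1, data.shape[1] - 1):
        answ = answ | (data[:, x] == 0)
    return answ

print(data[check(data)].shape)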
This gives me an answer of 638. That answer was not in the list, so I chose one of the options at random. After that I checked your code and found that the last column ("Outcome") does not take part in the calculation. I had assumed that "Outcome" is also a feature whose value may be unknown. So my question is: why don't we use NaN for unknown values instead of 0s?
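To show what I mean by using NaN, here is a small sketch on made-up data (the array `toy` and its values are my own invention, not the course dataset, and I am only assuming that the last column plays the role of "Outcome"). Zeros in the feature columns are replaced by NaN, so a legitimate 0 in the last column cannot be mistaken for an unknown value:

```python
import numpy as np

# Made-up toy data, NOT the course dataset.
# Column 0 is an id-like column, the last column stands in for "Outcome".
toy = np.array([
    [1.0, 85.0, 0.0, 1.0],
    [2.0, 0.0, 30.1, 0.0],
    [3.0, 90.0, 25.4, 0.0],
])

# Take only the middle columns (index 1 up to, but not including, the last)
# and mark their zeros as missing.
features = toy[:, 1:-1].copy()
features[features == 0] = np.nan

# A row counts as incomplete if any of its feature values is NaN.
rows_with_missing = np.isnan(features).any(axis=1)
n_missing = int(rows_with_missing.sum())
print(n_missing)
```

With NaN as the marker, a real 0 (as in the last column here) and an unknown value can no longer be confused.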
P.S. Sorry for my English