A line in the first paragraph says: "Defining P(y=1|x) = sigmoid(z) is equivalent to defining P(y=1|x) = softmax(z)_1 with two-dimensional z and z_1 = 0."
Could anybody help me understand this line?
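
A quick numerical check may help pin down what the sentence is claiming. This is only a sketch under one assumption: the two-dimensional vector is taken to be [0, z], so the entry pinned to 0 is the one that is not read off, and the softmax output for the free entry is compared with sigmoid(z). Which index the quoted sentence means depends on its indexing convention, which the quote alone does not settle. The helper name softmax_with_fixed_zero is made up for illustration.

```python
import math


def sigmoid(z):
    """Logistic sigmoid: 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def softmax(zs):
    """Softmax over a list of logits, shifted by the max for numerical stability."""
    m = max(zs)
    exps = [math.exp(v - m) for v in zs]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_with_fixed_zero(z):
    """Hypothetical helper: build the two-dimensional logit vector [0, z]
    (one entry pinned to 0, assumed here to be the first) and return the
    softmax probability of the free entry."""
    return softmax([0.0, z])[1]


# Print both values side by side for a few logits.
for z in (-3.0, -0.5, 0.0, 1.2, 4.0):
    print(z, sigmoid(z), softmax_with_fixed_zero(z))
```

The two columns agree because exp(z) / (exp(0) + exp(z)) = 1 / (1 + exp(-z)). In other words, a softmax over two logits only depends on their difference, so fixing one logit at 0 leaves a single free number z, and that is exactly the sigmoid.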