I am confused by equation 3.28 in section 3.9.5, which is used to define an empirical distribution. According to the text, when dealing with discrete variables, p(x) is simply the empirical frequency of that value in the training set. So, for example, given the set of points (2, 2, 3, 4, 5), p(2) = 2/5 = 0.4, because the value 2 appears twice among the five points. I can see how that would follow from equation 3.28.
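
To make the discrete case concrete, here is a minimal Python sketch of how I understand it. The helper name `empirical_pmf` is my own invention for illustration, not something from the text, and it assumes the samples are treated as discrete values.

```python
from collections import Counter


def empirical_pmf(samples):
    """Return a dict mapping each observed value to its relative frequency.

    Hypothetical helper for illustration: it assumes the samples are
    discrete, so identical values are pooled together.
    """
    m = len(samples)
    counts = Counter(samples)
    return {value: count / m for value, count in counts.items()}


points = [2, 2, 3, 4, 5]
pmf = empirical_pmf(points)

print(pmf[2])              # 0.4, since 2 appears twice among five points
print(pmf.get(2.5, 0.0))   # 0.0, since 2.5 never appears in the training set
```

Under this reading, any value that does not appear in the training set gets probability zero, which is what prompts my question about the continuous case.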

But how does the situation change when dealing with continuous variables? For example, if I were using the same set of five points I gave above, but treated them as real numbers (and not just as integers), how would I calculate something like p(2.5)?