Subset is the word you are looking for.
I think mails that have already been classified as spam are fed into the model, which then works out on its own why they were spam in the first place. Here we are manually feeding it whole spam mails, not the individual words that make up spam mails. Hope it helps.
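To make the idea of feeding whole labeled mails (rather than a hand-picked word list) concrete, here is a minimal sketch in Python. It assumes a naive Bayes classifier, which is one common choice and not necessarily what the model in question uses. The names `train` and `classify` and the toy mails are made up for illustration.

```python
import math
from collections import Counter

def train(labeled_mails):
    """Count words per label from whole, already-labeled mails."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    mail_counts = Counter()
    for text, label in labeled_mails:
        mail_counts[label] += 1
        # The model derives the words itself; nobody hands it a word list.
        word_counts[label].update(text.lower().split())
    return word_counts, mail_counts

def classify(text, word_counts, mail_counts):
    """Return the label with the higher naive Bayes log-score."""
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    total_mails = sum(mail_counts.values())
    scores = {}
    for label in ("spam", "ham"):
        total_words = sum(word_counts[label].values())
        score = math.log(mail_counts[label] / total_mails)
        for word in text.lower().split():
            # Laplace (add-one) smoothing so unseen words do not zero out the score.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical toy training set: whole mails with their labels.
mails = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting at noon tomorrow", "ham"),
    ("lunch tomorrow with the team", "ham"),
]
word_counts, mail_counts = train(mails)

print(classify("free money", word_counts, mail_counts))
print(classify("meeting tomorrow", word_counts, mail_counts))
```

With real data you would need far more mails and proper tokenization, but the flow is the same: labeled mails go in, and the word statistics come out of training.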
I've held off on asking this for some time, but it looks like now is the time.
Do you really get your code copyrighted, or is this just ... you know...