This hands-on assignment guides you through implementing spell checking (and correction) from scratch in Python. We are given about 25,000 words of text, of which approximately 2,500 are misspelled. We will create a system to (a) detect which words are misspelled, and (b) suggest correct spellings for those words.
Overview
The model we will implement is the first one described in the Spell Checking and Correction tutorial. That is, for an observed word w and a candidate correction c, P(c|w) \propto P(c) \times P(w|c), where P(c) is estimated using unigram probabilities from text, P(w|c) = k^{d(w,c)}, and d(w, c) denotes the edit distance between w and c.
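To make the formula concrete, here is a minimal sketch of the scoring rule on a toy corpus. It is not the assignment's reference solution: the helper names (`edit_distance`, `score`, `best_correction`), the toy text, and the value k = 0.01 are all assumptions made for illustration, and the edit distance is assumed to be plain Levenshtein distance (insertions, deletions, substitutions). Any k between 0 and 1 makes candidates that are farther from w less likely.

```python
from collections import Counter


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance (assumed variant): insert, delete, substitute."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


# Toy corpus (hypothetical); the assignment uses its own text.
toy_text = "the cat sat on the mat the cat ran"
counts = Counter(toy_text.split())
total = sum(counts.values())


def score(w: str, c: str, k: float = 0.01) -> float:
    """Unnormalised P(c|w): unigram P(c) times k ** d(w, c)."""
    p_c = counts[c] / total
    return p_c * k ** edit_distance(w, c)


def best_correction(w: str) -> str:
    """Return the vocabulary word with the highest score for w."""
    return max(counts, key=lambda c: score(w, c))


print(best_correction("teh"))
print(best_correction("caat"))
```

Because the score is only proportional to P(c|w), it can be compared across candidates for the same w without normalising.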
The code blocks below are meant to be run in order, one after the other. The parts you need to write are marked with triple dots (...).