This hands-on assignment guides you through implementing language identification from scratch in Python. We have 10,000 snippets of text in about 50 languages, where each snippet is exactly 100 characters long. We'll use 8,000 of these snippets to learn patterns, and the other 2,000 to evaluate how accurately the resulting system predicts the language of text it has not seen (above 90% accuracy!).
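To make the 8,000 / 2,000 split concrete, here is a minimal sketch of how such a split and the accuracy measurement could look. The names `split_snippets` and `accuracy`, the `(text, language)` pair layout, and the placeholder data are assumptions for illustration only; the assignment's own code may load and split the data differently.

```python
import random

def split_snippets(snippets, n_train=8000, seed=0):
    """Shuffle (text, language) pairs and split them into train and test lists.

    Hypothetical helper: the fixed seed only makes the split reproducible.
    """
    shuffled = list(snippets)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]

def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)

# Placeholder data standing in for the real snippets (assumption):
# 10,000 strings of 100 characters, each tagged with one of 50 language labels.
snippets = [("x" * 100, f"lang{i % 50}") for i in range(10_000)]
train, test = split_snippets(snippets)
```

Keeping the 2,000 test snippets out of training is what makes the reported accuracy a fair estimate of performance on new text.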
Overview
All the code in this assignment can be run. Some parts have been left for you to fill in; they are marked with ... (triple dots).