Test a human, character-level language model on a given corpus
Test your language model!
Test the accuracy of your “language model” on a given corpus (sort of).
A statistical language model is a probability distribution over sequences of words. Given such a sequence, say of length m, it assigns a probability to the whole sequence.
(from Wikipedia)
The programme takes a plain text file and splits it into non-overlapping chunks.
For each chunk shown, you then have to guess which character comes next. Chunks are
shown in random order.
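As a rough illustration of the chunking and scoring described above, here is a minimal Elm sketch. It is not taken from the app's source: the module name `ChunkSketch`, the `Question` record, and the functions `toQuestions` and `accuracy` are all hypothetical. It also assumes that each piece holds `size` shown characters plus the one character to guess, which the app may well do differently. Shuffling the questions is left out.

```elm
module ChunkSketch exposing (Question, accuracy, toQuestions)

-- Hypothetical sketch, not the app's actual source.


-- One quiz item: the text shown, and the character that followed it.
type alias Question =
    { shown : String
    , next : Char
    }


-- Split `text` into non-overlapping pieces of `size + 1` characters.
-- The first `size` characters are shown; the last one is the answer.
-- A trailing piece that is too short is dropped.
toQuestions : Int -> String -> List Question
toQuestions size text =
    if size < 1 then
        []

    else
        collect size text []


-- Tail-recursive helper, so a long corpus does not grow the stack.
collect : Int -> String -> List Question -> List Question
collect size text acc =
    case String.uncons (String.dropLeft size text) of
        Just ( next, rest ) ->
            collect size rest ({ shown = String.left size text, next = next } :: acc)

        Nothing ->
            List.reverse acc


-- Fraction of guesses that match the character that really came next.
-- Returns 0 when there are no guesses at all.
accuracy : List ( Char, Question ) -> Float
accuracy guesses =
    if List.isEmpty guesses then
        0

    else
        let
            correct =
                List.filter (\( guess, question ) -> guess == question.next) guesses
        in
        toFloat (List.length correct) / toFloat (List.length guesses)
```

With these definitions, `toQuestions 3 "hello world"` gives two questions: "hel" followed by 'l', and "o w" followed by 'o'. The leftover "rld" is too short to have a next character and is dropped.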
The app is written in Elm, and can be compiled to HTML (and JavaScript) like so:
elm make src/elm/Main.elm
open index.html