Minimum Description Length
Let h be a hypothesis from a hypothesis space H and D a set of examples. The description length of h is then computed as follows:
- encode the hypothesis as a Turing machine program → count its bits
- count the data bits, example by example:
  - correct prediction → zero bits
  - wrong prediction → as many bits as are needed to encode the error
MDL selects the hypothesis that minimizes the total number of bits required, i.e. hypothesis bits plus data bits.
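
The counting scheme above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a definitive MDL implementation: predictions are integers, the hypothesis sizes (8, 16 and 64 bits) are made-up values standing in for the length of an encoded program, and the error code (binary magnitude plus one sign bit) is just one possible choice. The names description_length and mdl_select are hypothetical.

```python
def description_length(hypothesis_bits, predictions, targets):
    """Total bits = bits for the hypothesis + bits for the mispredicted examples."""
    data_bits = 0
    for predicted, actual in zip(predictions, targets):
        error = abs(predicted - actual)
        if error:
            # Assumed error code: binary magnitude of the error plus one sign bit.
            data_bits += error.bit_length() + 1
        # A correct prediction costs zero bits.
    return hypothesis_bits + data_bits


def mdl_select(candidates, targets):
    """Return the name of the candidate with the smallest description length."""
    best = min(candidates, key=lambda c: description_length(c[1], c[2], targets))
    return best[0]


# Hypothetical candidates: (name, assumed program size in bits, predictions).
targets = [3, 5, 7, 9]
candidates = [
    ("constant", 8, [6, 6, 6, 6]),   # short program, every prediction is wrong
    ("linear", 16, [3, 5, 7, 9]),    # medium program, no errors
    ("lookup", 64, [3, 5, 7, 9]),    # long program that memorizes the data
]

for name, bits, predictions in candidates:
    print(name, description_length(bits, predictions, targets))
print("selected:", mdl_select(candidates, targets))
```

With these assumed numbers the constant hypothesis is cheap to encode but pays for every error, while the lookup table makes no errors but is expensive to encode, so the linear hypothesis has the smallest total.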