Minimum Description Length

Let h be a hypothesis from a hypothesis space H and D a set of examples. The description length of h is then computed as follows:

  • encode the hypothesis as a Turing machine program → count its bits
  • count the data bits
    • correct prediction → zero bits
    • wrong prediction → the number of bits needed to encode the error

MDL selects the hypothesis that minimizes the total number of bits required, i.e. the hypothesis bits plus the data bits.
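
The bit counting above can be sketched in Python. This is only an illustration under stated assumptions, not a definitive implementation: the size of each hypothesis in bits is supplied by hand (actually encoding a hypothesis as a Turing machine program is out of scope here), predictions and errors are integers, and a wrong prediction is charged the bits of the error's magnitude plus one sign bit. The names error_bits, description_length, and mdl_select, as well as the example data and candidates, are hypothetical.

```python
from typing import Callable, List, Tuple

Example = Tuple[int, int]                          # (input x, observed output y)
Candidate = Tuple[str, int, Callable[[int], int]]  # (name, hypothesis bits, predictor)


def error_bits(error: int) -> int:
    """Bits charged for one example (assumed code: magnitude bits + 1 sign bit)."""
    if error == 0:
        return 0  # correct prediction costs nothing
    return abs(error).bit_length() + 1


def description_length(hypothesis_bits: int,
                       predict: Callable[[int], int],
                       examples: List[Example]) -> int:
    """Total bits: hypothesis bits plus the bits needed to encode all errors."""
    data_bits = sum(error_bits(y - predict(x)) for x, y in examples)
    return hypothesis_bits + data_bits


def mdl_select(candidates: List[Candidate], examples: List[Example]) -> Candidate:
    """Return the candidate with the smallest total description length."""
    return min(candidates,
               key=lambda c: description_length(c[1], c[2], examples))


# Toy data: y = 2x + 1, except for the last example, which is off by 3.
examples = [(0, 1), (1, 3), (2, 5), (3, 7), (4, 12)]
table = dict(examples)

# Hypothesis sizes below are made-up numbers for illustration only.
candidates = [
    ("constant 5", 4, lambda x: 5),
    ("linear 2x+1", 8, lambda x: 2 * x + 1),
    ("lookup table", 40, lambda x: table[x]),
]

best = mdl_select(candidates, examples)
print(best[0])
```

With these assumed sizes, the constant hypothesis is cheap to state but pays for four errors (18 bits in total), the lookup table makes no errors but is expensive to state (40 bits), and the linear hypothesis wins with 11 bits: 8 for the hypothesis and 3 for its single error.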