Maximum A Posteriori Learning

Instead of full Bayesian learning, we try to choose the MAP hypothesis that maximizes the posterior:

h_MAP = argmax_h P(h | D)

Using Bayes' rule this can be rewritten as

h_MAP = argmax_h P(D | h) P(h)

(we can leave out P(D) as it is constant for all hypotheses. We could in theory even leave out the prior P(h) when the dataset is large → ML learning).

or, even better, by utilizing the properties of the logarithm:

h_MAP = argmax_h [log P(D | h) + log P(h)]

So we get a single hypothesis, which requires solving an optimization problem instead of summing over the whole hypothesis space.
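The argmax over the log posterior can be sketched for a small discrete hypothesis space. The coin-bias setup below (hypotheses, prior, data) is an illustrative assumption, not from the notes:

```python
import math

# Toy MAP example: choose a coin-bias hypothesis h (= P(heads))
# from a small discrete hypothesis space.
prior = {0.1: 0.1, 0.3: 0.2, 0.5: 0.4, 0.7: 0.2, 0.9: 0.1}  # P(h), sums to 1
data = [1, 1, 0, 1, 1, 1, 0, 1]  # observed flips, 1 = heads

def log_posterior(h):
    # log P(D|h) + log P(h); P(D) is dropped since it is constant in h
    log_lik = sum(math.log(h) if x else math.log(1 - h) for x in data)
    return log_lik + math.log(prior[h])

h_map = max(prior, key=log_posterior)
print(h_map)
```

Working in log space avoids numerical underflow when the dataset is large, since the likelihood is a product of many probabilities.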

Predictions

The predictions made this way are approximately Bayesian to the extent that

P(x | D) ≈ P(x | h_MAP)

This means that for predictions we only have to compute this one probability and do not have to sum over all possible hypotheses.
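The approximation can be contrasted with the full Bayesian prediction, which sums P(x | h) P(h | D) over all hypotheses. The hypothesis space, prior, and data below are illustrative assumptions:

```python
import math

# Full Bayesian prediction vs. the MAP approximation for the
# probability that the next coin flip is heads.
prior = {0.3: 0.2, 0.5: 0.4, 0.7: 0.4}  # hypothetical P(h), h = P(heads)
data = [1, 1, 1, 0]                     # observed flips, 1 = heads

def likelihood(h):
    # P(D|h) for i.i.d. flips
    return math.prod(h if x else 1 - h for x in data)

# Posterior P(h|D) ∝ P(D|h) P(h), normalized over the hypothesis space
post = {h: likelihood(h) * p for h, p in prior.items()}
z = sum(post.values())
post = {h: v / z for h, v in post.items()}

# Full Bayesian: P(heads | D) = sum_h P(heads | h) P(h | D)
p_bayes = sum(h * p for h, p in post.items())

# MAP approximation: P(heads | D) ≈ P(heads | h_MAP)
h_map = max(post, key=post.get)
p_map = h_map

print(p_bayes, p_map)
```

The MAP prediction here is a single lookup, while the Bayesian prediction still needs the whole posterior; the two agree closely when the posterior is sharply peaked at h_MAP.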