Maximum Likelihood Estimation

From all possible parameters $θ$ we select the ones that most likely generated the training set, i.e. the hypothesis that maximizes the likelihood of the data: $h_{ML} = \arg\max_{h_{i}} P(d \mid h_{i})$. This is the same as MAP learning with a uniform prior.
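A minimal sketch of this selection step, assuming a coin-flip (Bernoulli) model with a single parameter called `theta` in the code. The data set and the grid of candidate hypotheses are made up for illustration and are not part of these notes.

```python
import math

def likelihood(theta, data):
    """P(d | theta) for independent coin flips (1 = heads, 0 = tails)."""
    return math.prod(theta if x == 1 else 1.0 - theta for x in data)

# Hypothetical training set: 7 heads out of 10 flips.
data = [1, 1, 0, 1, 1, 0, 1, 1, 0, 1]

# Candidate hypotheses h_i: values of theta on a grid strictly inside (0, 1).
candidates = [i / 100 for i in range(1, 100)]

# Maximum likelihood hypothesis: the candidate under which the data is most probable.
theta_ml = max(candidates, key=lambda t: likelihood(t, data))
print(theta_ml)
```

For this data the selected value is 0.7, the observed fraction of heads. A grid search is only practical for one or two parameters, which is why the optimization methods further down are used instead.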

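This method uses the log-likelihood function. Maximizing the log-likelihood is the same as minimizing the negative log-likelihood (hence the negative sign), so the task can be stated as a minimization problem.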
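To make the sign convention concrete, here is a short worked form. It assumes the training set $d$ consists of $N$ independent observations $d_{1}, \dots, d_{N}$ and writes $L$ for the negative log-likelihood. Both are assumptions for illustration, not something these notes state.

$$
\begin{aligned}
P(d \mid θ) &= \prod_{j=1}^{N} P(d_{j} \mid θ) \\
\log P(d \mid θ) &= \sum_{j=1}^{N} \log P(d_{j} \mid θ) \\
L(θ) &= -\sum_{j=1}^{N} \log P(d_{j} \mid θ), \qquad \arg\max_{θ} P(d \mid θ) = \arg\min_{θ} L(θ)
\end{aligned}
$$

The logarithm is monotonic, so it does not move the location of the optimum, and it turns the product into a sum that is easier to differentiate.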

Solve with Newton's method:

$θ^{k + 1} = θ^{k} - \left(\nabla^{2} L(θ^{k})\right)^{-1} \nabla L(θ^{k})$

In words: the new parameters are the current parameters minus the inverse of the Hessian matrix times the gradient, both evaluated at the current parameters $θ^{k}$.
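The update can be sketched in a few lines of NumPy. The example below is an illustration under assumptions: it takes $L$ to be the negative log-likelihood of a small logistic regression model (intercept plus one feature), and the data, the helper names `newton_minimize`, `nll_grad` and `nll_hess`, and the stopping tolerance are all hypothetical choices, not taken from these notes.

```python
import numpy as np

def newton_minimize(grad, hess, theta0, tol=1e-10, max_iter=50):
    """Iterate theta <- theta - H(theta)^-1 * g(theta) until the gradient is tiny."""
    theta = np.asarray(theta0, dtype=float)
    for _ in range(max_iter):
        g = grad(theta)
        if np.linalg.norm(g) < tol:
            break
        # Solve H * step = g rather than forming the inverse Hessian explicitly.
        step = np.linalg.solve(hess(theta), g)
        theta = theta - step
    return theta

# Hypothetical, non-separable data: one feature x and binary labels y.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([0.0, 0.0, 1.0, 0.0, 1.0, 1.0])
X = np.column_stack([np.ones_like(x), x])  # columns: intercept, feature

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def nll_grad(theta):
    """Gradient of the negative log-likelihood of logistic regression."""
    return X.T @ (sigmoid(X @ theta) - y)

def nll_hess(theta):
    """Hessian of the negative log-likelihood: X^T diag(p * (1 - p)) X."""
    p = sigmoid(X @ theta)
    return X.T @ (X * (p * (1.0 - p))[:, None])

theta_hat = newton_minimize(nll_grad, nll_hess, np.zeros(2))
print(theta_hat, np.linalg.norm(nll_grad(theta_hat)))
```

Using `np.linalg.solve` instead of `np.linalg.inv` gives the same step and is usually the more stable choice. Plain Newton steps like these are not guaranteed to converge from every starting point, so practical implementations often add a step-size control.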

One can use different methods to calculate the minimum:

Gradient Descent
Newton's Method
Analytical solution
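The three options can be compared on a problem small enough to check by hand. The sketch below assumes a Gaussian model with known unit variance and unknown mean, for which the negative log-likelihood (divided by the number of samples, constants dropped) is a quadratic in the mean. The data, learning rate and iteration count are hypothetical values chosen for illustration.

```python
# Hypothetical observations, assumed drawn from a Gaussian with variance 1.
data = [2.1, 1.9, 2.4, 2.0, 1.6]
n = len(data)

def grad(mu):
    """Derivative of the mean negative log-likelihood: mu - sample_mean."""
    return sum(mu - x for x in data) / n

# The second derivative of the mean negative log-likelihood is constant here.
HESSIAN = 1.0

# Analytical solution: setting the derivative to zero gives the sample mean.
mu_analytic = sum(data) / n

# Gradient descent: many small steps against the gradient.
mu_gd = 0.0
learning_rate = 0.1
for _ in range(200):
    mu_gd -= learning_rate * grad(mu_gd)

# Newton's method: one step suffices because the objective is quadratic.
mu_newton = 0.0 - grad(0.0) / HESSIAN

print(mu_analytic, mu_gd, mu_newton)
```

All three land on the same estimate up to numerical precision. A closed form like this rarely exists for more complex models, which is where the two iterative methods become necessary.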
