Full-Batch Gradient Descent
A Gradient Descent algorithm that uses the entire dataset to calculate one weight update. The update is
$$\Delta w = -\eta g \quad \text{with} \quad g = \nabla E(w),$$
where $E(w)$ is the error summed over all training samples and $\eta$ is the learning rate.
This becomes computationally expensive as the dataset grows: every single weight update requires computing the gradient over all training samples.
Faster methods are Mini-Batch Gradient Descent and Stochastic Gradient Descent, which both perform more frequent weight updates on subsets of the data instead.
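A minimal sketch of what Full-Batch Gradient Descent looks like in code, assuming a least-squares linear model as the example loss (the function name and the loss are illustrative, not from the original note):

```python
import numpy as np

def full_batch_gradient_descent(X, y, eta=0.01, n_steps=100):
    """One weight update per full pass over the dataset: w <- w - eta * g."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    for _ in range(n_steps):
        residual = X @ w - y              # predictions minus targets for ALL samples
        g = X.T @ residual / n_samples    # full-batch gradient of E(w) = 1/(2N) * ||Xw - y||^2
        w -= eta * g                      # a single update uses the entire dataset
    return w
```

Stochastic or Mini-Batch Gradient Descent would instead compute `g` from one sample or a small batch per update, trading gradient accuracy for update frequency.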
Proof
With a Taylor expansion we can show that the error decreases with every update. Expanding $E$ around $w$ to second order, with gradient $g$ and Hessian $G$, and substituting $\Delta w = -\eta g$:
$$\begin{aligned} E(w+\Delta w) & = E(w) + g^T \Delta w + \frac{1}{2} \Delta w^T G \Delta w \\ & = E(w) - \eta g^T g + \frac{\eta^2}{2} g^T G g < E(w) \quad \text{for } \eta \text{ small.} \end{aligned}$$

So, the error decreases with every weight update: for a sufficiently small $\eta$, the negative first-order term $-\eta g^T g$ dominates the second-order term.
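As a quick numerical sanity check of this result (a sketch on a small synthetic least-squares problem, which is an assumption and not part of the original note), the error printed after each full-batch update decreases monotonically for a small $\eta$:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=200)

def error(w):
    return 0.5 * np.mean((X @ w - y) ** 2)   # E(w), the full-batch squared error

w, eta = np.zeros(3), 0.05
for step in range(5):
    g = X.T @ (X @ w - y) / len(y)           # g = full-batch gradient of E at w
    w -= eta * g                             # Delta w = -eta * g
    print(step, error(w))                    # E(w + Delta w) < E(w) at every step for small eta
```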