Square Weight Decay

Square weight decay is a regularisation method that enforces small weights. It therefore has a prior towards linear models: with small weights, the inputs to the tanh function stay in its approximately linear range.
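The linear range of tanh can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper name tanh_linear_gap and the sample inputs 0.1 and 2.0 are illustrative assumptions, not part of the original note.

```python
import math

def tanh_linear_gap(x):
    """Absolute difference between tanh(x) and the identity function x."""
    return abs(math.tanh(x) - x)

# Small input: tanh is almost indistinguishable from a straight line.
small_gap = tanh_linear_gap(0.1)

# Large input: tanh saturates and deviates strongly from the line.
large_gap = tanh_linear_gap(2.0)

print(small_gap, large_gap)
```

With small weights, the weighted sums fed into tanh tend to stay near zero, so each unit behaves roughly like a linear function of its inputs.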

It works by adding the sum of the squared weights, scaled by a decay factor, to the error function. Minimizing this combined error therefore also leads to smaller weights.
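As a rough sketch of the idea, the penalised error and the extra gradient term it produces could look like this. The function names, the decay coefficient, and the example numbers are assumptions made for illustration; the original note does not fix a particular scaling of the penalty.

```python
def penalized_error(weights, data_error, decay):
    """Data error plus decay times the sum of squared weights."""
    penalty = sum(w * w for w in weights)
    return data_error + decay * penalty

def penalty_gradient(weights, decay):
    """Gradient of the penalty term: each weight is pulled towards zero
    in proportion to its own size."""
    return [2.0 * decay * w for w in weights]

# Hypothetical example values.
weights = [1.0, -2.0]
total = penalized_error(weights, data_error=0.5, decay=0.1)
grad = penalty_gradient(weights, decay=0.1)
print(total, grad)
```

Because the penalty gradient is proportional to each weight, every gradient step shrinks all weights a little unless the data error pushes back.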

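A small weight decay will sort out completely unneeded weights by shrinking them towards zero. This is similar to LASSO, where the end result is a kind of feature selection. Note, however, that LASSO penalises the absolute values of the weights and can set them exactly to zero, whereas the squared penalty only makes unneeded weights very small.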
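The effect on an unneeded weight can be illustrated with a toy example. This is a minimal sketch under stated assumptions: a linear model with two inputs, where the second input is always zero and so never influences the error. The data, learning rate, step count, and the function name train are all hypothetical choices for illustration.

```python
def train(decay, steps=5000, lr=0.1):
    """Gradient descent on mean squared error plus decay * sum(w^2)."""
    # Toy data: the target depends only on x1; x2 is always 0,
    # so the weight attached to x2 is completely unneeded.
    data = [((1.0, 0.0), 2.0), ((2.0, 0.0), 4.0), ((-1.0, 0.0), -2.0)]
    w = [0.5, 0.5]
    for _ in range(steps):
        grad = [0.0, 0.0]
        for (x1, x2), y in data:
            err = w[0] * x1 + w[1] * x2 - y
            grad[0] += 2.0 * err * x1 / len(data)
            grad[1] += 2.0 * err * x2 / len(data)
        for i in range(2):
            grad[i] += 2.0 * decay * w[i]  # weight decay term
            w[i] -= lr * grad[i]
    return w

w_plain = train(decay=0.0)   # unneeded weight keeps its initial value
w_decay = train(decay=0.01)  # unneeded weight shrinks towards zero
print(w_plain, w_decay)
```

Without decay, nothing ever changes the second weight, so it stays at its starting value. With a small decay it shrinks to nearly zero, while the needed weight stays close to its useful value of 2 and is only slightly reduced.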