Adaptive Linear Unit
It is similar to a Perceptron, with the main difference being that an Activation Function (a linear, identity function of the net input) is placed in front of the threshold function.
The error is calculated from the Activation Function output, not from the threshold function output.
It also uses a Loss Function that is minimized with a Gradient Descent algorithm.
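Below is a minimal sketch of this idea in Python (class and parameter names such as Adaline, eta, and n_iter are illustrative assumptions, not from the source): the net input goes through a linear Activation Function, the error is computed on that linear output, and the weights are updated by Gradient Descent on a squared-error Loss Function; the threshold function is only applied when predicting class labels.

import numpy as np

class Adaline:
    def __init__(self, eta=0.01, n_iter=50, seed=1):
        self.eta = eta          # learning rate
        self.n_iter = n_iter    # number of gradient-descent passes over the data
        self.seed = seed

    def fit(self, X, y):
        rng = np.random.default_rng(self.seed)
        self.w_ = rng.normal(scale=0.01, size=X.shape[1])
        self.b_ = 0.0
        self.losses_ = []
        for _ in range(self.n_iter):
            output = self.activation(self.net_input(X))       # linear activation, not thresholded
            errors = y - output                                # error based on the Activation Function output
            self.w_ += self.eta * X.T @ errors / X.shape[0]    # gradient-descent weight update
            self.b_ += self.eta * errors.mean()
            self.losses_.append((errors ** 2).mean())          # mean squared error per epoch
        return self

    def net_input(self, X):
        return X @ self.w_ + self.b_

    def activation(self, z):
        return z                # identity: the Activation Function here is linear

    def predict(self, X):
        # the threshold function is used only to produce the final class label
        return np.where(self.activation(self.net_input(X)) >= 0.0, 1, -1)

For example, calling Adaline().fit(X, y) with standardized features and labels in {-1, 1} records the mean squared error of each epoch in losses_, which should decrease as Gradient Descent converges.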