Activation Functions
An activation function is simply a function we apply to each of a layer's outputs (its activations). The most common activation function is the rectifier function, \(\max(0, x)\).
Rectifier function
The rectifier function's graph is a line with the negative part "rectified" to zero. Applying the function to a neuron's outputs puts a bend in the data, moving us away from simple lines.
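As a minimal sketch of the rectifier (using NumPy, an assumption on my part), the function just clips negative values to zero:

```python
import numpy as np

def rectifier(x):
    """Rectifier: zero out negative values, keep positive values unchanged."""
    return np.maximum(0, x)

# Negative inputs become 0; non-negative inputs pass through.
print(rectifier(np.array([-2.0, -0.5, 0.0, 1.0, 3.0])))
```

Plotting this function over a range of inputs reproduces the "bent line" graph described above.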
When we attach the rectifier to a linear unit, we get a Rectified Linear Unit, or ReLU. Applying a ReLU activation function to a linear unit means the output becomes \(\max(0, w \cdot x + b)\), which we might draw in a diagram as the linear unit feeding into a rectifier.
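A ReLU could be sketched in code like this (a hypothetical illustration, assuming NumPy; the weights and bias below are made-up example values):

```python
import numpy as np

def relu_unit(x, w, b):
    """A linear unit w·x + b followed by the rectifier max(0, ·)."""
    return np.maximum(0, np.dot(w, x) + b)

# Hypothetical weights and bias, chosen only for illustration.
w = np.array([2.0, -1.0])
b = -1.0

# 2*1.0 + (-1)*0.5 - 1 = 0.5, which is positive, so it passes through.
print(relu_unit(np.array([1.0, 0.5]), w, b))
```

If the linear part \(w \cdot x + b\) comes out negative, the rectifier zeroes it, which is exactly the bend that lets networks model more than straight lines.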