What is an activation function?
An activation function is a function that determines whether a neuron should be activated or not.
In a biological neural network (BNN), neurons receive signals. They decide by themselves whether to transmit the signal, based on certain criteria.
In an artificial neural network (ANN), neurons do the same, but by applying a function: the activation function.
A neural network is characterized by its inputs, its hidden layers of neurons and its outputs.
Each layer can have a different activation function.
Why does a neural network need an activation function?
A neural network without an activation function is essentially just a linear regression model. The activation function applies a non-linear transformation to the input, making the network capable of learning and performing more complex tasks.
Imagine that each neuron is like a bucket. Its inputs are spurts of water from faucets. Now imagine that there is a little mechanism that detects when the bucket overflows. When the water overflows (that is, when the sum of the neuron's inputs crosses a threshold), the mechanism triggers the faucets above other buckets to open. That's how the bucket-neuron's inputs get converted into outputs.
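A minimal NumPy sketch can illustrate why the non-linearity matters: without an activation function, stacking two layers collapses into a single linear layer (the weight matrices, shapes, and input here are illustrative assumptions).

```python
import numpy as np

rng = np.random.default_rng(0)

# Two "layers" with no activation function: just matrix multiplications.
W1 = rng.normal(size=(4, 3))
W2 = rng.normal(size=(2, 4))
x = rng.normal(size=3)

# Passing x through both layers...
two_layer_output = W2 @ (W1 @ x)

# ...is identical to one linear layer whose weights are W2 @ W1.
one_layer_output = (W2 @ W1) @ x

print(np.allclose(two_layer_output, one_layer_output))  # True
```

However many linear layers are stacked, the result stays a linear regression model; only a non-linear activation between layers breaks this collapse.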
Backward and forward
The “forward pass” refers to the process of computing the values of the output layer from the input data. It traverses all neurons from the first layer to the last.
The “backward pass” then refers to the process of computing the changes to the weights (the learning itself, de facto), using the gradient descent algorithm (or a similar one). The computation starts at the last layer and moves backward to the first.
Together, one forward pass and one backward pass make up one “iteration”.
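One iteration can be sketched for a single neuron with a sigmoid activation, trained on one example by gradient descent; the weight, input, target, and learning rate below are illustrative assumptions.

```python
import math

w, b = 0.5, 0.0          # weight and bias (assumed starting values)
x, target = 1.0, 1.0     # one training example
lr = 0.1                 # learning rate

# Forward pass: from the input to the output and the loss.
z = w * x + b
y = 1.0 / (1.0 + math.exp(-z))   # sigmoid activation
loss = (y - target) ** 2

# Backward pass: chain rule, from the loss back to the weights.
dloss_dy = 2.0 * (y - target)
dy_dz = y * (1.0 - y)            # derivative of the sigmoid
w -= lr * dloss_dy * dy_dz * x   # dz/dw = x
b -= lr * dloss_dy * dy_dz       # dz/db = 1

# One forward pass + one backward pass = one iteration.
```

Here the output is below the target, so the gradient update nudges both the weight and the bias upward.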
Different existing activation functions
Here are some activation functions:
Binary activation function
Formula: f(x) = 0 if x < 0; f(x) = 1 if x ≥ 0
Example:
When you toss a coin, you only have two options: heads or tails. That is a binary function.
Graph:
Backward: False
Forward: True
Multi-class: False
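A one-line Python sketch of the binary step function (threshold at 0, a common convention):

```python
def binary_step(x):
    # Outputs 1 if the input crosses the threshold (0), else 0.
    return 1 if x >= 0 else 0

print(binary_step(-2.0))  # 0
print(binary_step(3.5))   # 1
```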
Linear activation function
Formula: f(x) = ax
Example:
Graph:
Backward: False (because the derivative is constant)
Forward: True
Multi-class: True
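A sketch of the linear activation, which makes the constant-derivative limitation visible (the slope `a` is an assumed parameter):

```python
def linear(x, a=1.0):
    # Output is proportional to the input. The derivative is the
    # constant a, independent of x, so backpropagation gets no
    # information about the input from this function's gradient.
    return a * x

def linear_derivative(x, a=1.0):
    return a  # same value for every input x
```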
Non-linear activation functions
Non-linear activation functions solve the following limitations of linear activation functions:
- They allow backpropagation because now the derivative function would be related to the input, and it’s possible to go back and understand which weights in the input neurons can provide a better prediction.
- They allow the stacking of multiple layers of neurons as the output would now be a non-linear combination of input passed through multiple layers. Any output can be represented as a functional computation in a neural network.
- ReLU activation function
Formula: f(x) = max(0, x)
Example:
Take the example of a car. If you accelerate steadily (without jolts, and without pressing the accelerator excessively), the vehicle keeps accelerating along a linear curve, keeping in mind that its speed can never be negative.
Graph:
Backward: True
Forward: True
Multi-class: True (more efficient than the others)
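A one-line Python sketch of ReLU, mirroring the car example (negative inputs are clipped to 0, just as speed cannot be negative):

```python
def relu(x):
    # Passes positive inputs through unchanged, clips negatives to 0.
    return max(0.0, x)

print(relu(-5.0))  # 0.0
print(relu(2.5))   # 2.5
```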
- Sigmoid (logistic) activation function
Formula: f(x) = 1 / (1 + e^(-x))
Graph:
Backward: True
Forward: True
Multi-class: True (the more labels there are, the less precise it becomes, because the computations grow more complex)
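A sketch of the sigmoid in Python, showing how it squashes any real input into the (0, 1) range:

```python
import math

def sigmoid(x):
    # Maps any real number into the open interval (0, 1),
    # with sigmoid(0) = 0.5.
    return 1.0 / (1.0 + math.exp(-x))

print(sigmoid(0.0))   # 0.5
```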
- Tanh (or hyperbolic) activation function
Formula: f(x) = (e^x − e^(−x)) / (e^x + e^(−x))
Graph:
Backward: True
Forward: True
Multi-class: True
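A sketch of tanh from its formula; unlike the sigmoid, its output range (−1, 1) is zero-centered:

```python
import math

def tanh(x):
    # Squashes the input into (-1, 1), with tanh(0) = 0.
    return (math.exp(x) - math.exp(-x)) / (math.exp(x) + math.exp(-x))

print(tanh(0.0))   # 0.0
```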
- SoftMax activation function
Formula: softmax(z)_i = e^(z_i) / Σ_j e^(z_j)
Backward: True
Forward: True
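A sketch of SoftMax in plain Python: it turns a vector of scores into probabilities that sum to 1 (subtracting the maximum score is a standard numerical-stability trick, not part of the mathematical definition):

```python
import math

def softmax(z):
    # Converts a list of scores into a probability distribution.
    m = max(z)                               # stability shift
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([1.0, 2.0, 3.0])
print(probs)       # largest score gets the largest probability
print(sum(probs))  # sums to 1 (up to floating-point rounding)
```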