What is an activation function?
An activation function is a function that determines whether a neuron should be activated or not.
In a biological neural network (BNN), neurons receive signals. They decide by themselves whether to transmit the signal, based on certain criteria.
In an artificial neural network (ANN), neurons do the same, but by applying a function: the activation function.
A neural network is characterized by its inputs, its hidden layers of neurons and its outputs.
Each layer can have a different activation function.
Why does a neural network need an activation function?
A neural network without an activation function is essentially just a linear regression model. The activation function applies a non-linear transformation to the input, making the network capable of learning and performing more complex tasks.
Imagine that each neuron is like a bucket. Its inputs are spurts of water from faucets. Now imagine that there is a little mechanism that detects when the bucket overflows. When the water overflows (that is, when the sum of the neuron's inputs crosses a threshold), the mechanism triggers the faucets above other buckets to open. That's how the bucket-neuron's inputs get converted into outputs.
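A minimal NumPy sketch can illustrate why the non-linearity matters: without an activation function, stacking two layers collapses into a single linear layer (the weight matrices, shapes, and input here are illustrative assumptions).

```python
import numpy as np

rng = np.random.default_rng(0)

# Two "layers" with no activation function: just matrix multiplications.
W1 = rng.normal(size=(4, 3))
W2 = rng.normal(size=(2, 4))
x = rng.normal(size=3)

# Passing x through both layers...
two_layer_output = W2 @ (W1 @ x)

# ...is identical to one linear layer whose weights are W2 @ W1.
one_layer_output = (W2 @ W1) @ x

print(np.allclose(two_layer_output, one_layer_output))  # True
```

However many linear layers are stacked, the result stays a linear regression model; only a non-linear activation between layers breaks this collapse.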
Backward and forward
The “forward pass” refers to the process of computing the values of the output layer from the input data. It traverses all neurons from the first layer to the last.
The “backward pass” then refers to the process of computing the changes to the weights (the learning itself, de facto), using the gradient descent algorithm (or a similar one). The computation starts at the last layer and moves backward to the first.
Together, one forward pass and one backward pass make up one “iteration”.
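One iteration can be sketched for a single neuron with a sigmoid activation, trained on one example by gradient descent; the weight, input, target, and learning rate below are illustrative assumptions.

```python
import math

w, b = 0.5, 0.0          # weight and bias (assumed starting values)
x, target = 1.0, 1.0     # one training example
lr = 0.1                 # learning rate

# Forward pass: from the input to the output and the loss.
z = w * x + b
y = 1.0 / (1.0 + math.exp(-z))   # sigmoid activation
loss = (y - target) ** 2

# Backward pass: chain rule, from the loss back to the weights.
dloss_dy = 2.0 * (y - target)
dy_dz = y * (1.0 - y)            # derivative of the sigmoid
w -= lr * dloss_dy * dy_dz * x   # dz/dw = x
b -= lr * dloss_dy * dy_dz       # dz/db = 1

# One forward pass + one backward pass = one iteration.
```

Here the output is below the target, so the gradient update nudges both the weight and the bias upward.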
Different existing activation functions
Here are some activation functions:
Binary activation function
Formula: f(x) = 0 if x < 0; f(x) = 1 if x ≥ 0
Example:
When you toss a coin, you only have two options: heads or tails. That is a binary function.
Graph:
Backward: False
Forward: True
Multi-class: False
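A one-line Python sketch of the binary step function (threshold at 0, a common convention):

```python
def binary_step(x):
    # Outputs 1 if the input crosses the threshold (0), else 0.
    return 1 if x >= 0 else 0

print(binary_step(-2.0))  # 0
print(binary_step(3.5))   # 1
```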
Linear activation function
Formula: f(x) = ax
Example:
Graph:
Backward: False (because the derivative is constant)
Forward: True
Multi-class: True
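A sketch of the linear activation, which makes the constant-derivative limitation visible (the slope `a` is an assumed parameter):

```python
def linear(x, a=1.0):
    # Output is proportional to the input. The derivative is the
    # constant a, independent of x, so backpropagation gets no
    # information about the input from this function's gradient.
    return a * x

def linear_derivative(x, a=1.0):
    return a  # same value for every input x
```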
Non-linear activation functions
Non-linear activation functions solve the following limitations of linear activation functions:
- They allow backpropagation because now the derivative function would be related to the input, and it’s possible to go back and understand which weights in the input neurons can provide a better prediction.
- They allow the stacking of multiple layers of neurons as the output would now be a non-linear combination of input passed through multiple layers. Any output can be represented as a functional computation in a neural network.
- ReLU activation function
Formula: f(x) = max(0, x)
Example:
Take the example of a car. If you accelerate steadily (without jolts, and without pressing the accelerator excessively), the vehicle keeps accelerating along a linear curve, keeping in mind that its speed can never be negative.
Graph:
Backward: True
Forward: True
Multi-class: True (more efficient than the others)
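A one-line Python sketch of ReLU, mirroring the car example (negative inputs are clipped to 0, just as speed cannot be negative):

```python
def relu(x):
    # Passes positive inputs through unchanged, clips negatives to 0.
    return max(0.0, x)

print(relu(-5.0))  # 0.0
print(relu(2.5))   # 2.5
```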
- Sigmoid (logistic) activation function
Formula: f(x) = 1 / (1 + e^(-x))
Graph:
Backward: True
Forward: True
Multi-class: True (the more labels there are, the less precise it becomes, because the computations grow more complex)
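A sketch of the sigmoid in Python, showing how it squashes any real input into the (0, 1) range:

```python
import math

def sigmoid(x):
    # Maps any real number into the open interval (0, 1),
    # with sigmoid(0) = 0.5.
    return 1.0 / (1.0 + math.exp(-x))

print(sigmoid(0.0))   # 0.5
```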
- Tanh (or hyperbolic) activation function
Formula: f(x) = (e^x − e^(−x)) / (e^x + e^(−x))
Graph:
Backward: True
Forward: True
Multi-class: True
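A sketch of tanh from its formula; unlike the sigmoid, its output range (−1, 1) is zero-centered:

```python
import math

def tanh(x):
    # Squashes the input into (-1, 1), with tanh(0) = 0.
    return (math.exp(x) - math.exp(-x)) / (math.exp(x) + math.exp(-x))

print(tanh(0.0))   # 0.0
```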
- SoftMax activation function
Formula: softmax(z)_i = e^(z_i) / Σ_j e^(z_j)
Backward: True
Forward: True
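A sketch of SoftMax in plain Python: it turns a vector of scores into probabilities that sum to 1 (subtracting the maximum score is a standard numerical-stability trick, not part of the mathematical definition):

```python
import math

def softmax(z):
    # Converts a list of scores into a probability distribution.
    m = max(z)                               # stability shift
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([1.0, 2.0, 3.0])
print(probs)       # largest score gets the largest probability
print(sum(probs))  # sums to 1 (up to floating-point rounding)
```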