Regularization in Neural Networks

am2701
Jun 6, 2022

L1 Regularization

Mechanics

(Figures not reproduced: the simple and detailed L1 regularization formulas.)
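
For reference, a common way to write the L1 penalty (notation introduced here: J(θ) is the original cost function, λ the regularization factor, θ_i the model weights) is:

$$J_{\text{reg}}(\boldsymbol{\theta}) = J(\boldsymbol{\theta}) + \lambda \sum_{i} \lvert \theta_i \rvert$$
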
import tensorflow as tf

# Dense layer whose weights are penalized with an L1 term (regularization factor 0.01)
layer = tf.keras.layers.Dense(
    100, activation="elu",
    kernel_initializer="he_normal",
    kernel_regularizer=tf.keras.regularizers.l1(0.01))

Pros

  • It tends to be more precise than its L2 counterpart.
  • It only helps against overfitting on large datasets.

Cons

  • It does not usually prevent overfitting.
  • It is slower than its L2 counterpart.

L2 Regularization

Mechanics

(Figures not reproduced: the simple and detailed L2 regularization formulas.)
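
For reference, the L2 penalty is commonly written as follows (same notation as above; the 1/2 factor is a frequent convention, while Keras's l2 regularizer simply multiplies the sum of squared weights by the factor):

$$J_{\text{reg}}(\boldsymbol{\theta}) = J(\boldsymbol{\theta}) + \frac{\lambda}{2} \sum_{i} \theta_i^{2}$$
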
# Dense layer whose weights are penalized with an L2 term (regularization factor 0.01)
layer = tf.keras.layers.Dense(
    100, activation="elu",
    kernel_initializer="he_normal",
    kernel_regularizer=tf.keras.regularizers.l2(0.01))

Pros

  • It is faster than its L1 counterpart.
  • It helps prevent overfitting.

Cons

  • It tends to be less precise than its L1 counterpart.

If you would like to use both of them at once, you can use the tf.keras.regularizers.l1_l2() function.
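
A minimal sketch, assuming the same layer configuration as above and arbitrary factors of 0.01 for both penalties:

# Combine L1 and L2 penalties on the same layer (elastic-net style)
layer = tf.keras.layers.Dense(
    100, activation="elu",
    kernel_initializer="he_normal",
    kernel_regularizer=tf.keras.regularizers.l1_l2(l1=0.01, l2=0.01))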

Dropout

Mechanics

The principle of dropout is to randomly "switch off" neurons in any layer (except the output layer) during training.

The dropout rate (p) is generally between 20 and 30 percent for recurrent neural networks, and can go up to 40 to 50 percent for convolutional neural networks.

(Figure not reproduced: the Bernoulli formulation of dropout.)
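
For reference, in the usual Bernoulli formulation (notation introduced here: p is the drop probability, y_i a neuron's output, r_i the random mask), each neuron is kept with probability 1 − p:

$$r_i \sim \mathrm{Bernoulli}(1 - p), \qquad \tilde{y}_i = r_i \, y_i$$

In Keras, dropout is applied with a dedicated layer. A minimal sketch, assuming a drop rate of 0.3 and an arbitrary small architecture:

# 30% of the previous layer's outputs are randomly switched off at each
# training step; the Dropout layer is inactive at inference time.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(100, activation="elu", kernel_initializer="he_normal"),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(10, activation="softmax"),
])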

Pros

  • The network becomes more robust to fluctuations in the input.

Cons

  • Overfitting can be seen when a large network is used with a small dataset.
  • Stronger excitation of the remaining neurons (the weights increase).

Data Augmentation

Mechanics

The principle of data augmentation is to artificially augment the dataset by generating many realistic variants of each training instance.

The generated data must be as close as possible to realistic data while remaining different, for example a shift (to the right, down, etc.).

In Keras, such random shifts and flips can be applied with preprocessing layers such as tf.keras.layers.RandomTranslation and tf.keras.layers.RandomFlip.
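
A minimal sketch, assuming TensorFlow 2.6+ (where these preprocessing layers are built in) and arbitrary augmentation factors:

# Each training image is randomly flipped, rotated and shifted;
# these layers are only active during training.
data_augmentation = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomRotation(0.1),
    tf.keras.layers.RandomTranslation(0.1, 0.1),
])

# The augmentation block can then be placed at the start of a model:
model = tf.keras.Sequential([
    data_augmentation,
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(100, activation="elu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])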

Pros

  • Allow to train the model with less usable data

Cons

  • Increase training time

Early Stopping

Mechanics

The early stopping principle for gradient descent consists in interrupting training as soon as the validation error reaches its minimum, i.e. when it stops decreasing and starts to rise again.

Typically, the validation error decreases until the early stopping point before growing again. If we do not apply the early stopping principle, the model begins to overfit from this point on.
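
In Keras this is typically done with a callback. A minimal sketch, assuming hypothetical training data (X_train, y_train, X_valid, y_valid) and a patience of 5 epochs:

# Stop training when the validation loss has not improved for 5 epochs,
# and roll back to the best weights observed so far.
early_stopping = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=5, restore_best_weights=True)

# model.fit(X_train, y_train, epochs=100,
#           validation_data=(X_valid, y_valid),
#           callbacks=[early_stopping])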

Pros

  • It helps prevent overfitting.

Cons

  • If the stopping criterion is not well chosen, this regularization will either stop the learning phase too early or stop it too late and be useless.
