
An Intro to Neural Network Initialization With Keras

This article provides a short tutorial on how you can use initialization methods for Neural Networks in Keras, with a Google Colab to help you follow along.

In this article, we'll walk through how you can use various initialization methods in Neural Networks to help make your model perform better.
For a quick review of the various weight initialization methods and the motivations for preferring certain ones, refer to this review article:


Lastly, before jumping in, if you'd like to follow along with this piece in a Colab with executable code, you can find that right here:


The Code For Applying Initialization Techniques In Keras

Keras makes it extremely easy to use different initialization techniques in various Keras layers, such as a Long Short-Term Memory (LSTM) layer or a Dense layer, with the tf.keras API. Each layer accepts a kernel_initializer argument, which takes a string identifier for quick prototyping.
The built-in identifiers include:
  • glorot_normal and glorot_uniform: These select the Glorot normal/uniform initializer, also called the Xavier normal/uniform initializer.
  • he_normal and he_uniform: These select the He normal/uniform initializer.
  • lecun_normal and lecun_uniform: These select the LeCun normal/uniform initializer.
  • random_normal and random_uniform: These select the random normal/uniform initializer.
  • zeros: This selects the zeros initializer.
  • identity: This selects the identity initializer.
NOTE: For a deeper understanding of which initializer to use, please refer to the aforementioned article.
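The string identifiers above use each initializer's default parameters. If you need control over settings such as the random seed or the standard deviation of the sampling distribution, you can instead pass an initializer object from tf.keras.initializers. Here's a minimal sketch; the layer sizes and parameter values are placeholders, not recommendations:

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Initializer objects expose parameters the string identifiers don't,
# e.g. a seed for reproducibility or a custom standard deviation.
model = keras.Sequential([
    layers.Dense(64, activation='relu',
                 kernel_initializer=tf.keras.initializers.HeNormal(seed=42)),
    layers.Dense(10,
                 kernel_initializer=tf.keras.initializers.RandomNormal(mean=0.0, stddev=0.05)),
])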
Let's see how we can use these in practice! For this example, we'll work with a Recurrent Neural Network (RNN).
from tensorflow import keras
from tensorflow.keras import layers
from wandb.keras import WandbCallback

model = keras.Sequential([
    layers.LSTM(..., kernel_initializer='glorot_uniform'),  # <- Substitute any identifier here
    # ...
])

model.compile(...)

model.fit(
    x_train, y_train, ...,
    callbacks=[WandbCallback()]  # Logs training metrics to Weights & Biases
)
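To generate runs you can compare side by side, one approach is to train the same model once per initializer, starting a separate W&B run for each. The sketch below assumes x_train and y_train are already loaded; the project name, layer sizes, and input shape are placeholders for your own setup:

import wandb
from wandb.keras import WandbCallback
from tensorflow import keras
from tensorflow.keras import layers

for init in ['glorot_uniform', 'he_uniform', 'lecun_uniform']:
    # One W&B run per initializer, so the panels can overlay them.
    wandb.init(project='init-comparison', name=init, reinit=True)

    model = keras.Sequential([
        layers.LSTM(64, kernel_initializer=init, input_shape=(28, 28)),
        layers.Dense(10, activation='softmax'),
    ])
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    model.fit(x_train, y_train,
              validation_split=0.2,
              epochs=5,
              callbacks=[WandbCallback()])
    wandb.finish()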

The Results

Now that you've seen how to use various initialization methods, let's see how we can use the Weights & Biases Keras Callback to easily visualize and compare them using Panels. For example, here's a quick comparison of Glorot Uniform, He Uniform, and LeCun Uniform, which you'll find linked in the Colab above:

[W&B panel: validation metrics for the 3 runs — glorot_uniform, he_uniform, and lecun_uniform]
As we can see from the plots, the initializers perform about the same, apart from minor early differences in validation accuracy. To improve the metrics further, you can try hyperparameter tuning, for example by changing the batch size or the number of units.
Weights & Biases Sweeps makes this incredibly easy by automatically running your pipeline using an agent. For more details, please refer to our Sweeps Quickstart Guide.
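As a rough sketch of what that looks like: you define a sweep configuration, register it, and let an agent run your training function. The parameter names and values below are illustrative, and train is assumed to be a function that calls wandb.init(), reads hyperparameters from wandb.config, and fits the model with the WandbCallback:

import wandb

# Illustrative config: random search over batch size and LSTM units,
# maximizing validation accuracy.
sweep_config = {
    'method': 'random',
    'metric': {'name': 'val_accuracy', 'goal': 'maximize'},
    'parameters': {
        'batch_size': {'values': [32, 64, 128]},
        'units': {'values': [32, 64, 128]},
    },
}

sweep_id = wandb.sweep(sweep_config, project='init-comparison')
wandb.agent(sweep_id, function=train, count=9)  # train() is your training function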
If you'd like to try this yourself, here's the Colab to do so:



Summary

In this article, you saw how to implement various kernel initialization methods in neural networks using the Keras framework, and how Weights & Biases lets you easily compare the various types of kernel initializers. To see the full suite of W&B features, please check out this short 5-minute guide.
If you want more reports covering the math and "from-scratch" code implementations, let us know in the comments below or on our forum ✨!
Check out these other reports on Fully Connected covering other fundamental development topics like GPU Utilization and Saving Models.
