An Intro to Neural Network Initialization With Keras
This article provides a short tutorial on how you can use initialization methods for Neural Networks in Keras, with a Google Colab to help you follow along.

In this article, we'll walk through how you can use various initialization methods in Neural Networks to help make your model perform better.
For a quick review of the various weight initialization methods and why certain methods are preferred, refer to this review article:
Lastly, before jumping in, if you'd like to follow along with this piece in a Colab with executable code, you can find that right here:
The Code For Applying Initialization Techniques In Keras
Keras makes it extremely easy to use different initialization techniques in its layers, such as a Long Short-Term Memory (LSTM) layer or a Dense layer, through the tf.keras API. Each layer accepts a kernel_initializer argument, and the API provides string identifiers for the built-in initializers, which allows for quick prototyping.
Some of the most commonly used ones are:
- glorot_normal and glorot_uniform: These arguments allow us to use the Glorot normal/uniform initializer, also called the Xavier normal/uniform initializer.
- he_normal and he_uniform: These arguments allow us to use the He normal/uniform initializer.
- lecun_normal and lecun_uniform: These arguments allow us to use the LeCun normal/uniform initializer.
- random_normal and random_uniform: These arguments allow us to use the Random normal/uniform initializer.
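As a quick, hedged sketch (the layer sizes and the seed value here are placeholders), these identifiers can be passed to a layer's kernel_initializer argument either as plain strings or as configurable initializer objects from tf.keras.initializers:

import tensorflow as tf
from tensorflow.keras import layers

# String identifier: uses the initializer's default settings.
dense_a = layers.Dense(64, activation="relu", kernel_initializer="glorot_uniform")

# Initializer object: lets you configure options such as a fixed seed for reproducibility.
dense_b = layers.Dense(64, activation="relu",
                       kernel_initializer=tf.keras.initializers.GlorotUniform(seed=42))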
NOTE: For a deeper understanding of which initializer to use, please refer to the aforementioned article.
Let's see how we can use these in practice! For this example, we'll work with a Recurrent Neural Network (RNN).
from tensorflow import keras
from tensorflow.keras import layers
from wandb.keras import WandbCallback

model = keras.Sequential([
    layers.LSTM(..., kernel_initializer='glorot_uniform'),  # <- Substitute
    # ...
])

model.compile(...)
model.fit(x_train, y_train, ...,
          callbacks=[WandbCallback()])
The Results
Now that you've seen how to use various initialization methods, let's see how we can use the Weights & Biases Keras Callback to easily visualize and compare them using Panels. For example, here's a quick comparison of Glorot Uniform, He Uniform, and LeCun Uniform, which you'll find linked in the Colab above:
[W&B panel: a run set of 3 runs comparing the Glorot Uniform, He Uniform, and LeCun Uniform initializers]
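As a hedged sketch of how such a comparison can be launched, each initializer gets its own W&B run so the Panels can overlay them; the synthetic data, model architecture, and project name below are placeholders for illustration, not necessarily what the Colab uses:

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers
import wandb
from wandb.keras import WandbCallback

# Synthetic placeholder data: 1,000 sequences, 20 timesteps, 8 features, binary labels.
x_train = np.random.rand(1000, 20, 8).astype("float32")
y_train = np.random.randint(0, 2, size=(1000,))

for init in ["glorot_uniform", "he_uniform", "lecun_uniform"]:
    # One W&B run per initializer; everything else stays the same.
    run = wandb.init(project="keras-initializers", name=init, reinit=True)  # placeholder project name
    model = keras.Sequential([
        layers.LSTM(32, kernel_initializer=init),
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    model.fit(x_train, y_train, batch_size=32, epochs=5,
              validation_split=0.2, callbacks=[WandbCallback()])
    run.finish()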
As we can see from the plots, the initializers perform roughly the same, apart from minor differences in validation accuracy early in training. To further improve the metrics, you can try hyperparameter tuning, for example by changing the batch size or the number of units.
Weights & Biases Sweeps makes this incredibly easy by automatically running your pipeline using an agent. For more details, please refer to our Sweeps Quickstart Guide.
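To give a rough idea of what that could look like, here's a minimal sketch of a random-search sweep over the batch size and the number of LSTM units; the parameter values, project name, and synthetic data are assumptions for illustration, not the configuration used in the Colab:

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers
import wandb
from wandb.keras import WandbCallback

def train():
    # Each sweep trial starts a fresh run and reads its hyperparameters from wandb.config.
    run = wandb.init()
    config = wandb.config
    x = np.random.rand(1000, 20, 8).astype("float32")  # synthetic placeholder data
    y = np.random.randint(0, 2, size=(1000,))
    model = keras.Sequential([
        layers.LSTM(config.units, kernel_initializer="glorot_uniform"),
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    model.fit(x, y, batch_size=config.batch_size, epochs=5,
              validation_split=0.2, callbacks=[WandbCallback()])
    run.finish()

sweep_config = {
    "method": "random",
    "metric": {"name": "val_accuracy", "goal": "maximize"},
    "parameters": {
        "batch_size": {"values": [16, 32, 64]},
        "units": {"values": [32, 64, 128]},
    },
}

sweep_id = wandb.sweep(sweep_config, project="keras-initializers")  # placeholder project name
wandb.agent(sweep_id, function=train, count=10)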
If you'd like to try this yourself, here's the Colab to do so:
Summary
In this article, you saw how you can implement various kernel initialization methods in neural networks using the Keras framework, and how Weights & Biases allows you to easily compare the various types of kernel initializers. To see the full suite of W&B features, please check out this short 5-minute guide.
If you want more reports covering the math and "from-scratch" code implementations, let us know in the comments below or on our forum ✨!
Check out these other reports on Fully Connected covering other fundamental development topics like GPU Utilization and Saving Models.
Recommended Reading
How To Use GPU with PyTorch
A short tutorial on using GPUs for your deep learning models with PyTorch, from checking availability to visualizing usage.
PyTorch Dropout for regularization - tutorial
Learn how to regularize your PyTorch model with Dropout, complete with a code tutorial and interactive visualizations.
How to save and load models in PyTorch
This article is a machine learning tutorial on how to save and load your models in PyTorch using Weights & Biases for version control.
Image Classification Using PyTorch Lightning and Weights & Biases
This article provides a practical introduction on how to use PyTorch Lightning to improve the readability and reproducibility of your PyTorch code.
LSTM RNN in Keras: Examples of One-to-Many, Many-to-One & Many-to-Many
In this report, I explain long short-term memory (LSTM) recurrent neural networks (RNN) and how to build them with Keras, covering one-to-many, many-to-one, and many-to-many architectures.
Using LSTM in PyTorch: A Tutorial With Examples
This article provides a tutorial on how to use Long Short-Term Memory (LSTM) in PyTorch, complete with code examples and interactive visualizations using W&B.