LSTM RNN in Keras: Examples of One-to-Many, Many-to-One & Many-to-Many
In this report, I explain long short-term memory (LSTM) recurrent neural networks (RNNs) and how to build them with Keras, covering the One-to-Many, Many-to-One, and Many-to-Many modes.

There are principally four modes in which a recurrent neural network (RNN) can be run: One-to-One, One-to-Many, Many-to-One, and Many-to-Many.
One-to-One is straightforward enough, so let's look at the others.
LSTMs can be used for a multitude of deep learning tasks using these modes. We will go through each mode along with its use case and a minimal code snippet in Keras.
Try the Experiments in Google Colab
What Are One-to-Many Sequence Problems?
One-to-many sequence problems are sequence problems where the input data has one time-step, and the output contains a vector of multiple values or multiple time-steps. Thus, we have a single input and a sequence of outputs.
A typical example is image captioning, where the description of an image is generated. Check out this amazing "Generate Meaningful Captions for Images with Attention Models" report by Rajesh Shreedhar Bhat and Souradip Chakraborty to learn more.
An Example Of A One-to-Many LSTM Model In Keras
We have created a toy dataset, shown in the image below. Each input is a single number, while the output is the sequence of the next two numbers after it.

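The dataset construction itself lives in the linked Colab; here is a minimal sketch of how such data might be built (the sample count of 15 and the NumPy reshaping are assumptions):

import numpy as np

# Hypothetical reconstruction of the toy dataset: each input is one number,
# each target is the next two numbers. LSTM inputs need the shape
# (samples, time steps, features).
X = np.arange(1, 16)                       # assumed sample count of 15
Y = np.array([[i + 1, i + 2] for i in X])  # shape (15, 2)
X = X.reshape((len(X), 1, 1))              # shape (15, 1, 1)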
Let us train it with a vanilla LSTM. The loss for the training and validation data is shown in the plots.
import wandb
from wandb.keras import WandbCallback
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

model = Sequential()
model.add(LSTM(50, activation='relu', input_shape=(1, 1)))
model.add(Dense(2))  # two units: one per output time step
model.compile(optimizer='adam', loss='mse')

wandb.init(entity='ayush-thakur', project='dl-question-bank')
model.fit(X, Y, epochs=1000, validation_split=0.2, batch_size=3, callbacks=[WandbCallback()])
When predicting on test data with the input 10, we expect the model to generate the sequence [11, 12]. The model predicted [[11.00657 12.138181]], which is close to the expected values.
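In code, the prediction step might look like this (a sketch; the exact call is in the Colab):

import numpy as np

# Inference: predict() expects the (samples, time steps, features)
# shape used in training.
test_input = np.array([10]).reshape((1, 1, 1))
print(model.predict(test_input))  # -> approximately [[11.0, 12.1]]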
Try the Experiments in Google Colab
What Are Many-to-One Sequence Problems?
In many-to-one sequence problems, we have a sequence of data as input, and we have to predict a single output. Sentiment analysis or text classification is one such use case.
An Example Of A Many-to-One LSTM Model In Keras
We have created a toy dataset, as shown in the image. The input has 15 samples with three time steps each, and the output is the sum of the values at those time steps.

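A plausible way to build this dataset (the use of consecutive integers here is an assumption; the exact code is in the Colab):

import numpy as np

# 15 samples of three consecutive integers; the target is their sum.
X = np.array([x + 1 for x in range(45)]).reshape((15, 3, 1))
Y = X.sum(axis=1)  # shape (15, 1)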
Let us train it with a vanilla LSTM. The loss for the training and validation data is shown in the plots.
import wandb
from wandb.keras import WandbCallback
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

tf.keras.backend.clear_session()

model = Sequential()
model.add(LSTM(50, activation='relu', input_shape=(3, 1)))
model.add(Dense(1))  # a single unit: one value, the sum
model.compile(optimizer='adam', loss='mse')

wandb.init(entity='ayush-thakur', project='dl-question-bank')
history = model.fit(X, Y, epochs=1000, validation_split=0.2, verbose=1, callbacks=[WandbCallback()])
When predicting on test data, the input is a sequence of three time steps: [50, 51, 52]. The expected output is the sum of the values, 153. The model predicted [[152.9253]], which is remarkably close to the expected value.
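The prediction step might look like this (a sketch):

import numpy as np

# Inference for the many-to-one model.
test_input = np.array([50, 51, 52]).reshape((1, 3, 1))
print(model.predict(test_input))  # -> approximately [[152.9]]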
Try the Experiments in Google Colab
What Are Many-to-Many Sequence Problems?
Many-to-many sequence learning can be used for machine translation, where the input sequence is in one language and the output sequence is in another. It can be used for video classification as well, where the input sequence is the feature representation of each frame of the video at different time steps.
An encoder-decoder network is commonly used for many-to-many sequence tasks. Here, encoder-decoder is just a fancy name for a neural architecture with two LSTM layers: an encoder that compresses the input sequence into a fixed-length vector, and a decoder that expands that vector into the output sequence.
An Example Of A Many-to-Many LSTM Model In Keras
In this toy experiment, we have created the dataset shown in the image below. The input has 20 samples with three time steps each, while the output is the next three consecutive multiples of 5.

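A plausible construction for this dataset (assumed; the exact code is in the Colab):

import numpy as np

# 20 samples of three consecutive multiples of 5; each target is the
# next three multiples of 5. Both X and Y have shape (20, 3, 1).
seq = np.arange(5, 305, 5)          # 5, 10, ..., 300
X = seq.reshape((20, 3, 1))
Y = (seq + 15).reshape((20, 3, 1))  # each value shifted three steps ahead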
Let us train it with a vanilla encoder-decoder architecture. The loss for the training and validation data is shown in the plots.
import wandb
from wandb.keras import WandbCallback
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, RepeatVector, TimeDistributed

model = Sequential()
# encoder layer: compresses the input sequence into a single vector
model.add(LSTM(100, activation='relu', input_shape=(3, 1)))
# repeat vector: copies that vector once per output time step
model.add(RepeatVector(3))
# decoder layer: unrolls the repeated vector back into a sequence
model.add(LSTM(100, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(1)))
model.compile(optimizer='adam', loss='mse')
model.summary()

wandb.init(entity='ayush-thakur', project='dl-question-bank')
history = model.fit(X, Y, epochs=1000, validation_split=0.2, verbose=1, batch_size=3, callbacks=[WandbCallback()])
When predicting on test data, the input is a sequence of three time steps: [300, 305, 310]. The expected output is the sequence of the next three consecutive multiples of five, [315, 320, 325]. The model predicted [[[315.29865], [321.0397], [327.0003]]], which is close to the expected values.
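The prediction step might look like this (a sketch):

import numpy as np

# Inference for the encoder-decoder model.
test_input = np.array([300, 305, 310]).reshape((1, 3, 1))
print(model.predict(test_input))  # -> approximately [[[315.3], [321.0], [327.0]]]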
Try Weights & Biases
Weights & Biases helps you keep track of your machine learning experiments. Try our tool to log hyperparameters and output metrics from your runs, then visualize and compare results and quickly share findings with your colleagues.
Get started in 5 minutes, or run 2 quick experiments on Replit to see how W&B can help organise your work. Follow the instructions below:
Instructions:
- Click the green "Run" button below (the first time you click Run, Replit will take approx 30-45 seconds to allocate a machine)
- Follow the prompts in the terminal window (the bottom right pane below)
- You can resize the terminal window (bottom right) for a larger view
Further Reading
These are some of the resources that I found relevant for my own understanding of these concepts.
The code snippets were adapted from the Solving Sequence Problems with LSTM in Keras blog post by Usman Malik.
Comments
Thanks for the nice article. How can I put an LSTM layer between two dense layers?
Specifically, the output of the four dense layers should enter the LSTM layer.
Suppose I have four dense layers as follows, each dense layer for a specific time step. Then these four sets of features should enter an LSTM layer with 128 units. Then another dense layer is used for classification. I do not know how I should connect the dense layers to the LSTM layer. The output shape of each dense layer is (None, 128). It is a many-to-many LSTM.
o1= Dense(128, activation = activation, kernel_regularizer='l2')(o1)
o2 = Dense(128, activation = activation, kernel_regularizer='l2')(o2)
o3 = Dense(128, activation = activation, kernel_regularizer='l2')(o3)
o4 = Dense(128, activation = activation, kernel_regularizer='l2')(o4)
lstm = tensorflow.keras.layers.LSTM(128, input_shape=(????))
outputlstm = lstm(????)
output = Dense(2, activation='softmax')(outputlstm)
Can you please help me?
Could you change the loss and validation loss graphs to logarithmic scale?
Thank you for the post. How is video classification an example of a many-to-many RNN? My understanding is that the frames of the video form the input sequence, and a single classification occurs after the sequence ends. So it seems to fit the many-to-one mode.