
How to stack multiple LSTMs in Keras?


The Problem

When you try to stack multiple LSTMs in Keras like this:

from keras.models import Sequential
from keras.layers import LSTM

model = Sequential()
model.add(LSTM(100, input_shape=(time_steps, vector_size)))
model.add(LSTM(100))

Keras throws the following exception:

Exception: Input 0 is incompatible with layer lstm_28: expected ndim=3, found ndim=2

The Solution

The solution is to add return_sequences=True to every LSTM layer except the last one, so that each of those layers produces an output tensor with ndim=3 (i.e. batch size, timesteps, hidden state).

Setting this flag to True tells Keras that the LSTM should return its hidden state for every timestep, not just the last one, giving a 3D output. The next LSTM layer can then work on that full sequence.

If the flag is False (the default), the LSTM returns only the output of the last timestep, which is 2D. That output cannot be fed into another LSTM layer, which expects a 3D input.
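
You can check this shape difference directly. The snippet below is a minimal sketch using the functional API with the TensorFlow backend; the sizes (10 timesteps, 8 features, 32 units) are placeholder values chosen purely for illustration:

from keras.layers import Input, LSTM

inputs = Input(shape=(10, 8))  # placeholder: 10 timesteps, 8 features per step

print(LSTM(32, return_sequences=True)(inputs).shape)  # (None, 10, 32) -- 3D, stackable
print(LSTM(32)(inputs).shape)                         # (None, 32) -- 2D, not stackable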

Please see the following example:

# expected input data shape: (batch_size, timesteps, data_dim)
from keras.models import Sequential
from keras.layers import LSTM, Dense

model = Sequential()
model.add(LSTM(32, return_sequences=True,
               input_shape=(timesteps, data_dim)))  # returns a sequence of vectors of dimension 32
model.add(LSTM(32, return_sequences=True))  # returns a sequence of vectors of dimension 32
model.add(LSTM(32))  # returns a single vector of dimension 32
model.add(Dense(10, activation='softmax'))
From the Keras guide on Sequential models (search for "stacked lstm").

The final Dense layer converts the LSTM output into the format the task needs. Here, Dense(10) with a softmax activation produces a probability distribution over 10 output classes.
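
To train such a classifier you would then compile the model; categorical cross-entropy is a typical choice for a 10-way softmax output (an assumption on our part, since the original snippet stops at the model definition):

# assumes one-hot encoded labels; loss and optimizer are illustrative choices
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])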

If you are using the LSTM for time series instead, you should use Dense(1) so that a single numeric output is produced.
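
As a sketch, a stacked LSTM for time-series regression might look like this (the values timesteps=30, data_dim=1, the 32-unit layers, and the mean-squared-error loss are all illustrative assumptions, not from the original post):

from keras.models import Sequential
from keras.layers import LSTM, Dense

timesteps, data_dim = 30, 1  # placeholder values for illustration

model = Sequential()
model.add(LSTM(32, return_sequences=True, input_shape=(timesteps, data_dim)))
model.add(LSTM(32))  # last LSTM layer: return_sequences defaults to False
model.add(Dense(1))  # single numeric output for regression
model.compile(optimizer='adam', loss='mse')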


For more details, check out the Keras guide on training Sequential models.