By nidomo

2019-03-07 13:57:38

I trained a Many-to-Many sequence model in Keras with return_sequences=True and TimeDistributed wrapper on the last Dense layer:

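from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dense, TimeDistributed

model = Sequential()
model.add(Embedding(input_dim=vocab_size, output_dim=50))
model.add(LSTM(100, return_sequences=True))  # one output per timestep
model.add(TimeDistributed(Dense(vocab_size, activation='softmax')))
# train...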


So during training the loss is calculated over all hidden states (at every timestep). But for inference I only need the output at the last timestep. So I load the weights into a Many-to-One sequence model for inference, without the TimeDistributed wrapper, and I set return_sequences=False to get only the last output of the LSTM layer:

inference_model = Sequential()
inference_model.add(Embedding(input_dim=vocab_size, output_dim=50))
inference_model.add(LSTM(100, return_sequences=False))
inference_model.add(Dense(vocab_size, activation='softmax'))


When I test my inference model on a sequence of length 20, I expect to get a prediction of shape (vocab_size,), but inference_model.predict(...) still returns predictions for every timestep: a tensor of shape (20, vocab_size).


@today 2019-03-07 14:31:38

If, for whatever reason, you need only the last timestep during inference, you can build a new model which applies the trained model on the input and returns the last timestep as its output using the Lambda layer:

from keras.models import Model
from keras.layers import Input, Lambda

inp = Input(shape=put_the_input_shape_here)
x = model(inp) # apply trained model on the input
out = Lambda(lambda x: x[:,-1])(x)

inference_model = Model(inp, out)
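
To see what the x[:,-1] indexing in the Lambda layer does to the shapes, here is a small NumPy-only sketch. The array seq_out is just a random stand-in for what the trained model would return, and the sizes are made up for illustration:

```python
import numpy as np

# Illustrative sizes only (not taken from the question).
batch, timesteps, vocab_size = 2, 20, 7

rng = np.random.default_rng(0)
# Stand-in for the trained model's output: one distribution per timestep.
seq_out = rng.random((batch, timesteps, vocab_size))

# Same indexing as in the Lambda layer: keep the batch axis, take the last timestep.
last = seq_out[:, -1]

print(seq_out.shape)  # (2, 20, 7)
print(last.shape)     # (2, 7)
```

The batch axis is kept, so each input sequence ends up with exactly one vector of size vocab_size.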

Side note: as already stated in this answer, TimeDistributed(Dense(...)) and Dense(...) are equivalent, since a Dense layer is applied on the last dimension of its input tensor. That is why you get the same output shape.
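
The equivalence can be checked by hand with plain NumPy. This is only a sketch of the arithmetic, not Keras code: the dense helper below is a hypothetical stand-in for a Dense layer with the activation left out, and all sizes and weights are made up:

```python
import numpy as np

def dense(x, W, b):
    # Hypothetical stand-in for a Dense layer without activation:
    # the matrix product acts on the last axis of x only.
    return x @ W + b

# Illustrative sizes only.
batch, timesteps, units, vocab_size = 2, 20, 100, 7

rng = np.random.default_rng(0)
x = rng.normal(size=(batch, timesteps, units))  # stand-in for the LSTM output sequence
W = rng.normal(size=(units, vocab_size))
b = rng.normal(size=(vocab_size,))

# Applied to the whole 3D tensor at once (what a plain Dense does).
all_at_once = dense(x, W, b)

# Applied to each timestep separately and stacked back (what TimeDistributed does).
per_step = np.stack([dense(x[:, t], W, b) for t in range(timesteps)], axis=1)

print(all_at_once.shape)                   # (2, 20, 7)
print(np.allclose(all_at_once, per_step))  # True
```

Both routes give the same values and the same (batch, timesteps, vocab_size) shape, so wrapping the layer or not does not change the output shape.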

@nidomo 2019-03-07 15:22:47

Oh. Is there a way to apply TimeDistributed(Dense(...)) to every timestep of the LSTM output?

@today 2019-03-07 17:12:16

@nidomo Well, I am not sure what you mean exactly as it is already applied on all the timesteps.
