By pcko1

2019-03-14 20:22:32

I have 2 Keras submodels (model_1, model_2) from which I build my full model with keras.models.Model() by stacking them logically in "series". By this I mean that model_2 accepts the output of model_1 plus an extra input tensor, and the output of model_2 is the output of my full model. The full model is created successfully and I am also able to compile, train and predict with it.

However, I want to parallelize the training of model by running it on 2 GPUs, so I use multi_gpu_model(), which fails with the error:

AssertionError: Could not compute output Tensor("model_2/Dense_Decoder/truediv:0", shape=(?, 33, 22), dtype=float32)

I have tried parallelizing the two submodels individually using multi_gpu_model(model_1, gpus=2) and multi_gpu_model(model_2, gpus=2), and both succeed. The problem appears only with the full model.

I am using Tensorflow 1.12.0 and Keras 2.2.4. A snippet that demonstrates the problem (at least on my machine) is:

from keras.layers import Input, Dense, TimeDistributed, BatchNormalization
from keras.layers import CuDNNLSTM as LSTM
from keras.models import Model
from keras.utils import multi_gpu_model

dec_layers = 2
codelayer_dim = 11
bn_momentum = 0.9
lstm_dim = 128
td_dense_dim = 0
output_dims = 22
dec_input_shape = [33, 44]

latent_input = Input(shape=(codelayer_dim,), name="Latent_Input")

# Initialize list of state tensors for the decoder
decoder_state_list = []

for dec_layer in range(dec_layers):
    # The tensors for the initial states of the decoder
    name = "Dense_h_" + str(dec_layer)
    h_decoder = Dense(lstm_dim, activation="relu", name=name)(latent_input)

    name = "BN_h_" + str(dec_layer)
    decoder_state_list.append(BatchNormalization(momentum=bn_momentum, name=name)(h_decoder))

    name = "Dense_c_" + str(dec_layer)
    c_decoder = Dense(lstm_dim, activation="relu", name=name)(latent_input)

    name = "BN_c_" + str(dec_layer)
    decoder_state_list.append(BatchNormalization(momentum=bn_momentum, name=name)(c_decoder))

# Define model_1
model_1 = Model(latent_input, decoder_state_list)

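# Collect the input tensors of model_2
inputs = []

decoder_inputs = Input(shape=dec_input_shape, name="Decoder_Inputs")
inputs.append(decoder_inputs)

xo = decoder_inputs

for dec_layer in range(dec_layers):
    name = "Decoder_State_h_" + str(dec_layer)
    state_h = Input(shape=[lstm_dim], name=name)
    inputs.append(state_h)

    name = "Decoder_State_c_" + str(dec_layer)
    state_c = Input(shape=[lstm_dim], name=name)
    inputs.append(state_c)

    # RNN layer; return_sequences=True keeps the time axis for the next layer
    decoder_lstm = LSTM(lstm_dim,
                        return_sequences=True,
                        name="Decoder_LSTM_" + str(dec_layer))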

    xo = decoder_lstm(xo, initial_state=[state_h, state_c])
    xo = BatchNormalization(momentum=bn_momentum, name="BN_Decoder_" + str(dec_layer))(xo)
    if td_dense_dim > 0: # Squeeze LSTM interconnections using Dense layers
        xo = TimeDistributed(Dense(td_dense_dim), name="Time_Distributed_" + str(dec_layer))(xo)

# Final Dense layer to return probabilities
outputs = Dense(output_dims, activation='softmax', name="Dense_Decoder")(xo)

# Define model_2
model_2 = Model(inputs=inputs, outputs=[outputs])

latent_input = Input(shape=(codelayer_dim,), name="Latent_Input")
decoder_inputs = Input(shape=dec_input_shape, name="Decoder_Inputs")

# Stack the two models
# Propagate tensors through 1st model
x = model_1(latent_input)
# Insert decoder_inputs as the first input of the 2nd model
x.insert(0, decoder_inputs)
# Propagate tensors through 2nd model
x = model_2(x)

# Define full model
model = Model(inputs=[latent_input, decoder_inputs], outputs=[x])

# Parallelize the model
parallel_model = multi_gpu_model(model, gpus=2)

Thanks a lot for any help / tips.


@pcko1 2019-03-15 13:45:53

I found the solution to my problem, although I am not sure how to justify it.

The problem is caused by x.insert(0, decoder_inputs), which I replaced with x = [decoder_inputs] + x. Both seem to result in the same list of tensors; however, multi_gpu_model complains in the first case.
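The difference between the two forms can be reproduced with plain Python lists, without Keras or a GPU. This is only a sketch: the strings stand in for tensors, and `returned` and `alias` are hypothetical names for the list that model_1(latent_input) hands back and for a second reference to it that some other code might still hold.

```python
# Strings stand in for tensors; `returned` plays the role of the list
# returned by model_1(latent_input) (hypothetical stand-in, not Keras code).
returned = ["h_0", "c_0", "h_1", "c_1"]
alias = returned  # a second reference to the very same list object

# Variant 1: concatenation builds a new list and leaves `returned` untouched
x_new = ["decoder_inputs"] + returned
original_intact = (alias == ["h_0", "c_0", "h_1", "c_1"])

# Variant 2: insert() changes the existing list in place
x_mut = returned
x_mut.insert(0, "decoder_inputs")

same_contents = (x_new == x_mut)   # both variants hold the same items
same_object = (x_mut is alias)     # but insert() modified the shared list
alias_changed = (alias[0] == "decoder_inputs")

print(same_contents, same_object, alias_changed)
```

So the two lists compare equal, which matches the observation above, yet only the insert() variant alters an object that other code may still be referring to.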

@DomJack 2019-04-08 10:19:58

Just a little explanation: x.insert mutates x, while x = [decoder_inputs] + x creates a new list with the same result. I'm surprised keras' Model.__call__ returns a list which isn't safe to mutate - seems a bit sloppy, but understandable. I'm of the opinion that if you don't want people mutating your return values, return a tuple instead.
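To illustrate the point about mutable return values, here is a toy sketch. `ToyModel` and its `recorded_outputs` attribute are hypothetical and are not Keras classes or attributes; the sketch simply assumes that the object keeps a reference to the same container it returns. Whether Keras 2.2.4 does exactly this internally is an assumption here, not something confirmed in this thread.

```python
class ToyModel:
    """Hypothetical stand-in for a model; this is not Keras code."""

    def __init__(self, as_tuple=False):
        self.as_tuple = as_tuple
        self.recorded_outputs = None  # what the object remembers about its last call

    def __call__(self, inp):
        outputs = [inp + "_h", inp + "_c"]
        if self.as_tuple:
            outputs = tuple(outputs)
        # The same object is both stored and returned to the caller
        self.recorded_outputs = outputs
        return outputs


# Returning a list: the caller can silently corrupt the stored record
unsafe = ToyModel()
x = unsafe("latent")
x.insert(0, "decoder_inputs")        # also changes unsafe.recorded_outputs
corrupted = unsafe.recorded_outputs  # now holds 3 entries instead of 2

# Returning a tuple: in-place mutation is not possible
safe = ToyModel(as_tuple=True)
y = safe("latent")
try:
    y.insert(0, "decoder_inputs")    # tuples have no insert() method
    mutation_blocked = False
except AttributeError:
    mutation_blocked = True
y = ("decoder_inputs",) + y          # builds a new tuple; the record is untouched
```

With the tuple variant the caller is pushed toward the concatenation form, which is the same shape as the fix x = [decoder_inputs] + x.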
