By Tor


2016-11-08 20:43:45 8 Comments

I'd like to reset (randomize) the weights of all layers in my Keras (deep learning) model. The reason is that I want to be able to train the model several times with different data splits without having to do the (slow) model recompilation every time.

Inspired by this discussion, I'm trying the following code:

# Reset weights
for layer in KModel.layers:
    if hasattr(layer,'init'):
        input_dim = layer.input_shape[1]
        new_weights = layer.init((input_dim, layer.output_dim),name='{}_W'.format(layer.name))
        layer.trainable_weights[0].set_value(new_weights.get_value())

However, it only partly works.

Partly, because I've inspected some layer.get_weights() values, and they seem to change. But when I restart the training, the cost values are much lower than the initial cost values on the first run. It's almost as if I've succeeded in resetting some of the weights, but not all of them.

Any tips on where I'm going wrong would be deeply appreciated. Thx..

6 comments

@Andrew - OpenGeoCode 2019-09-06 17:17:52

To randomly re-initialize the weights of a compiled, untrained model in TF 2.0 (tf.keras):

weights = [glorot_uniform(seed=random.randint(0, 1000))(w.shape) if w.ndim > 1 else w for w in model.get_weights()]

Note the "if w.ndim > 1 else w". You don't want to re-initialize the biases (they stay 0 or 1).
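The one-liner above only builds the new weight list; it still has to be handed back to the model. A minimal sketch of the full round trip, assuming TF 2.x with tensorflow.keras and assuming Glorot uniform is an acceptable initializer for every multi-dimensional weight (the helper name reinitialize_kernels is made up for illustration):

```python
import random

import numpy as np
from tensorflow import keras


def reinitialize_kernels(model):
    """Draw fresh Glorot-uniform values for every weight array with ndim > 1."""
    new_weights = []
    for w in model.get_weights():
        if w.ndim > 1:
            # New initializer with a new seed for each array (assumption:
            # Glorot uniform suits every kernel in the model).
            init = keras.initializers.GlorotUniform(seed=random.randint(0, 2**31 - 1))
            new_weights.append(np.asarray(init(shape=w.shape)))
        else:
            # 1-D arrays (biases and similar) are left as they are.
            new_weights.append(w)
    model.set_weights(new_weights)
```

Calling reinitialize_kernels(model) before each training run should give fresh kernels while leaving every 1-D array (biases, but also things like batch-norm statistics) at its current value.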

@BallpointBen 2018-05-09 15:48:23

If you want to truly re-randomize the weights, and not merely restore the initial weights, you can do the following. The code is slightly different depending on whether you're using TensorFlow or Theano.

from keras.initializers import glorot_uniform  # Or your initializer of choice
import keras.backend as K

initial_weights = model.get_weights()

backend_name = K.backend()
if backend_name == 'tensorflow': 
    k_eval = lambda placeholder: placeholder.eval(session=K.get_session())
elif backend_name == 'theano': 
    k_eval = lambda placeholder: placeholder.eval()
else: 
    raise ValueError("Unsupported backend")

new_weights = [k_eval(glorot_uniform()(w.shape)) for w in initial_weights]

model.set_weights(new_weights)

@guillefix 2018-12-21 22:47:48

Nice and simple solution!

@Bersan 2018-12-31 20:16:56

Cannot evaluate tensor using `eval()`: No default session is registered.

@BallpointBen 2018-12-31 20:55:21

@Bersan See my edit

@Ashot Matevosyan 2018-09-12 10:00:52

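# TensorFlow 1.x session API, used through the standalone Keras backend
import tensorflow as tf
import keras.backend as K
K.get_session().close()                                 # discard the old session and its variable values
K.set_session(tf.Session())                             # give Keras a fresh session
K.get_session().run(tf.global_variables_initializer())  # re-run every variable initializer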

@bendl 2018-12-14 14:18:48

Not quite as portable but works well for tensorflow backend!

@Mendi Barel 2018-08-07 13:08:49

Reset all layers by checking for initializers:

def reset_weights(model):
    session = K.get_session()
    for layer in model.layers: 
        if hasattr(layer, 'kernel_initializer'):
            layer.kernel_initializer.run(session=session)
        if hasattr(layer, 'bias_initializer'):
            layer.bias_initializer.run(session=session)     

@SuperNES 2019-02-15 19:22:02

This is the best approach in my view.

@Xiaohong Deng 2019-03-28 22:01:40

Is it outdated? Now kernel_initializer has no attribute run. In my case kernel_initializer is a VarianceScaling object

@tkchris 2019-07-15 20:05:38

@XiaohongDeng try kernel.initializer.run(session=session) instead. I had the same problem
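The session-based calls in this thread belong to TF 1.x. Under TF 2.x eager execution there is no session to run an initializer op in, so one possible equivalent is to call each layer's initializer again and assign the result. The following is only a sketch under that assumption. It covers layers that expose kernel and bias directly (Dense, Conv2D and similar); nested models and recurrent cells would need extra handling. If an initializer was created with a fixed seed, it reproduces the original values rather than new ones.

```python
from tensorflow import keras


def reset_weights(model):
    """Re-draw kernels and biases from each layer's own initializer settings."""
    for layer in model.layers:
        for name in ("kernel", "bias"):
            var = getattr(layer, name, None)
            init = getattr(layer, name + "_initializer", None)
            if var is None or init is None:
                continue  # the layer has no such weight (e.g. use_bias=False)
            # Rebuild the initializer from its config: re-using the same
            # unseeded instance may return identical values in some versions.
            fresh = init.__class__.from_config(init.get_config())
            var.assign(fresh(shape=var.shape, dtype=var.dtype))


model = keras.Sequential([
    keras.Input(shape=(4,)),
    keras.layers.Dense(3),
    keras.layers.Dense(2, use_bias=False),
])
reset_weights(model)
```

Rebuilding the initializer from get_config() keeps whatever scheme each layer was declared with, instead of forcing a single initializer onto the whole model.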

@ezchx 2017-05-13 20:45:09

Save the initial weights right after compiling the model but before training it:

model.save_weights('model.h5')

and then after training, "reset" the model by reloading the initial weights:

model.load_weights('model.h5')

This gives you an apples-to-apples model for comparing different data sets, and it should be quicker than recompiling the entire model.

@Tor 2017-05-15 13:06:23

I ended up doing something similar. Saving to disk and loading takes a lot of time, so I just keep the weights in a variable: weights = model.get_weights(). I grab the initial weights like this before running the first training. Then, before each subsequent training, I reload the initial weights and run jkleint's shuffle method, as mentioned in the link that I posted. Seems to work smoothly..

@BallpointBen 2018-05-07 21:01:33

For the full code snippet of @Tor's suggestion: weights = model.get_weights(), model.compile(args), model.fit(args), model.set_weights(weights)

@Andrew 2019-07-05 14:15:49

Based on this, I've started making a lambda function when I initialize my model. I build the model, then do something like weights = model.get_weights(); reset_model = lambda model: model.set_weights(weights), that way I can just call reset_model(model) later.
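Put together, the in-memory variant described in these comments might look like the sketch below. It assumes tensorflow.keras; the toy model and the random data are placeholders, not anything from the original posts. Note that get_weights() covers layer weights only, so optimizer state (momentum, Adam moments) is not restored this way.

```python
import numpy as np
from tensorflow import keras

# Placeholder model; any compiled Keras model is handled the same way.
model = keras.Sequential([
    keras.Input(shape=(4,)),
    keras.layers.Dense(8, activation="relu"),
    keras.layers.Dense(1),
])
model.compile(optimizer="sgd", loss="mse")

# Snapshot taken once, before any training: a list of NumPy arrays.
initial_weights = model.get_weights()


def reset_model(m):
    """Put the model back to its pre-training weights."""
    m.set_weights(initial_weights)


# Placeholder data standing in for the different data splits.
x = np.random.rand(32, 4).astype("float32")
y = np.random.rand(32, 1).astype("float32")

for run in range(3):
    reset_model(model)  # every run starts from the same weights
    model.fit(x, y, epochs=1, verbose=0)
```

This restores the same starting point each time; it does not draw new random weights, which is why Tor combines it with a shuffle step.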

@maz 2017-03-08 05:15:07

Try set_weights.

For example:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
from __future__ import print_function
import numpy as np
np.random.seed(1234)
from keras.layers import Input
from keras.layers.convolutional import Convolution2D
from keras.models import Model

print("Building Model...")
inp = Input(shape=(1,None,None))
x   = Convolution2D(1, 3, 3, border_mode='same', init='normal',bias=False)(inp)
output = Convolution2D(1, 3, 3, border_mode='same', init='normal',bias=False)(x)
model_network = Model(input=inp, output=output)

w = np.asarray([ 
    [[[
    [0,0,0],
    [0,2,0],
    [0,0,0]
    ]]]
    ])

for layer_i in range(len(model_network.layers)):
    print (model_network.layers[layer_i])

for layer_i in range(1,len(model_network.layers)):
    model_network.layers[layer_i].set_weights(w)



input_mat = np.asarray([ 
    [[
    [1.,2.,3.,10.],
    [4.,5.,6.,11.],
    [7.,8.,9.,12.]
    ]]
    ])

print("Input:")
print(input_mat)
print("Output:")
print(model_network.predict(input_mat))

w2 = np.asarray([ 
    [[[
    [0,0,0],
    [0,3,0],
    [0,0,0]
    ]]]
    ])


for layer_i in range(1,len(model_network.layers)):
    model_network.layers[layer_i].set_weights(w2)

print("Output:")
print(model_network.predict(input_mat))

Build a model with, say, two convolutional layers:

print("Building Model...")
inp = Input(shape=(1,None,None))
x   = Convolution2D(1, 3, 3, border_mode='same', init='normal',bias=False)(inp)
output = Convolution2D(1, 3, 3, border_mode='same', init='normal',bias=False)(x)
model_network = Model(input=inp, output=output)

Then define your weights (I'm using a simple w, but you could use np.random.uniform or anything like that if you want):

w = np.asarray([ 
    [[[
    [0,0,0],
    [0,2,0],
    [0,0,0]
    ]]]
    ])

Take a peek at the layers inside the model:

for layer_i in range(len(model_network.layers)):
    print (model_network.layers[layer_i])

Set the weights of each convolutional layer (you'll see that the first layer is actually the input layer, which you don't want to change; that's why the range starts from 1, not zero):

for layer_i in range(1,len(model_network.layers)):
    model_network.layers[layer_i].set_weights(w)

Generate some input for your test and predict the output from your model

input_mat = np.asarray([ 
    [[
    [1.,2.,3.,10.],
    [4.,5.,6.,11.],
    [7.,8.,9.,12.]
    ]]
    ])

print("Output:")
print(model_network.predict(input_mat))

You could change it again if you want and check again for the output:

w2 = np.asarray([ 
    [[[
    [0,0,0],
    [0,3,0],
    [0,0,0]
    ]]]
    ])

for layer_i in range(1,len(model_network.layers)):
    model_network.layers[layer_i].set_weights(w2)

print("Output:")
print(model_network.predict(input_mat))

Sample output:

Using Theano backend.
Building Model...
<keras.engine.topology.InputLayer object at 0x7fc0c619fd50>
<keras.layers.convolutional.Convolution2D object at 0x7fc0c6166250>
<keras.layers.convolutional.Convolution2D object at 0x7fc0c6150a10>
Weights after change:
[array([[[[ 0.,  0.,  0.],
         [ 0.,  2.,  0.],
         [ 0.,  0.,  0.]]]], dtype=float32)]
Input:
[[[[  1.   2.   3.  10.]
   [  4.   5.   6.  11.]
   [  7.   8.   9.  12.]]]]
Output:
[[[[  4.   8.  12.  40.]
   [ 16.  20.  24.  44.]
   [ 28.  32.  36.  48.]]]]
Output:
[[[[   9.   18.   27.   90.]
   [  36.   45.   54.   99.]
   [  63.   72.   81.  108.]]]]

From your peek at .layers you can see that the first layer is the input layer and the others are your convolutional layers.
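The walkthrough above uses the Keras 1 API (Convolution2D, border_mode, init, channels-first layout). A rough equivalent for current tensorflow.keras is sketched here, assuming channels-last data and the (height, width, in_channels, out_channels) kernel layout that Conv2D uses there. Note that set_weights takes a list of arrays, one per weight of the layer.

```python
import numpy as np
from tensorflow import keras

# Two 3x3 convolutions without bias, channels-last input of any spatial size.
inp = keras.Input(shape=(None, None, 1))
x = keras.layers.Conv2D(1, 3, padding="same", use_bias=False)(inp)
out = keras.layers.Conv2D(1, 3, padding="same", use_bias=False)(x)
model_network = keras.Model(inp, out)

# Kernel that multiplies the centre pixel by 2 and ignores its neighbours.
kernel = np.zeros((3, 3, 1, 1), dtype="float32")
kernel[1, 1, 0, 0] = 2.0

# Skip layer 0 (the input layer); each conv layer has exactly one weight array.
for layer in model_network.layers[1:]:
    layer.set_weights([kernel])

input_mat = np.array(
    [[1, 2, 3, 10],
     [4, 5, 6, 11],
     [7, 8, 9, 12]], dtype="float32"
).reshape(1, 3, 4, 1)

result = model_network.predict(input_mat, verbose=0)[0, :, :, 0]
print(result)  # two x2 layers in a row, so every value is 4x the input
```

The printed values should match the first "Output:" block in the sample output above.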
