Stable-baselines: How to directly rebuild the model from saved zip file?

Created on 10 Mar 2020 · 11 Comments · Source: hill-a/stable-baselines

I want to rebuild a stable-baselines model in Keras and PyTorch. It seems the model cannot be saved directly to an h5 file, so I want to rebuild it from the zip file saved by this code:

log_dir = "test_save/"
model_dqn.save(log_dir + "pong")

This zip file contains the model structure in parameter_list and the weights in parameters. However, I cannot rebuild the model from these two files with the code below:
```
from keras.models import model_from_json
# load json and create model
json_file = open('parameter_list', 'r')
loaded_model_json = json_file.read()
json_file.close()
loaded_model = model_from_json(loaded_model_json)

loaded_model.load_weights("parameters")
```

Labels: No tech support, duplicate, question

QuXinghuaNTU

All 11 comments

The error is:
```
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
in
4 loaded_model_json = json_file.read()
5 json_file.close()
----> 6 loaded_model = model_from_json(loaded_model_json)
7
8 loaded_model.load_weights("parameters")

~/anaconda3/envs/distillation/lib/python3.7/site-packages/keras/engine/saving.py in model_from_json(json_string, custom_objects)
490 config = json.loads(json_string)
491 from ..layers import deserialize
--> 492 return deserialize(config, custom_objects=custom_objects)
493
494

~/anaconda3/envs/distillation/lib/python3.7/site-packages/keras/layers/__init__.py in deserialize(config, custom_objects)
53 module_objects=globs,
54 custom_objects=custom_objects,
---> 55 printable_module_name='layer')

~/anaconda3/envs/distillation/lib/python3.7/site-packages/keras/utils/generic_utils.py in deserialize_keras_object(identifier, module_objects, custom_objects, printable_module_name)
167 else:
168 raise ValueError('Could not interpret serialized ' +
--> 169 printable_module_name + ': ' + identifier)
170
171

TypeError: can only concatenate str (not "list") to str
```

QuXinghuaNTU on 10 Mar 2020

Seems like a duplicate of #699

araffin on 10 Mar 2020

Just to add here: The model files do not store the architecture of the network (this is only available as a pickled Python function, which creates a TF graph). You have to manually build a corresponding network, load the parameters from the file and assign them correctly.

Note to self: For v3, I should look into a possible common format for saving the model architecture as well, if in any way possible (in a convenient way).
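To make the point above concrete, here is a minimal sketch of reading the saved parameters back as numpy arrays. It assumes the stable-baselines 2.x zip layout, in which `parameter_list` is a JSON list of variable names and `parameters` is a numpy `.npz` payload; that layout is an assumption worth checking against your installed version. If it holds, it would also explain the `TypeError` above, since `model_from_json` expects a Keras model config rather than a list of names. The helper names and the stand-in archive are hypothetical and exist only so the snippet runs on its own.

```python
import io
import json
import os
import tempfile
import zipfile

import numpy as np


def read_saved_parameters(zip_path):
    """Return {variable name: numpy array} from a stable-baselines-style zip."""
    with zipfile.ZipFile(zip_path) as archive:
        # Assumed layout: a JSON list of names plus an .npz blob of arrays.
        names = json.loads(archive.read("parameter_list").decode("utf-8"))
        arrays = np.load(io.BytesIO(archive.read("parameters")))
        return {name: arrays[name] for name in names}


def write_demo_archive(zip_path):
    """Write a tiny stand-in archive with the assumed layout (not a real agent)."""
    params = {
        "model/fc/w:0": np.ones((3, 2), dtype=np.float32),
        "model/fc/b:0": np.zeros(2, dtype=np.float32),
    }
    blob = io.BytesIO()
    np.savez(blob, **params)
    with zipfile.ZipFile(zip_path, "w") as archive:
        archive.writestr("parameter_list", json.dumps(list(params)))
        archive.writestr("parameters", blob.getvalue())


demo_path = os.path.join(tempfile.mkdtemp(), "demo_agent.zip")
write_demo_archive(demo_path)
loaded = read_saved_parameters(demo_path)
for name, array in loaded.items():
    print(name, array.shape)
```

For a real agent, `read_saved_parameters("test_save/pong.zip")` would be the analogous call, assuming the save step wrote a file with that name. The resulting dictionary carries only names and arrays, so the network itself still has to be built by hand.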

Miffyli on 10 Mar 2020

> Note to self: For v3, I should look into a possible common format for saving the model architecture as well, if in any way possible (in a convenient way).

ONNX export ? or tracing using pytorch jit?

araffin on 10 Mar 2020

Thanks, Miffyli. I tried to build a model in Keras and load the weights into it, but it still did not work. Could you please have a brief look at the error and share your opinion?
The code is below:
```
import pickle
from keras import optimizers
from keras import backend as K
from keras.losses import kullback_leibler_divergence,mean_squared_error
from keras.callbacks import ReduceLROnPlateau, EarlyStopping
from keras.models import Sequential
from keras.layers import Dense, Flatten, Conv2D, MaxPooling2D, Dropout, Activation, BatchNormalization
method, game = 'dqn', 'Pong'
with open("trained_agents/{}/{}NoFrameskip-v4.pkl".format(method,game), 'rb') as file:
    pickle_model = pickle.load(file)
weights = pickle_model[-1]  # last element of the pickle holds the parameters

model1 = Sequential()
model1.add(Conv2D(32, (8, 8), strides=(4, 4), input_shape=(84, 84, 4)))
model1.add(Activation('relu'))
model1.add(Conv2D(64, (4, 4), strides=(2, 2)))
model1.add(Activation('relu'))
model1.add(Conv2D(64, (3, 3), strides=(1, 1)))
model1.add(Activation('relu'))
model1.add(Flatten())
model1.add(Dense(512))
model1.add(Activation('relu'))
model1.add(Dense(6))

model1.set_weights(weights)  # raises the ValueError shown below
```

The error is:
```
---------------------------------------------------------------------------
InvalidArgumentError Traceback (most recent call last)
~/anaconda3/envs/distillation/lib/python3.7/site-packages/tensorflow/python/framework/ops.py in _create_c_op(graph, node_def, inputs, control_inputs)
1863 try:
-> 1864 c_op = c_api.TF_FinishOperation(op_desc)
1865 except errors.InvalidArgumentError as e:

InvalidArgumentError: Shapes must be equal rank, but are 4 and 0 for 'Assign' (op: 'Assign') with input shapes: [8,8,4,32], [].

During handling of the above exception, another exception occurred:

ValueError Traceback (most recent call last)
in
23 model1.add(Dense(6))
24
---> 25 model1.set_weights(weights)

~/anaconda3/envs/distillation/lib/python3.7/site-packages/keras/engine/network.py in set_weights(self, weights)
506 tuples.append((sw, w))
507 weights = weights[num_param:]
--> 508 K.batch_set_value(tuples)
509
510 @property

~/anaconda3/envs/distillation/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py in batch_set_value(tuples)
2463 assign_placeholder = tf.placeholder(tf_dtype,
2464 shape=value.shape)
-> 2465 assign_op = x.assign(assign_placeholder)
2466 x._assign_placeholder = assign_placeholder
2467 x._assign_op = assign_op

~/anaconda3/envs/distillation/lib/python3.7/site-packages/tensorflow/python/ops/variables.py in assign(self, value, use_locking, name, read_value)
1950 """
1951 assign = state_ops.assign(self._variable, value, use_locking=use_locking,
-> 1952 name=name)
1953 if read_value:
1954 return assign

~/anaconda3/envs/distillation/lib/python3.7/site-packages/tensorflow/python/ops/state_ops.py in assign(ref, value, validate_shape, use_locking, name)
225 return gen_state_ops.assign(
226 ref, value, use_locking=use_locking, name=name,
--> 227 validate_shape=validate_shape)
228 return ref.assign(value, name=name)
229

~/anaconda3/envs/distillation/lib/python3.7/site-packages/tensorflow/python/ops/gen_state_ops.py in assign(ref, value, validate_shape, use_locking, name)
64 _, _, _op = _op_def_lib._apply_op_helper(
65 "Assign", ref=ref, value=value, validate_shape=validate_shape,
---> 66 use_locking=use_locking, name=name)
67 _result = _op.outputs[:]
68 _inputs_flat = _op.inputs

~/anaconda3/envs/distillation/lib/python3.7/site-packages/tensorflow/python/framework/op_def_library.py in _apply_op_helper(self, op_type_name, name, **keywords)
786 op = g.create_op(op_type_name, inputs, dtypes=None, name=scope,
787 input_types=input_types, attrs=attr_protos,
--> 788 op_def=op_def)
789 return output_structure, op_def.is_stateful, op
790

~/anaconda3/envs/distillation/lib/python3.7/site-packages/tensorflow/python/util/deprecation.py in new_func(*args, **kwargs)
505 'in a future version' if date is None else ('after %s' % date),
506 instructions)
--> 507 return func(*args, **kwargs)
508
509 doc = _add_deprecated_arg_notice_to_docstring(

~/anaconda3/envs/distillation/lib/python3.7/site-packages/tensorflow/python/framework/ops.py in create_op(failed resolving arguments)
3614 input_types=input_types,
3615 original_op=self._default_original_op,
-> 3616 op_def=op_def)
3617 self._create_op_helper(ret, compute_device=compute_device)
3618 return ret

~/anaconda3/envs/distillation/lib/python3.7/site-packages/tensorflow/python/framework/ops.py in __init__(self, node_def, g, inputs, output_types, control_inputs, input_types, original_op, op_def)
2025 op_def, inputs, node_def.attr)
2026 self._c_op = _create_c_op(self._graph, node_def, grouped_inputs,
-> 2027 control_input_ops)
2028
2029 # Initialize self._outputs.

~/anaconda3/envs/distillation/lib/python3.7/site-packages/tensorflow/python/framework/ops.py in _create_c_op(graph, node_def, inputs, control_inputs)
1865 except errors.InvalidArgumentError as e:
1866 # Convert to ValueError for backwards compatibility.
-> 1867 raise ValueError(str(e))
1868
1869 return c_op

ValueError: Shapes must be equal rank, but are 4 and 0 for 'Assign' (op: 'Assign') with input shapes: [8,8,4,32], [].
```
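The message says that a rank-4 kernel of shape [8,8,4,32] is being paired with a rank-0 value, which suggests the first entry of the loaded list is a scalar rather than the first convolution kernel. One hedged way to spot this before calling set_weights is to compare shapes position by position. The helper below is a hypothetical sketch with stand-in arrays; it is not part of Keras or stable-baselines.

```python
import numpy as np


def find_shape_mismatches(expected_shapes, arrays):
    """Return (index, expected, actual) for every position whose shape differs."""
    mismatches = []
    for index, expected in enumerate(expected_shapes):
        # A missing entry is reported with an actual shape of None.
        actual = np.shape(arrays[index]) if index < len(arrays) else None
        if actual != tuple(expected):
            mismatches.append((index, tuple(expected), actual))
    return mismatches


# Stand-in data (assumption): shapes the first Keras conv layer expects,
# and a loaded list that starts with a scalar and keeps a 4-D bias.
expected = [(8, 8, 4, 32), (32,)]
loaded = [np.float32(0.1), np.zeros((8, 8, 4, 32)), np.zeros((1, 32, 1, 1))]

for index, want, got in find_shape_mismatches(expected, loaded):
    print("position", index, "expected", want, "got", got)
```

With a real model, the expected shapes could be taken from `[w.shape for w in model1.get_weights()]`, so that any offset or extra dimension shows up before the assignment is attempted.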

QuXinghuaNTU on 11 Mar 2020

Another way would be to extract the DQN model from the built TF graph and save it independently. I did not check whether that is possible. In that case, we may need to change the source code a lot.

QuXinghuaNTU on 11 Mar 2020

Please take a look at the documentation on exporting models. You have to get the parameters as numpy arrays with get_parameters and map them to the correct Keras layers. I had a mapping like this for going from a Keras model to stable-baselines: each key is a stable-baselines parameter name, and each value is a lambda function that retrieves the corresponding weights from the Keras model.

# Mapping from stable-baselines agent variable
# name to a function that takes in a Keras model
# and returns the matching weights
CNN_PARAMS = {
    "model/cnn1/w:0": lambda model: model.get_layer("conv2d_1").get_weights()[0],
    "model/cnn1/b:0": lambda model: model.get_layer("conv2d_1").get_weights()[1][None, :, None, None],
    "model/cnn2/w:0": lambda model: model.get_layer("conv2d_2").get_weights()[0],
    "model/cnn2/b:0": lambda model: model.get_layer("conv2d_2").get_weights()[1][None, :, None, None],
    "model/cnn3/w:0": lambda model: model.get_layer("conv2d_3").get_weights()[0],
    "model/cnn3/b:0": lambda model: model.get_layer("conv2d_3").get_weights()[1][None, :, None, None],
    "model/cnn_fc1/w:0": lambda model: model.get_layer("dense_1").get_weights()[0],
    "model/cnn_fc1/b:0": lambda model: model.get_layer("dense_1").get_weights()[1]
}
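Going in the other direction (stable-baselines parameters into a Keras layer), the same naming scheme can serve as a lookup. The sketch below assumes the parameter names from the mapping above and assumes conv biases are stored with shape (1, C, 1, 1), as the `[None, :, None, None]` indexing suggests. `keras_weights_for` is a hypothetical helper, and the arrays are stand-ins for the dictionary returned by get_parameters.

```python
import numpy as np


def keras_weights_for(params, layer_prefix):
    """Build the [kernel, bias] list that a Keras layer's set_weights expects."""
    kernel = np.asarray(params[layer_prefix + "/w:0"])
    # Flatten a bias stored as (1, C, 1, 1), or already as (C,), down to (C,).
    bias = np.asarray(params[layer_prefix + "/b:0"]).reshape(-1)
    return [kernel, bias]


# Stand-in parameters (assumption): same names and shapes as the first conv layer.
params = {
    "model/cnn1/w:0": np.zeros((8, 8, 4, 32), dtype=np.float32),
    "model/cnn1/b:0": np.zeros((1, 32, 1, 1), dtype=np.float32),
}

kernel, bias = keras_weights_for(params, "model/cnn1")
print(kernel.shape, bias.shape)
```

With a real Keras model this would be used as `keras_model.get_layer("conv2d_1").set_weights(keras_weights_for(params, "model/cnn1"))`, provided a layer with that name exists.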

> Another way would be to extract the DQN model from the built TF graph and save it independently.

This is planned for v3 with PyTorch backend.

Miffyli on 11 Mar 2020

Thanks for your reply. This mapping seems to ignore the last fully connected layer (i.e., from the 512 hidden nodes to the action outputs). It would be very nice if you could share an example that loads from the pkl file and converts the weights to a Keras model. That way, I think people could also save models to other formats (e.g., the .h5 format mentioned by ChunJyeBehBeh). Of course, this depends on whether your team plans such an extension. Thanks so much for your help anyway :)

QuXinghuaNTU on 12 Mar 2020

Hi Miffyli, I have built a Keras model and loaded the weights from the pkl file. The code for doing so is shown below:

## Load the existing pkl models (trained by stable baselines)
import pickle
method = 'dqn'
game   = 'Pong'
with open("trained_agents/{}/{}NoFrameskip-v4.pkl".format(method,game), 'rb') as file:
    pickle_model = pickle.load(file)
pkl_weights = pickle_model[-1]

#### Build keras model
from keras import optimizers
from keras import backend as K
from keras.losses import kullback_leibler_divergence,mean_squared_error
from keras.callbacks import ReduceLROnPlateau, EarlyStopping
from keras.models import Sequential
from keras.layers import Dense, Flatten, Conv2D, MaxPooling2D, Dropout, Activation, BatchNormalization
import numpy as np

keras_model = Sequential()
keras_model.add(Conv2D(32, (8, 8), strides=(4, 4), input_shape=(84, 84, 4)))
keras_model.add(Activation('relu'))
keras_model.add(Conv2D(64, (4, 4), strides=(2, 2)))
keras_model.add(Activation('relu'))
keras_model.add(Conv2D(64, (3, 3), strides=(1, 1)))
keras_model.add(Activation('relu'))
keras_model.add(Flatten())
keras_model.add(Dense(512))
keras_model.add(Activation('relu'))
keras_model.add(Dense(6))

### transfer the weights from pkl model to keras model
j_list = [0,2,4,7,9]  
for i in range(5):
    w1 = np.squeeze(pkl_weights[2*i+1])
    w2 = np.squeeze(pkl_weights[2*i+2])
    layer_weights = [w1, w2]
    keras_model.layers[j_list[i]].set_weights(layer_weights)
### Test the performance of the saved keras model
from stable_baselines.common.cmd_util import make_atari_env
from stable_baselines.common.policies import CnnPolicy
from stable_baselines.common.vec_env import VecFrameStack
env = make_atari_env('{}NoFrameskip-v4'.format(game), num_env=1, seed=0)
env = VecFrameStack(env, n_stack=4)
obs = env.reset()
while True:
    actions = keras_model.predict(obs)
    action = np.argmax(actions[0])
    obs, rewards, dones, infos = env.step([action])
    episode_infos = infos[0].get('episode')
    if episode_infos is not None:
        print("Atari Episode Score: {:.2f}".format(episode_infos['r']))
        print("Atari Episode Length", episode_infos['l'])
        break

The model weights can be converted, but the performance indicates that there are still some issues. The accumulated reward of the converted Keras model is -20.
Could you please share your insight on this?

Atari Episode Score: -20.00
Atari Episode Length 896

QuXinghuaNTU on 12 Mar 2020

It seems you forgot to normalize the image (cf. doc) by dividing by 255.
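A minimal sketch of that fix, assuming the environment returns uint8 frames in the range 0 to 255 and that the network was trained on inputs scaled to [0, 1]. `normalize_frames` is a hypothetical helper name, and the array below is a stand-in for a real observation.

```python
import numpy as np


def normalize_frames(obs):
    """Convert raw image observations to float32 values in [0, 1]."""
    return np.asarray(obs, dtype=np.float32) / 255.0


# Stand-in observation (assumption): one stack of four 84x84 frames, all white.
frames = np.full((1, 84, 84, 4), 255, dtype=np.uint8)
scaled = normalize_frames(frames)
print(scaled.dtype, scaled.min(), scaled.max())
```

In the evaluation loop above, this would amount to calling `keras_model.predict(normalize_frames(obs))` instead of passing `obs` directly.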

araffin on 12 Mar 2020

> It seems you forgot to normalize the image (cf. doc) by dividing by 255.

Thanks. It finally works. Will update the code here.

QuXinghuaNTU on 12 Mar 2020
