Describe the bug
InvalidArgumentError (see above for traceback): Input to reshape is a tensor with 16672 values, but the requested shape requires a multiple of 66688
To Reproduce
Steps to reproduce the behavior:
Worked fine with 0.9
Log:
INFO:mlagents.trainers:CommandLineOptions(debug=False, num_runs=1, seed=-1, env_path=None, run_id='ian-01', load_model=False, train_model=True, save_freq=50000, keep_checkpoints=5, base_port=5005, num_envs=1, curriculum_folder=None, lesson=0, slow=True, no_graphics=False, multi_gpu=False, trainer_config_path='D:\\GitHub\\unitystation\\UnityProject\\Assets\\Scripts\\NPC\\AI\\Brains\\configs\\move.yaml', sampler_file_path=None, docker_target_name=None, env_args=None)
INFO:mlagents.envs:Start training by pressing the Play button in the Unity Editor.
INFO:mlagents.envs:
'Academy' started successfully!
Unity Academy name: Academy
Number of Brains: 1
Number of Training Brains : 1
Reset Parameters :
Unity brain name: GeneralMoveBrain
Number of Visual Observations (per agent): 0
Vector Observation space size (per agent): 10
Number of stacked Vector Observation: 1
Vector Action space type: discrete
Vector Action space size (per agent): [9]
Vector Action descriptions: MoveDir
INFO:mlagents.envs:Hyperparameters for the PPOTrainer of brain GeneralMoveBrain:
trainer: ppo
batch_size: 32
beta: 0.00455000018700957
buffer_size: 2048
epsilon: 0.216
hidden_units: 512
lambd: 0.9144
learning_rate: 0.000155999994603917
max_steps: Infinity
memory_size: 512
normalize: True
num_epoch: 5
num_layers: 3
time_horizon: 30
sequence_length: 128
summary_freq: 2500
use_recurrent: True
reward_signals:
extrinsic:
strength: 1
gamma: 0.99
summary_path: ./summaries/ian-01_GeneralMoveBrain
model_path: ./models/ian-01-0/GeneralMoveBrain
keep_checkpoints: 5
c:\users\andre\.conda\envs\ml3\lib\site-packages\numpy\core\fromnumeric.py:3257: RuntimeWarning: Mean of empty slice.
out=out, **kwargs)
c:\users\andre\.conda\envs\ml3\lib\site-packages\numpy\core\_methods.py:161: RuntimeWarning: invalid value encountered in double_scalars
ret = ret.dtype.type(ret / rcount)
Traceback (most recent call last):
File "C:\Users\andre\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\client\session.py", line 1327, in _do_call
return fn(*args)
File "C:\Users\andre\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\client\session.py", line 1312, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "C:\Users\andre\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\client\session.py", line 1420, in _call_tf_sessionrun
status, run_metadata)
File "C:\Users\andre\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\errors_impl.py", line 516, in __exit__
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Input to reshape is a tensor with 16672 values, but the requested shape requires a multiple of 66688
[[Node: Reshape = Reshape[T=DT_FLOAT, Tshape=DT_INT32, _device="/job:localhost/replica:0/task:0/device:CPU:0"](concat_1, Reshape/shape)]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "c:\users\andre\.conda\envs\ml3\lib\runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "c:\users\andre\.conda\envs\ml3\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "C:\Users\andre\.conda\envs\ml3\Scripts\mlagents-learn.exe\__main__.py", line 9, in <module>
File "c:\users\andre\.conda\envs\ml3\lib\site-packages\mlagents\trainers\learn.py", line 417, in main
run_training(0, run_seed, options, Queue())
File "c:\users\andre\.conda\envs\ml3\lib\site-packages\mlagents\trainers\learn.py", line 255, in run_training
tc.start_learning(env)
File "c:\users\andre\.conda\envs\ml3\lib\site-packages\mlagents\trainers\trainer_controller.py", line 202, in start_learning
n_steps = self.advance(env_manager)
File "c:\users\andre\.conda\envs\ml3\lib\site-packages\mlagents\envs\timers.py", line 263, in wrapped
return func(*args, **kwargs)
File "c:\users\andre\.conda\envs\ml3\lib\site-packages\mlagents\trainers\trainer_controller.py", line 292, in advance
trainer.update_policy()
File "c:\users\andre\.conda\envs\ml3\lib\site-packages\mlagents\trainers\ppo\trainer.py", line 251, in update_policy
buffer.make_mini_batch(l, l + batch_size), n_sequences
File "c:\users\andre\.conda\envs\ml3\lib\site-packages\mlagents\envs\timers.py", line 263, in wrapped
return func(*args, **kwargs)
File "c:\users\andre\.conda\envs\ml3\lib\site-packages\mlagents\trainers\ppo\policy.py", line 185, in update
update_vals = self._execute_model(feed_dict, self.update_dict)
File "c:\users\andre\.conda\envs\ml3\lib\site-packages\mlagents\trainers\tf_policy.py", line 151, in _execute_model
network_out = self.sess.run(list(out_dict.values()), feed_dict=feed_dict)
File "C:\Users\andre\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\client\session.py", line 905, in run
run_metadata_ptr)
File "C:\Users\andre\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\client\session.py", line 1140, in _run
feed_dict_tensor, options, run_metadata)
File "C:\Users\andre\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\client\session.py", line 1321, in _do_run
run_metadata)
File "C:\Users\andre\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\client\session.py", line 1340, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Input to reshape is a tensor with 16672 values, but the requested shape requires a multiple of 66688
[[Node: Reshape = Reshape[T=DT_FLOAT, Tshape=DT_INT32, _device="/job:localhost/replica:0/task:0/device:CPU:0"](concat_1, Reshape/shape)]]
Caused by op 'Reshape', defined at:
File "c:\users\andre\.conda\envs\ml3\lib\runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "c:\users\andre\.conda\envs\ml3\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "C:\Users\andre\.conda\envs\ml3\Scripts\mlagents-learn.exe\__main__.py", line 9, in <module>
sys.exit(main())
File "c:\users\andre\.conda\envs\ml3\lib\site-packages\mlagents\trainers\learn.py", line 417, in main
run_training(0, run_seed, options, Queue())
File "c:\users\andre\.conda\envs\ml3\lib\site-packages\mlagents\trainers\learn.py", line 233, in run_training
options.multi_gpu,
File "c:\users\andre\.conda\envs\ml3\lib\site-packages\mlagents\trainers\trainer_util.py", line 91, in initialize_trainers
multi_gpu,
File "c:\users\andre\.conda\envs\ml3\lib\site-packages\mlagents\trainers\ppo\trainer.py", line 75, in __init__
seed, brain, trainer_parameters, self.is_training, load
File "c:\users\andre\.conda\envs\ml3\lib\site-packages\mlagents\trainers\ppo\policy.py", line 40, in __init__
brain, trainer_params, reward_signal_configs, is_training, load, seed
File "c:\users\andre\.conda\envs\ml3\lib\site-packages\mlagents\trainers\ppo\policy.py", line 91, in create_model
trainer_params.get("vis_encode_type", "simple")
File "c:\users\andre\.conda\envs\ml3\lib\site-packages\mlagents\trainers\ppo\models.py", line 55, in __init__
self.create_dc_actor_critic(h_size, num_layers, vis_encode_type)
File "c:\users\andre\.conda\envs\ml3\lib\site-packages\mlagents\trainers\ppo\models.py", line 190, in create_dc_actor_critic
hidden, self.memory_in, self.sequence_length
File "c:\users\andre\.conda\envs\ml3\lib\site-packages\mlagents\trainers\models.py", line 568, in create_recurrent_encoder
lstm_input_state = tf.reshape(input_state, shape=[-1, sequence_length, s_size])
File "C:\Users\andre\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\gen_array_ops.py", line 6980, in reshape
"Reshape", tensor=tensor, shape=shape, name=name)
File "C:\Users\andre\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "C:\Users\andre\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\ops.py", line 3290, in create_op
op_def=op_def)
File "C:\Users\andre\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\ops.py", line 1654, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access
InvalidArgumentError (see above for traceback): Input to reshape is a tensor with 16672 values, but the requested shape requires a multiple of 66688
[[Node: Reshape = Reshape[T=DT_FLOAT, Tshape=DT_INT32, _device="/job:localhost/replica:0/task:0/device:CPU:0"](concat_1, Reshape/shape)]]
rolled back to 0.9.3 and have the same error there. I'll go back to 0.9.0 and see if that is that was the working version I was on before.
Ok 0.9.0 is confirmed working. The error isn't there. I have not tried 0.9.1 or 0.9.2 but it is definitely in 0.9.3 and 0.10
Hi @DooblyNoobly, this is because your batch size is not a multiple of and greater than your sequence length. I've logged the bug with internal tracking id MLA-85, and will keep you posted. In 0.9, it would have added a batch of size sequence_length regardless of your batch_size.
In the meantime, if you increase your batch size to larger than the sequence length, it should continue to train.
This bug has been fixed in the develop branch of ML-Agents, and will be fixed in the next release. Thanks for reporting - closing the issue for now.