When I initialize a run from a previous run with --initialize-from, training does not start at step 0; it continues from the step count where the other run left off.
Looking at the pull request for this feature, it looks like a _set_step function was defined, but I don't see it used anywhere.
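To make the expected behavior concrete, here is a minimal, self-contained sketch of the difference between resuming a run and initializing from one. It is not the ml-agents implementation: `Checkpoint`, `ToyPolicy`, and the `reset_steps` argument are hypothetical names used only for illustration, and the step value is made up.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Checkpoint:
    """Hypothetical stand-in for a saved model: weights plus a step counter."""
    weights: Dict[str, float]
    step: int


@dataclass
class ToyPolicy:
    """Hypothetical policy that can load a checkpoint in two different ways."""
    weights: Dict[str, float] = field(default_factory=dict)
    step: int = 0

    def load(self, ckpt: Checkpoint, reset_steps: bool) -> None:
        # The weights are always copied from the checkpoint.
        self.weights = dict(ckpt.weights)
        # Resuming keeps the saved counter; initializing a new run from
        # another run is expected to start counting again at 0.
        self.step = 0 if reset_steps else ckpt.step


# Illustrative checkpoint; the step value is not taken from the report.
first_run = Checkpoint(weights={"w": 0.5}, step=1000)

resumed = ToyPolicy()
resumed.load(first_run, reset_steps=False)

initialized = ToyPolicy()
initialized.load(first_run, reset_steps=True)

print(resumed.step, initialized.step)
```

In this sketch the bug described above would correspond to the `reset_steps=True` path still ending up with the checkpoint's counter.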
To Reproduce
Steps to reproduce the behavior:
mlagents-learn config/trainer_config.yaml --initialize-from=first3DBallRun --run-id==second3DBallRun
2020-05-12 20:04:55.558358: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
WARNING:tensorflow:From c:\users\teelr\appdata\local\programs\python\python37\lib\site-packages\tensorflow_core\python\compat\v2_compat.py:88: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
Version information:
ml-agents: 0.16.0,
ml-agents-envs: 0.16.0,
Communicator API: 1.0.0,
TensorFlow: 2.1.0
2020-05-12 20:04:59.737490: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
WARNING:tensorflow:From c:\users\teelr\appdata\local\programs\python\python37\lib\site-packages\tensorflow_core\python\compat\v2_compat.py:88: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
2020-05-12 20:05:01 INFO [environment.py:201] Listening on port 5004. Start training by pressing the Play button in the Unity Editor.
2020-05-12 20:05:06 INFO [environment.py:111] Connected to Unity environment with package version 1.0.0-preview and communication version 1.0.0
2020-05-12 20:05:06 INFO [environment.py:342] Connected new brain:
3DBall?team=0
2020-05-12 20:05:06.290407: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2020-05-12 20:05:06.307036: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2020-05-12 20:05:06.361342: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce RTX 2080 with Max-Q Design computeCapability: 7.5
coreClock: 1.095GHz coreCount: 46 deviceMemorySize: 8.00GiB deviceMemoryBandwidth: 357.69GiB/s
2020-05-12 20:05:06.373009: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-05-12 20:05:06.442669: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-05-12 20:05:06.480843: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-05-12 20:05:06.504039: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-05-12 20:05:06.563529: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2020-05-12 20:05:06.603801: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2020-05-12 20:05:06.699862: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-05-12 20:05:06.704810: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-05-12 20:05:10.748173: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-05-12 20:05:10.753119: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] 0
2020-05-12 20:05:10.756694: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0: N
2020-05-12 20:05:10.762259: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6269 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 with Max-Q Design, pci bus id: 0000:01:00.0, compute capability: 7.5)
2020-05-12 20:05:10 INFO [stats.py:130] Hyperparameters for behavior name =second3DBallRun_3DBall:
trainer: ppo
batch_size: 64
beta: 0.001
buffer_size: 12000
epsilon: 0.2
hidden_units: 128
lambd: 0.99
learning_rate: 0.0003
learning_rate_schedule: linear
max_steps: 5.0e5
memory_size: 128
normalize: True
num_epoch: 3
num_layers: 2
time_horizon: 1000
sequence_length: 64
summary_freq: 12000
use_recurrent: False
vis_encode_type: simple
reward_signals:
extrinsic:
strength: 1.0
gamma: 0.99
summary_path: =second3DBallRun_3DBall
model_path: ./models/=second3DBallRun/3DBall
init_path: ./models/first3DBallRun/3DBall
keep_checkpoints: 5
2020-05-12 20:05:10.893625: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce RTX 2080 with Max-Q Design computeCapability: 7.5
coreClock: 1.095GHz coreCount: 46 deviceMemorySize: 8.00GiB deviceMemoryBandwidth: 357.69GiB/s
2020-05-12 20:05:10.904673: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-05-12 20:05:10.908116: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-05-12 20:05:10.914349: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-05-12 20:05:10.917813: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-05-12 20:05:10.923354: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2020-05-12 20:05:10.927416: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2020-05-12 20:05:10.935354: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-05-12 20:05:10.938944: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-05-12 20:05:10.944470: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-05-12 20:05:10.947842: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] 0
2020-05-12 20:05:10.949961: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0: N
2020-05-12 20:05:10.954546: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6269 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 with Max-Q Design, pci bus id: 0000:01:00.0, compute capability: 7.5)
2020-05-12 20:05:11 INFO [tf_policy.py:118] Loading model for brain 3DBall?team=0 from ./models/first3DBallRun/3DBall.
2020-05-12 20:05:11 INFO [tf_policy.py:142] Starting training from step 0 and saving to ./models/=second3DBallRun/3DBall.
2020-05-12 20:05:11.787400: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-05-12 20:05:22 INFO [stats.py:111] =second3DBallRun_3DBall: Step: 276000. Time Elapsed: 22.734 s Mean Reward: 100.000. Std of Reward: 0.000. Training.
2020-05-12 20:05:32 INFO [stats.py:111] =second3DBallRun_3DBall: Step: 288000. Time Elapsed: 33.533 s Mean Reward: 92.523. Std of Reward: 25.901. Training.
2020-05-12 20:05:43 INFO [stats.py:111] =second3DBallRun_3DBall: Step: 300000. Time Elapsed: 43.901 s Mean Reward: 84.307. Std of Reward: 33.978. Training.
2020-05-12 20:05:55 INFO [stats.py:111] =second3DBallRun_3DBall: Step: 312000. Time Elapsed: 56.405 s Mean Reward: 100.000. Std of Reward: 0.000. Training.
2020-05-12 20:05:59 INFO [subprocess_env_manager.py:191] UnityEnvironment worker 0: environment stopping.
2020-05-12 20:05:59 INFO [trainer_controller.py:116] Learning was interrupted. Please wait while the graph is generated.
2020-05-12 20:06:00 INFO [trainer_controller.py:112] Saved Model
2020-05-12 20:06:00 INFO [model_serialization.py:221] List of nodes to export for brain :3DBall?team=0
2020-05-12 20:06:00 INFO [model_serialization.py:223] is_continuous_control
2020-05-12 20:06:00 INFO [model_serialization.py:223] version_number
2020-05-12 20:06:00 INFO [model_serialization.py:223] memory_size
2020-05-12 20:06:00 INFO [model_serialization.py:223] action_output_shape
2020-05-12 20:06:00 INFO [model_serialization.py:223] action
2020-05-12 20:06:00 INFO [model_serialization.py:223] action_probs
Converting ./models/=second3DBallRun/3DBall/frozen_graph_def.pb to ./models/=second3DBallRun/3DBall.nn
IGNORED: Cast unknown layer
IGNORED: Shape unknown layer
IGNORED: StopGradient unknown layer
GLOBALS: 'is_continuous_control', 'version_number', 'memory_size', 'action_output_shape'
IN: 'vector_observation': [-1, 1, 1, 8] => 'sub_2'
OUT: 'action', 'action_probs'
DONE: wrote ./models/=second3DBallRun/3DBall.nn file.
2020-05-12 20:06:00 INFO [model_serialization.py:76] Exported ./models/=second3DBallRun/3DBall.nn file
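The mismatch is visible in the log itself: tf_policy.py reports "Starting training from step 0", but the first summary line already reports Step: 276000. As a rough sketch, a check like the following could flag this in a saved console log. The helper names are hypothetical, the regular expressions assume the message formats shown in the log above, and the comparison against summary_freq is only a heuristic.

```python
import re
from typing import Optional, Tuple

# Patterns assume the message formats seen in the log of this report.
START_RE = re.compile(r"Starting training from step (\d+)")
STEP_RE = re.compile(r"Step: (\d+)\.")


def announced_and_first_step(log_text: str) -> Tuple[Optional[int], Optional[int]]:
    """Return (announced start step, first reported summary step), if found."""
    start = START_RE.search(log_text)
    first = STEP_RE.search(log_text)
    return (
        int(start.group(1)) if start else None,
        int(first.group(1)) if first else None,
    )


def step_counter_carried_over(log_text: str, summary_freq: int) -> bool:
    """Heuristic: the first summary lies more than one summary_freq past the start."""
    announced, first = announced_and_first_step(log_text)
    if announced is None or first is None:
        return False
    return first > announced + summary_freq


# Two lines copied from the log above.
sample = (
    "2020-05-12 20:05:11 INFO [tf_policy.py:142] Starting training from step 0 "
    "and saving to ./models/=second3DBallRun/3DBall.\n"
    "2020-05-12 20:05:22 INFO [stats.py:111] =second3DBallRun_3DBall: "
    "Step: 276000. Time Elapsed: 22.734 s Mean Reward: 100.000. "
    "Std of Reward: 0.000. Training.\n"
)

print(announced_and_first_step(sample))
print(step_counter_carried_over(sample, summary_freq=12000))
```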
It must be saving data somewhere; check the cfg and txt files and the player prefs.
You're welcome :)
Hi @teelrc
This is indeed a bug. We will be addressing it shortly in a hotfix. Thanks for bringing it to our attention.
Happy to help!
following
@kahabal This should be fixed in the 0.16.1 version of the mlagents python package (see https://github.com/Unity-Technologies/ml-agents/releases/tag/release_2). Are you still seeing this problem?
@chriselion Sorry, it was solved. I had a versioning problem and for some reason was not able to use the updated version of the package. Thank you!
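For anyone hitting the same versioning confusion, a quick way to see which mlagents package the active Python environment actually uses is sketched below. It assumes Python 3.8+ (for `importlib.metadata`) and that the package is installed under the distribution name `mlagents`; `parse_version`, `has_fix`, and `installed_mlagents_version` are hypothetical helpers, and the parser only handles simple dotted versions.

```python
from importlib import metadata  # available in Python 3.8+
from typing import Optional, Tuple

# Version named in this thread as containing the fix.
FIXED_IN = (0, 16, 1)


def parse_version(text: str) -> Tuple[int, ...]:
    """Parse the leading numeric components of a dotted version string."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric component, e.g. "dev0"
        parts.append(int(digits))
    return tuple(parts)


def has_fix(version_text: str) -> bool:
    """True if the version is at least the release that contains the fix."""
    return parse_version(version_text) >= FIXED_IN


def installed_mlagents_version() -> Optional[str]:
    """Return the installed mlagents version, or None if it is not installed."""
    try:
        return metadata.version("mlagents")
    except metadata.PackageNotFoundError:
        return None


version = installed_mlagents_version()
if version is None:
    print("mlagents is not installed in this environment")
else:
    print(version, "contains the fix" if has_fix(version) else "predates the fix")
```

Running this inside the same virtual environment that launches mlagents-learn helps rule out an older copy being picked up.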
thanks :)