Hello guys,
When I try to run train_ppo.py from the carla folder on a Ray cluster, I get the error "ModuleNotFoundError: No module named 'env'", even though env.py is in the same carla folder. Why am I getting this error? Kindly help.
WARNING: Not monitoring node memory since psutil is not installed. Install this with pip install psutil (or ray[debug]) to enable debugging of memory-related crashes.
172.31.47.162:6379
2019-02-12 13:05:05,964 WARNING worker.py:1354 -- WARNING: Not updating worker name since setproctitle is not installed. Install this with pip install setproctitle (or ray[debug]) to enable monitoring of worker processes.
2019-02-12 13:05:05,982 INFO tune.py:135 -- Tip: to resume incomplete experiments, pass resume='prompt' or resume=True to run_experiments()
2019-02-12 13:05:05,982 INFO tune.py:145 -- Starting a new experiment.
== Status ==
Using FIFO scheduling algorithm.
Resources requested: 0/20 CPUs, 0/5 GPUs
Unknown memory usage. Please run pip install psutil (or ray[debug]) to resolve)
== Status ==
Using FIFO scheduling algorithm.
Resources requested: 17/20 CPUs, 0/5 GPUs
Unknown memory usage. Please run pip install psutil (or ray[debug]) to resolve)
Result logdir: /home/ubuntu/ray_results/carla
RUNNING trials:
2019-02-12 13:05:11,094 ERROR trial_runner.py:413 -- Error processing event.
Traceback (most recent call last):
File "/home/ubuntu/.local/lib/python3.6/site-packages/ray/tune/trial_runner.py", line 378, in _process_events
result = self.trial_executor.fetch_result(trial)
File "/home/ubuntu/.local/lib/python3.6/site-packages/ray/tune/ray_trial_executor.py", line 228, in fetch_result
result = ray.get(trial_future[0])
File "/home/ubuntu/.local/lib/python3.6/site-packages/ray/worker.py", line 2132, in get
raise value
ray.worker.RayTaskError: ray_worker (pid=1382, host=ip-172-31-34-62)
File "/home/ubuntu/.local/lib/python3.6/site-packages/ray/utils.py", line 452, in _wrapper
return orig_attr(*args, **kwargs)
File "pyarrow/_plasma.pyx", line 531, in pyarrow._plasma.PlasmaClient.get
File "pyarrow/serialization.pxi", line 448, in pyarrow.lib.deserialize
File "pyarrow/serialization.pxi", line 411, in pyarrow.lib.deserialize_from
File "pyarrow/serialization.pxi", line 262, in pyarrow.lib.SerializedPyObject.deserialize
File "pyarrow/serialization.pxi", line 171, in pyarrow.lib.SerializationContext._deserialize_callback
ModuleNotFoundError: No module named 'env'
== Status ==
Using FIFO scheduling algorithm.
Resources requested: 0/20 CPUs, 0/5 GPUs
Unknown memory usage. Please run pip install psutil (or ray[debug]) to resolve)
Result logdir: /home/ubuntu/ray_results/carla
ERROR trials:
== Status ==
Using FIFO scheduling algorithm.
Resources requested: 0/20 CPUs, 0/5 GPUs
Unknown memory usage. Please run pip install psutil (or ray[debug]) to resolve)
Result logdir: /home/ubuntu/ray_results/carla
ERROR trials:
Traceback (most recent call last):
File "train_ppo.py", line 56, in <module>
"num_cpus_per_worker":4,
File "/home/ubuntu/.local/lib/python3.6/site-packages/ray/tune/tune.py", line 189, in run_experiments
raise TuneError("Trials did not complete", errored_trials)
ray.tune.error.TuneError: ('Trials did not complete', [PPO_CarlaEnv_0])
Killing live carla processes set()
It's probably because the env.py isn't on the path for other nodes in the cluster. You can probably fix that by syncing the directory to other nodes.
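One way to confirm this on a given node is to ask Python whether it can locate the module there. The following is only a minimal sketch using the standard library; the helper names `module_is_importable` and `ensure_on_path` are hypothetical and not part of Ray:

```python
import importlib.util
import sys


def module_is_importable(name):
    """Return True if Python can locate a top-level module called `name`."""
    return importlib.util.find_spec(name) is not None


def ensure_on_path(directory):
    """Prepend `directory` to sys.path if it is not already listed."""
    if directory not in sys.path:
        sys.path.insert(0, directory)
        # Make sure the import system notices files in the new directory.
        importlib.invalidate_caches()
```

Running `module_is_importable("env")` on a worker node should return False if env.py is missing there. Note that `ensure_on_path` only helps when the file physically exists on that node; it does not copy anything between machines.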
@ericl
I'm setting up a cluster environment for the first time. Can you let me know how to "sync the directory to other nodes" in a Ray cluster?
Well, you can copy it manually. Also see https://ray.readthedocs.io/en/latest/autoscaling.html#updating-your-cluster
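The manual copy could be scripted along these lines. This is a rough sketch, assuming `rsync` and passwordless SSH are available from the head node to each worker; the function names, the `ubuntu` user, and the `/home/ubuntu/carla` path in the usage note are assumptions for illustration, not something Ray provides:

```python
import subprocess


def build_rsync_commands(local_dir, worker_hosts, user="ubuntu"):
    """Build one rsync command per worker, mirroring local_dir to the same path."""
    # A trailing slash tells rsync to copy the directory contents, not the directory itself.
    src = local_dir.rstrip("/") + "/"
    commands = []
    for host in worker_hosts:
        dest = f"{user}@{host}:{src}"
        commands.append(["rsync", "-az", src, dest])
    return commands


def sync_to_workers(local_dir, worker_hosts, user="ubuntu"):
    """Run the rsync commands one by one; raises CalledProcessError on failure."""
    for cmd in build_rsync_commands(local_dir, worker_hosts, user):
        subprocess.run(cmd, check=True)
```

For example, `sync_to_workers("/home/ubuntu/carla", ["ip-172-31-34-62"])` would copy the folder to the worker host that appears in the traceback, provided the parent directory already exists there. Keeping the same absolute path on every node means the script can be launched the same way everywhere.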
@ericl
The folders are synced to the head node, but the worker nodes cannot access the files on the head node. Do we have to sync the folders to the worker nodes as well?
If not, how can the worker nodes access the files during training, given that the CARLA simulator is present on the head node?
The files and folders are present on the head node. The error below comes from the worker node:
ray.worker.RayTaskError: ray_worker (pid=1382, host=ip-172-31-34-62)
File "/home/ubuntu/.local/lib/python3.6/site-packages/ray/utils.py", line 452, in _wrapper
return orig_attr(*args, **kwargs)
File "pyarrow/_plasma.pyx", line 531, in pyarrow._plasma.PlasmaClient.get
File "pyarrow/serialization.pxi", line 448, in pyarrow.lib.deserialize
File "pyarrow/serialization.pxi", line 411, in pyarrow.lib.deserialize_from
File "pyarrow/serialization.pxi", line 262, in pyarrow.lib.SerializedPyObject.deserialize
File "pyarrow/serialization.pxi", line 171, in pyarrow.lib.SerializationContext._deserialize_callback
ModuleNotFoundError: No module named 'env'
How can the worker node read the files from the master/head node, or connect to the head node?
I can SSH from the master to the worker node as well.
Kindly help