Ray: ImportError: Tensorflow 2.0 GPU Recursion Error

Created on 2 Sep 2019 · 5 Comments · Source: ray-project/ray

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 18.04
  • Ray installed from (source or binary): binary
  • Ray version: 0.7.3
  • Python version: 3.7
  • Tensorflow version: tensorflow-gpu 2.0.0rc0
  • Exact command to reproduce:
# Importing packages
from time import time
import gym
import tensorflow as tf
import ray

# Creating our initial model    
model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, input_shape=(24,), activation='relu'),
        tf.keras.layers.Dense(4, activation='softmax')
        ])

# Setting parameters
episodes = 64
env_name = 'BipedalWalker-v2'

# Initializing ray
ray.init(num_cpus=8, num_gpus=1)

# Creating our ray function
@ray.remote
def play(weights):
    actor = tf.keras.Sequential([
        tf.keras.layers.Dense(64, input_shape=(24,), activation='relu'),
        tf.keras.layers.Dense(4, activation='softmax')
        ])
    actor = actor.set_weights(weights)
    env = gym.make('BipedalWalker-v2').env
    env._max_episode_steps=1e20
    obs = env.reset()
    for _ in range(1200):
        action = actor.predict_classes(obs).flatten()[0]
        action = env.action_space.sample()
        obs, rt, done, info = env.step(action)
    return rt

# Testing ray
start = time()
weights = model.get_weights()
weights = ray.put(weights)
results = ray.get([play.remote(weights) for i in range(episodes)])
ray.shutdown()
print('Ray done after:', time()-start)

Describe the problem

I am trying to use Ray to parallelize rollouts of OpenAI Gym environments using a TensorFlow 2.0 GPU Keras actor. Every time I instantiate a Keras model inside a @ray.remote function, it raises a "maximum recursion depth exceeded" RecursionError. I am following the Ray documentation, which suggests passing weights instead of models. I am not sure what I am doing wrong here. Any thoughts?

Source code / logs

  File "/home/jacob/anaconda3/envs/tf-2.0-gpu/lib/python3.7/site-packages/tensorflow/__init__.py", line 50, in __getattr__
    module = self._load()
  File "/home/jacob/anaconda3/envs/tf-2.0-gpu/lib/python3.7/site-packages/tensorflow/__init__.py", line 44, in _load
    module = _importlib.import_module(self.__name__)
RecursionError: maximum recursion depth exceeded
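
The thread does not pin down the root cause. One common way a `__getattr__`-based wrapper ends up recursing like this, and which would be consistent with the `__getattr__` and `_load` frames above, is an instance being rebuilt by pickle without `__init__` having run. The sketch below is only an illustration under that assumption: `LazyModule` and its `_name` attribute are hypothetical stand-ins, not TensorFlow's actual code.

```python
import importlib


class LazyModule:
    """Hypothetical stand-in for a lazy module wrapper (not TensorFlow's code)."""

    def __init__(self, name):
        self._name = name

    def __getattr__(self, item):
        # Called only when normal attribute lookup fails. If `_name` itself
        # is missing, reading it below re-enters __getattr__ without end.
        module = importlib.import_module(self._name)
        return getattr(module, item)


# Normal use: __init__ has run, so attribute access is forwarded to `math`.
lazy = LazyModule("math")
print(lazy.sqrt(4.0))

# pickle's default reconstruction creates the instance with __new__
# (skipping __init__) and then probes it for __setstate__; these two
# steps are reproduced by hand here.
bare = LazyModule.__new__(LazyModule)
try:
    hasattr(bare, "__setstate__")
    outcome = "no error"
except RecursionError:
    outcome = "RecursionError"
print(outcome)
```

If something like this is what happens when the remote function is shipped to a worker, it would explain why keeping the module object out of the pickled function avoids the error.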

All 5 comments

What is your file called, and do you have any folders named "tensorflow" or "tf"?

The name of the script containing this code is raytest.py. I am using a conda environment called "tf-2.0-gpu" (as shown in the traceback above). But there are no subdirectories in the same directory as raytest.py, or anything close to "tf" in the same directory.

Here's a quick fix. Import tensorflow inside the remote function:

# Creating our ray function
@ray.remote
def play(weights):
    import tensorflow as tf
    actor = tf.keras.Sequential([
        tf.keras.layers.Dense(64, input_shape=(24,), activation='relu'),
        tf.keras.layers.Dense(4, activation='softmax')
        ])
    actor = actor.set_weights(weights)
    env = gym.make('BipedalWalker-v2').env
    env._max_episode_steps=1e20
    obs = env.reset()
    for _ in range(1200):
        action = actor.predict_classes(obs).flatten()[0]
        action = env.action_space.sample()
        obs, rt, done, info = env.step(action)
    return rt

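Awesome, this worked like a charm, along with a few other fixes (calling set_weights without reassigning its result, and wrapping obs in a batch dimension before predicting):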

# Importing packages
from time import time
import numpy as np
import gym
import tensorflow as tf
import ray

# Shutting down ray if needed
try:
    ray.shutdown()
except Exception:
    pass

# Creating our initial model    
model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, input_shape=(24,), activation='relu'),
        tf.keras.layers.Dense(4, activation='softmax')
        ])

# Initializing ray
ray.init(num_cpus=8, num_gpus=1)

# Creating our normal function
def play_serial(weights):
    actor = tf.keras.Sequential([
        tf.keras.layers.Dense(64, input_shape=(24,), activation='relu'),
        tf.keras.layers.Dense(4, activation='softmax')
        ])
    actor.set_weights(weights)
    env = gym.make('BipedalWalker-v2').env
    env._max_episode_steps=1e20
    obs = env.reset()
    for _ in range(1200):
        action = actor.predict_classes(np.array([obs])).flatten()[0]
        action = env.action_space.sample()
        obs, rt, done, info = env.step(action)
    return rt

# Creating our ray function
@ray.remote
def play(weights):
    import tensorflow as tf
    actor = tf.keras.Sequential([
        tf.keras.layers.Dense(64, input_shape=(24,), activation='relu'),
        tf.keras.layers.Dense(4, activation='softmax')
        ])
    actor.set_weights(weights)
    env = gym.make('BipedalWalker-v2').env
    env._max_episode_steps=1e20
    obs = env.reset()
    for _ in range(1200):
        action = actor.predict_classes(np.array([obs])).flatten()[0]
        action = env.action_space.sample()
        obs, rt, done, info = env.step(action)
    return rt

if __name__ == '__main__':
    # Testing serial
    start = time()
    weights = model.get_weights()
    serial = [play_serial(weights) for i in range(256)]
    serial_time = time()-start
    # Testing ray
    start = time()
    weights = ray.put(weights)
    results = ray.get([play.remote(weights) for i in range(256)])
    ray_time = time()-start
    print('Serial done after:', round(serial_time, 3), 'seconds')
    print('Ray done after:', round(ray_time, 3), 'seconds')
    ray.shutdown()

When do packages need to be imported in the function definition?

Generally, whenever I run into pickling errors with a library, importing it inside the function works for me. It's hard to give a standard answer, mainly because external libraries/packages are not very transparent about how they depend on global state.
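
As a rough illustration of what the in-function import changes: an import statement inside a function body runs when the body is called, in whichever process executes it, not when the function is defined. The sketch uses the standard-library module `colorsys` as a stand-in for a heavy library such as TensorFlow, and `to_hsv` is a hypothetical function; no Ray is involved here.

```python
import sys

# Make sure the stand-in library is not loaded yet, so the effect is visible.
sys.modules.pop("colorsys", None)


def to_hsv(rgb):
    # Deferred import: executed on each call, in the process running the body.
    import colorsys
    return colorsys.rgb_to_hsv(*rgb)


# Defining the function did not import the library...
loaded_before = "colorsys" in sys.modules
# ...calling it does.
hsv = to_hsv((1.0, 0.0, 0.0))
loaded_after = "colorsys" in sys.modules
print(loaded_before, loaded_after, hsv)
```

With a remote function, the body runs on the worker, so the library is presumably imported fresh there instead of being carried over from the driver script's globals.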
