Version: nightly prebuilt for Python2 w/ GPU (just now)
I expect the following code to print "10.0" three times (once in the master and once in each of the two workers), but session.run hangs in every forked process.
import tensorflow as tf
import multiprocessing as mp
import os

class Worker(mp.Process):
    def __init__(self, gid):
        self.gid = gid
        super(Worker, self).__init__()

    def run(self):
        G = tf.Graph()
        with G.as_default():
            x = tf.placeholder(tf.float32, shape=[])
            y = x * 2
            sess = tf.Session()
            print sess.run(y, feed_dict={x: 5})

G = tf.Graph()
with G.as_default():
    sess = tf.Session()
    with sess.as_default():
        x = tf.placeholder(tf.float32, shape=[])
        y = x * 2
        print sess.run(y, feed_dict={x: 5})

procs = [Worker(k) for k in range(2)]
for p in procs: p.start()
for p in procs: p.join()
Removing the graph/session from the master process solves the problem. So it seems that once a session exists in a process, we cannot fork from it?
The problem exists with and without GPU.
NOTE: this code doesn't terminate normally. You'll probably need to manually kill the forked processes after the master exits.
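For reference, the workaround described above (no graph/session in the master) can be sketched roughly as follows. This is only an illustration using the same TF 1.x calls as the repro. Running the master's computation in a third worker, the `main` helper, and importing TensorFlow inside `run` are my own choices here; the lazy import is stricter than necessary, since the repro imports TensorFlow at the top and only the session seems to matter.

```python
from __future__ import print_function
import multiprocessing as mp


class Worker(mp.Process):
    def __init__(self, gid):
        super(Worker, self).__init__()
        self.gid = gid

    def run(self):
        # Executed in the forked child. TensorFlow is imported and used
        # only here, so the parent never creates a session before forking.
        import tensorflow as tf
        graph = tf.Graph()
        with graph.as_default():
            x = tf.placeholder(tf.float32, shape=[])
            y = x * 2
            with tf.Session() as sess:
                print(self.gid, sess.run(y, feed_dict={x: 5}))


def main(num_workers=3):
    # Hypothetical driver: the parent only starts and joins processes.
    procs = [Worker(k) for k in range(num_workers)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return [p.exitcode for p in procs]
```

Calling `main()` starts three children, each with its own graph and in-process session, which takes the place of the master's session in the repro.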
The in-process session (i.e. tf.Session() with no arguments) is not designed to be fork()-safe. If you want to share a set of devices between multiple processes, create a tf.train.Server in one process, and create sessions that connect to that server (with tf.Session("grpc://...")) in the other processes.
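A minimal sketch of that setup, assuming the TF 1.x API used in this thread. The address `localhost:2222`, the job name `"local"`, and the helpers `grpc_target`, `serve`, `work`, and `main` are illustrative names, not anything prescribed by TensorFlow. The server lives in its own process, and the parent creates no session at all.

```python
from __future__ import print_function
import multiprocessing as mp

ADDRESS = "localhost:2222"  # hypothetical host:port, assumed to be free


def grpc_target(address):
    """Build the target string passed to tf.Session for a remote server."""
    return "grpc://" + address


def serve(address, ready):
    """Own the devices: start a single-task server and block forever."""
    import tensorflow as tf  # imported in the child, after the fork
    cluster = tf.train.ClusterSpec({"local": [address]})
    server = tf.train.Server(cluster, job_name="local", task_index=0)
    ready.set()    # the server is started by the constructor (start=True)
    server.join()  # never returns


def work(address, value):
    """Build a graph locally and execute it on the shared server."""
    import tensorflow as tf
    graph = tf.Graph()
    with graph.as_default():
        x = tf.placeholder(tf.float32, shape=[])
        y = x * 2
        with tf.Session(grpc_target(address)) as sess:
            print(sess.run(y, feed_dict={x: value}))


def main(num_workers=2):
    ready = mp.Event()
    server_proc = mp.Process(target=serve, args=(ADDRESS, ready))
    server_proc.daemon = True  # do not outlive the parent
    server_proc.start()
    ready.wait()               # start workers only once the server exists
    workers = [mp.Process(target=work, args=(ADDRESS, 5))
               for _ in range(num_workers)]
    for p in workers:
        p.start()
    for p in workers:
        p.join()
    server_proc.terminate()
```

Calling `main()` forks one server process and two workers. Each worker opens its own session against the server instead of creating an in-process one.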
@mrry Does it mean there is a way to create a fork-safe tf.Session with tf.Session(args)?
@mavenlin The prototype of tf.Session is
    tf.Session.__init__(target='', graph=None, config=None)
Here target refers to the execution engine to connect to. That is, one has to run the working session in another process, in distributed mode, and tf.Session with arguments is still not fork()-safe.
Sad news.