Hi, I have a problem. I set worker_class to 'gevent' and used multiprocessing as follows in a Flask application:
p = Process(target=train_thread, args=(v, parameters))
p.name = v
p.daemon = True
p.start()
After the worker's subprocess has run for a while, it gets killed. The log file prints the following:
[2020-04-27 16:15:59 +0800] [20185] [INFO] Starting gunicorn 20.0.4
[2020-04-27 16:15:59 +0800] [20185] [INFO] pidfile /root/PycharmProjects/textclassficationcnn/data/log/gunicorn.pid
[2020-04-27 16:15:59 +0800] [20185] [DEBUG] Arbiter booted
[2020-04-27 16:15:59 +0800] [20185] [INFO] Listening at: http://0.0.0.0:9099 (20185)
[2020-04-27 16:15:59 +0800] [20185] [INFO] Using worker: gevent
[2020-04-27 16:15:59 +0800] [20188] [INFO] Booting worker with pid: 20188
[2020-04-27 16:15:59 +0800] [20185] [DEBUG] 1 workers
[2020-04-27 16:16:19 +0800] [20188] [DEBUG] POST /train
[2020-04-27 16:28:51 +0800] [20852] [INFO] self.ppid: 20185, os.getppid():20188
[2020-04-27 16:28:51 +0800] [20852] [INFO] Parent changed, shutting down: <Worker 20188>
[2020-04-27 16:28:51 +0800] [20852] [INFO] Worker exiting (pid: 20188)
I think the master process pid is 20185, the worker process pid is 20188, and the subprocess pid is 20852. Why does the subprocess call the 'notify' function and eventually get killed?
If I change worker_class to 'sync', the problem goes away, but I wonder whether there are other solutions. I still want to use the gevent worker.
This is my first GitHub issue, thanks for the help.
When you fork processes under gevent, all the existing greenlets continue to exist in the child process (unlike threads would). Eventually, if your train_thread function yields to the gevent loop, then the worker's gunicorn greenlet will run again (it will wake up from gevent.sleep). It will notice that the parent isn't what it thought it was (it's the worker instead of the master), and it will exit.
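The check that trips here can be modeled roughly as follows. This is a simplified sketch, not gunicorn's actual code: the function name parent_changed is made up for illustration, and the pids are the ones from the log above.

```python
def parent_changed(recorded_ppid, current_ppid):
    """Return True when the current parent is not the one recorded at boot.

    recorded_ppid: the parent pid the worker stored when it started.
    current_ppid:  what os.getppid() would report right now.
    """
    return recorded_ppid != current_ppid


# In the real worker (pid 20188) the recorded parent is the master (20185)
# and os.getppid() also reports 20185, so nothing happens.
worker_sees_change = parent_changed(20185, 20185)

# The forked child (pid 20852) inherits the recorded value 20185 in its
# copied greenlet, but its actual parent is the worker (20188).
child_sees_change = parent_changed(20185, 20188)
```

In this model the child's copy of the worker greenlet sees a mismatch as soon as it gets to run, which matches the "Parent changed, shutting down" line in the log.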
I can think of three possibilities.
The first is to be sure your train_thread function never yields to gevent. That means using no gevent blocking APIs, including those that are monkey-patched in the standard library.
The second is to change the implementation of multiprocessing to not use fork but something like POSIX spawn with multiprocessing.set_start_method('spawn'), which would cause a fresh, non-monkey-patched, no-existing-greenlets process to be created for the Process, instead of copying the state of the current process. However, that means that child watchers don't work for Process; that might be ok in this fire-and-forget scenario but I'm not sure.
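A minimal sketch of the spawn approach, under some assumptions: it uses multiprocessing.get_context('spawn') rather than the global set_start_method call, so only this one Process is affected; the train_thread body is a placeholder; and the helper name make_training_process is hypothetical. With spawn, the target must be importable at module level and the arguments must be picklable.

```python
import multiprocessing


def train_thread(name, parameters):
    # Placeholder for the real training work from the question.
    pass


def make_training_process(name, parameters):
    """Build (but do not start) a daemon process that uses the spawn start method."""
    # A context scopes the start method to the processes created from it,
    # leaving the process-wide default untouched.
    ctx = multiprocessing.get_context("spawn")
    return ctx.Process(
        target=train_thread,
        args=(name, parameters),
        name=name,
        daemon=True,
    )
```

In the request handler this would be used as p = make_training_process(v, parameters) followed by p.start(). Calling multiprocessing.set_start_method('spawn') once at startup, as suggested above, should have the same effect for every Process.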
The third would be to have train_thread call gevent.get_hub().destroy(destroy_loop=True) as its first action. That will probably prevent the copied gunicorn greenlets from running again.
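A rough sketch of that third option. The helper name reset_gevent_hub and the train_thread body are hypothetical, and the ImportError guard is only there so the snippet also runs where gevent is not installed; inside a gevent worker the import will always succeed. As noted above, this is expected to help but is not guaranteed to.

```python
def reset_gevent_hub():
    """Destroy the hub (and its event loop) inherited from the parent process.

    Returns True if a hub was destroyed, False if gevent is unavailable.
    """
    try:
        import gevent
    except ImportError:
        return False
    # Tear down the copied hub so the greenlets scheduled on it in the
    # parent are not resumed in this process.
    gevent.get_hub().destroy(destroy_loop=True)
    return True


def train_thread(name, parameters):
    # First action in the child, before anything can yield to gevent.
    reset_gevent_hub()
    # ... the real training work would go here ...
```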
Thanks a lot, @jamadden!
I changed the multiprocessing start method to spawn with multiprocessing.set_start_method('spawn') and it works.
Thank you for your detailed analysis. It has encouraged me and taught me a lot!