Here is an example. With the "preload" option, "data_obj" gets instantiated in the parent process before the workers are forked. My goal is to access "data_obj.data" from the workers. With the preload option, I expected "data_obj" to be shared between the workers.
import falcon
import threading
import os

class Data(object):
    def __init__(self):
        self.data = []
        self.set_data()

    def set_data(self):
        # Runs only in the process that owns the timer thread (here, the gunicorn master).
        self.data.append(1)
        print("In Parent: Process Id: %s, Memory address %s, Data: %s" % (os.getpid(), id(self.data), self.data))
        self._schedule_next_run(3)

    def _schedule_next_run(self, next_update_time):
        # Re-arm a daemon timer so set_data() runs again after next_update_time seconds.
        model_thread = threading.Timer(next_update_time, self.set_data)
        model_thread.daemon = True
        model_thread.start()

# With --preload, this module is imported once in the master, before the workers are forked.
data_obj = Data()

class Testing(object):
    def on_get(self, req, resp, **params):
        print("In worker: Process Id: %s, Memory address %s, Data: %s" % (os.getpid(), id(data_obj.data), data_obj.data))

api = falcon.API()
api.add_route('/', Testing())
Output example:
$ gunicorn --access-logfile - --capture-output --workers 4 --preload testing:api
In Parent: Process Id: 26294, Memory address 139900785366512, Data: [1]
Starting gunicorn 19.6.0
Listening at: http://127.0.0.1:8000 (26294)
Using worker: sync
Booting worker with pid: 26300
Booting worker with pid: 26301
Booting worker with pid: 26302
Booting worker with pid: 26303
In Parent: Process Id: 26294, Memory address 139900785366512, Data: [1, 1]
In Parent: Process Id: 26294, Memory address 139900785366512, Data: [1, 1, 1]
In Parent: Process Id: 26294, Memory address 139900785366512, Data: [1, 1, 1, 1]
In worker: Process Id: 26300, Memory address 139900785366512, Data: [1]
In Parent: Process Id: 26294, Memory address 139900785366512, Data: [1, 1, 1, 1, 1]
In worker: Process Id: 26303, Memory address 139900785366512, Data: [1]
In Parent: Process Id: 26294, Memory address 139900785366512, Data: [1, 1, 1, 1, 1, 1]
In worker: Process Id: 26302, Memory address 139900785366512, Data: [1]
In Parent: Process Id: 26294, Memory address 139900785366512, Data: [1, 1, 1, 1, 1, 1, 1]
In worker: Process Id: 26301, Memory address 139900785366512, Data: [1]
As the output shows, the workers never see the most recent value of "data_obj.data" from the parent. I expected each worker to have copy-on-write access to the parent's memory.
FYI: I am testing on a Unix system.
Once a process has been started, modifications to its memory are visible only in that process. That is, both the parent and the worker will copy-on-write data_obj.data. Forking is not the same thing as (bidirectional) shared memory, and copy-on-write is an invisible optimization; conceptually, the memory is copied at the moment of the fork (in other words, the child has a point-in-time view of the parent's memory). You cannot change data in a parent process and trivially expect it to be visible to a forked child.
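The point-in-time behaviour described above can be reproduced without gunicorn or falcon. The following is a minimal sketch, assuming Python 3 on a Unix system where os.fork() is available; the function name fork_snapshot_demo and the pipe-based handshake are made up for this illustration and are not part of the original report. It also shows why the "Memory address" column in the output is identical in every process: each process has its own virtual address space, so an equal id() does not mean the memory is shared.

```python
import os


def fork_snapshot_demo():
    """Fork a child, mutate a list in the parent, and report what each side sees."""
    data = [1]
    result_r, result_w = os.pipe()  # child -> parent: the child's view of `data`
    ready_r, ready_w = os.pipe()    # parent -> child: "I have finished mutating"

    pid = os.fork()
    if pid == 0:
        # Child: it holds a private copy of `data` as it was at fork time.
        try:
            os.close(result_r)
            os.close(ready_w)
            os.read(ready_r, 1)  # block until the parent has appended
            os.write(result_w, ("%d|%r" % (id(data), data)).encode())
        finally:
            os._exit(0)  # exit at once; never fall through into parent code

    # Parent: mutate the list only after the fork has happened.
    os.close(result_w)
    os.close(ready_r)
    data.append(1)
    data.append(1)
    os.write(ready_w, b"x")
    child_id, child_view = os.read(result_r, 1024).decode().split("|")
    os.waitpid(pid, 0)
    os.close(result_r)
    os.close(ready_w)
    return {
        "parent_data": data,
        "child_data": child_view,
        "same_address": int(child_id) == id(data),
    }


if __name__ == "__main__":
    for key, value in fork_snapshot_demo().items():
        print(key, "=", value)
```

The parent ends up with three elements while the child still reports the single element it had when it was forked, which mirrors the worker lines in the gunicorn output. If workers really need fresh data, it has to reach them through an explicit channel (a shared store, a pipe, or a refresh inside each worker) rather than through module-level state in the parent.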