Gunicorn: with "preload" option, workers don't get the right shared memory of parent

Created on 11 Sep 2017 · 1 Comment · Source: benoitc/gunicorn

Here is an example. With the "preload" option, "data_obj" is instantiated in the parent process before the workers are forked. My goal is to access "data_obj.data" from the workers. With the preload option, "data_obj" should be shared between workers.

import falcon
import threading
import os

class Data(object):
    def __init__(self):
        self.data = []

        self.set_data()

    def set_data(self):
        self.data.append(1)
        print("In Parent: Process Id: %s, Memory address %s, Data: %s" % (os.getpid(), id(self.data), self.data))
        self._schedule_next_run(3)

    def _schedule_next_run(self, next_update_time):
        model_thread = threading.Timer(next_update_time, self.set_data)
        model_thread.daemon = True
        model_thread.start()

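# Created at import time; with --preload this runs once, in the parent, before the workers are forked.
data_obj = Data()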

class Testing(object):
    def on_get(self, req, resp, **params):
        print("In worker: Process Id: %s, Memory address %s, Data: %s" % (os.getpid(), id(data_obj.data), data_obj.data))
        return


api = falcon.API()

api.add_route('/', Testing())

Output example:

$ gunicorn --access-logfile - --capture-output --workers 4 --preload  testing:api
In Parent: Process Id: 26294, Memory address 139900785366512, Data: [1]
Starting gunicorn 19.6.0
Listening at: http://127.0.0.1:8000 (26294)
Using worker: sync
Booting worker with pid: 26300
Booting worker with pid: 26301
Booting worker with pid: 26302
Booting worker with pid: 26303
In Parent: Process Id: 26294, Memory address 139900785366512, Data: [1, 1]
In Parent: Process Id: 26294, Memory address 139900785366512, Data: [1, 1, 1]
In Parent: Process Id: 26294, Memory address 139900785366512, Data: [1, 1, 1, 1]

In worker: Process Id: 26300, Memory address 139900785366512, Data: [1]

In Parent: Process Id: 26294, Memory address 139900785366512, Data: [1, 1, 1, 1, 1]

In worker: Process Id: 26303, Memory address 139900785366512, Data: [1]

In Parent: Process Id: 26294, Memory address 139900785366512, Data: [1, 1, 1, 1, 1, 1]

In worker: Process Id: 26302, Memory address 139900785366512, Data: [1]

In Parent: Process Id: 26294, Memory address 139900785366512, Data: [1, 1, 1, 1, 1, 1, 1]

In worker: Process Id: 26301, Memory address 139900785366512, Data: [1]

As the output shows, the workers never see the latest value of "data_obj.data" from the parent. Each worker should have copy-on-write access to the parent's memory.

FYI: I am testing on a Unix system.

Most helpful comment

Once a process has been started, modifications to its memory are only visible in that process. That is, both parent and worker will copy-on-write data_obj.data. Forking is not the same thing as (bidirectional) shared memory, and copy-on-write is an invisible optimization; the concept is that the memory is copied immediately (or in other words, the child has a point-in-time view of the parent's memory). You cannot change data in a parent process and trivially expect it to be visible to a forked child.
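The point-in-time behaviour described above can be reproduced without Gunicorn. The following is a minimal sketch, assuming a Unix system where os.fork() is available; the function name compare_views_after_fork and the pipe-based handshake are hypothetical and exist only for illustration. It contrasts an ordinary list, which the child sees as it was at fork time, with a multiprocessing.Value, which is backed by explicitly shared memory. The multiprocessing.Value part is not something the issue or the comment proposes; it is included only to show what sharing looks like when it is requested explicitly.

```python
import multiprocessing
import os


def compare_views_after_fork():
    """Fork, update state in the parent, and report what the child sees.

    Returns (parent_len, child_len, parent_shared, child_shared).
    """
    plain = [1]                             # ordinary object: child gets a snapshot
    shared = multiprocessing.Value("i", 1)  # integer stored in shared memory

    ready_r, ready_w = os.pipe()    # parent -> child: "updates are done"
    result_r, result_w = os.pipe()  # child -> parent: what the child observed

    pid = os.fork()
    if pid == 0:
        # Child: block until the parent has finished its updates, then report.
        os.read(ready_r, 1)
        report = "%d %d" % (len(plain), shared.value)
        os.write(result_w, report.encode())
        os._exit(0)

    # Parent: both updates happen after the fork.
    plain.append(1)
    with shared.get_lock():
        shared.value += 1
    os.write(ready_w, b"x")

    child_len, child_shared = os.read(result_r, 64).decode().split()
    os.waitpid(pid, 0)
    for fd in (ready_r, ready_w, result_r, result_w):
        os.close(fd)
    return len(plain), int(child_len), shared.value, int(child_shared)


if __name__ == "__main__":
    parent_len, child_len, parent_shared, child_shared = compare_views_after_fork()
    print("list length:  parent=%d child=%d" % (parent_len, child_len))
    print("shared value: parent=%d child=%d" % (parent_shared, child_shared))
```

In this sketch the child still reports a list of length 1 after the parent has appended to it, while the shared integer reads 2 in both processes. The example in the issue likely has a second effect on top of this: fork() copies only the calling thread, so the threading.Timer thread keeps running in the parent only, and nothing in a worker ever calls set_data again.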
