Pyinstaller: multiprocessing spawn in python 3 does not work on macOS

Created on 16 Dec 2016  ·  85 Comments  ·  Source: pyinstaller/pyinstaller

I'm not able to get multiprocessing to work in spawn mode on macOS when using pyinstaller. The following script runs fine in the interpreter, but when using pyinstaller it enters an infinite recursive loop that keeps starting new processes.

I'm using Python 3 from Anaconda (Python 3.5.2 |Continuum Analytics, Inc.| (default, Jul 2 2016, 17:52:12) ), on macOS 10.12.2, with the development version of pyinstaller (3.3.dev0+483c819).

import sys
from multiprocessing import freeze_support, set_start_method, Process

def test():
    print('In subprocess')

if __name__ == '__main__':
    print(sys.argv,)
    freeze_support()
    set_start_method('spawn')

    print('In main')
    proc = Process(target=test)
    proc.start()
    proc.join()
OS X

All 85 comments

The multithreading tests are currently failing on Linux for everything except Python 3.4. My guess is that this is somehow related. I would help, but there's not much that I can do to debug this because I'm on Windows.

I can confirm the same using Homebrew-installed Python 3.6 and the current (11f2e7b) branch of pyinstaller.

This is a blocker for us. Is there anything we can do to help this issue get resolved?

Below is the output of print(sys.argv,):

['./dist/Pupil Capture.app/Contents/MacOS/pupil_capture']
['/Users/mkassner/Pupil/pupil_code/deployment/deploy_capture/dist/Pupil Capture.app/Contents/MacOS/pupil_capture', '-B', '-s', '-S', '-E', '-c', 'from multiprocessing.semaphore_tracker import main;main(7)']
['/Users/mkassner/Pupil/pupil_code/deployment/deploy_capture/dist/Pupil Capture.app/Contents/MacOS/pupil_capture', '--multiprocessing-fork', 'tracker_fd=8', 'pipe_handle=57']
['/Users/mkassner/Pupil/pupil_code/deployment/deploy_capture/dist/Pupil Capture.app/Contents/MacOS/pupil_capture', '-B', '-s', '-S', '-E', '-c', 'from multiprocessing.semaphore_tracker import main;main(8)']
['/Users/mkassner/Pupil/pupil_code/deployment/deploy_capture/dist/Pupil Capture.app/Contents/MacOS/pupil_capture', '--multiprocessing-fork', 'tracker_fd=9', 'pipe_handle=63']
['/Users/mkassner/Pupil/pupil_code/deployment/deploy_capture/dist/Pupil Capture.app/Contents/MacOS/pupil_capture', '-B', '-s', '-S', '-E', '-c', 'from multiprocessing.semaphore_tracker import main;main(9)']
['/Users/mkassner/Pupil/pupil_code/deployment/deploy_capture/dist/Pupil Capture.app/Contents/MacOS/pupil_capture', '-B', '-s', '-S', '-E', '-c', 'from multiprocessing.semaphore_tracker import main;main(7)']
['/Users/mkassner/Pupil/pupil_code/deployment/deploy_capture/dist/Pupil Capture.app/Contents/MacOS/pupil_capture', '--multiprocessing-fork', 'tracker_fd=10', 'pipe_handle=66']
['/Users/mkassner/Pupil/pupil_code/deployment/deploy_capture/dist/Pupil Capture.app/Contents/MacOS/pupil_capture', '--multiprocessing-fork', 'tracker_fd=9', 'pipe_handle=63']
['/Users/mkassner/Pupil/pupil_code/deployment/deploy_capture/dist/Pupil Capture.app/Contents/MacOS/pupil_capture', '-B', '-s', '-S', '-E', '-c', 'from multiprocessing.semaphore_tracker import main;main(8)']
['/Users/mkassner/Pupil/pupil_code/deployment/deploy_capture/dist/Pupil Capture.app/Contents/MacOS/pupil_capture', '-B', '-s', '-S', '-E', '-c', 'from multiprocessing.semaphore_tracker import main;main(7)']

We probably should get the output from the debug bootloaders. This is what we're looking for.

Edit: see instructions below.

No need to rename the bootloader. Just run PyInstaller with --debug -- see the fine manual.

I did. However, those recipes only apply to Windows, not macOS.

On Feb 9, 2017, Hartmut Goebel wrote:

@simongus Did you follow https://github.com/pyinstaller/pyinstaller/wiki/Recipe-Multiprocessing?



Hi,

Here is the output from running a bundle with --debug: https://gist.github.com/mkassner/e56c17e7c1c65e8a9d03ef411f7499b3

Let me know what else I can supply.

Actually not what I suspected. Let me see what this should look like...

Should be this.

I'm not sure I follow. Did I not provide the correct output, or is my gist indicative of something going wrong?

To reiterate: if I use the fork start method, it works. But I need to use spawn, and that triggers an endless loop.
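As an aside, the two start methods can be exercised side by side via multiprocessing.get_context(), which binds a start method to a context object instead of setting it globally. This sketch uses only the standard library; it is illustrative, not code from this thread:

```python
import multiprocessing as mp

def work():
    print('In subprocess')

if __name__ == '__main__':
    # get_context() selects a start method for one context object without
    # mutating the global default the way set_start_method() does, which
    # makes it easy to compare 'fork' and 'spawn' behaviour in one program.
    ctx = mp.get_context('spawn')
    proc = ctx.Process(target=work)
    proc.start()
    proc.join()
```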

I'm not sure I follow. Did I not provide the correct output, or is my gist indicative of something going wrong?

No. The information that I provided was for my own reference.

It appears that my suspicions about MEIPASS2 may have been correct. Try putting this before all other code:

import os
import sys

# Module multiprocessing is organized differently in Python 3.4+
# Python 3.4+
import multiprocessing.popen_spawn_posix as forking

# First define a modified version of Popen.
class _Popen(forking.Popen):
    def __init__(self, *args, **kw):
        if hasattr(sys, 'frozen'):
            # We have to set original _MEIPASS2 value from sys._MEIPASS
            # to get --onefile mode working.
            os.putenv('_MEIPASS2', sys._MEIPASS)
        try:
            super(_Popen, self).__init__(*args, **kw)
        finally:
            if hasattr(sys, 'frozen'):
                # On some platforms (e.g. AIX) 'os.unsetenv()' is not
                # available. In those cases we cannot delete the variable
                # but only set it to the empty string. The bootloader
                # can handle this case.
                if hasattr(os, 'unsetenv'):
                    os.unsetenv('_MEIPASS2')
                else:
                    os.putenv('_MEIPASS2', '')

# Second override 'Popen' class with our modified version.
forking.Popen = _Popen

I added what you asked for and can confirm that the modified Popen is being used, but it did not solve the endless startup loop problem.

Here is the log: https://gist.github.com/mkassner/029472f96b800ba0e42acfb1ff7208fe

Keeping what you have, place this before all other code:

import sys

def get_command_line(**kwds):
    print('DEBUG: get_command_line called')
    '''
    Returns prefix of command line used for spawning a child process
    '''
    return ([sys.executable, '--multiprocessing-fork'] +
            ['%s=%r' % item for item in kwds.items()])

import multiprocessing.spawn
multiprocessing.spawn.get_command_line = get_command_line

If this doesn't work, then I'll probably write a test for this and then I can probe it on travis...

Still not working, I'm afraid. Here is the log:

https://gist.github.com/mkassner/f76baf6b252c47dca08452628f56c825

This gist does not match your small example above: zmq is mentioned, while your example does not use it. Please test with your minimal example to avoid other problems interfering with this one.

Sorry about that.

I have rerun the correct script; see below:

import sys, os

def get_command_line(**kwds):
    print('DEBUG: get_command_line called')
    '''
    Returns prefix of command line used for spawning a child process
    '''
    return ([sys.executable, '--multiprocessing-fork'] +
            ['%s=%r' % item for item in kwds.items()])

import multiprocessing.spawn
multiprocessing.spawn.get_command_line = get_command_line


import os
import sys

# Module multiprocessing is organized differently in Python 3.4+
# Python 3.4+
import multiprocessing.popen_spawn_posix as forking

# First define a modified version of Popen.
class _Popen(forking.Popen):
    def __init__(self, *args, **kw):
        if hasattr(sys, 'frozen'):
            # We have to set original _MEIPASS2 value from sys._MEIPASS
            # to get --onefile mode working.
            os.putenv('_MEIPASS2', sys._MEIPASS)
        try:
            super(_Popen, self).__init__(*args, **kw)
        finally:
            if hasattr(sys, 'frozen'):
                # On some platforms (e.g. AIX) 'os.unsetenv()' is not
                # available. In those cases we cannot delete the variable
                # but only set it to the empty string. The bootloader
                # can handle this case.
                if hasattr(os, 'unsetenv'):
                    os.unsetenv('_MEIPASS2')
                else:
                    os.putenv('_MEIPASS2', '')

# Second override 'Popen' class with our modified version.
forking.Popen = _Popen

import sys
from multiprocessing import freeze_support, set_start_method, Process

def test():
    print('In subprocess')

if __name__ == '__main__':
    print(sys.argv,)
    freeze_support()
    set_start_method('spawn')

    print('In main')
    proc = Process(target=test)
    proc.start()
    proc.join()

The problem behaviour is the same.

The log output is here: https://gist.github.com/mkassner/7d1dda745ab33180510a98af34d7862f

Please let me know if there is anything else I can do to help with this issue.

PyInstaller is a great tool and we use it for deployment; unfortunately, we need to use spawn on macOS. We used billiard and Python 2 successfully with PyInstaller in the past. Now, with the migration of the project to Python 3, we are no longer able to bundle for Mac...

If you give me pointers, I'm happy to do more tests/investigations!

@mkassner If you're wondering why I haven't asked for additional information, it's because this is a harder problem and I need to figure out what might be happening before I proceed.

@xoviat fully understood. I know what you mean! We would be happy to put a bounty on this if we can expedite the process! (But I don't know if this is something the pyinstaller team is OK with...)

Alright: I've put something together that will (finally, hopefully) capture enough output to understand what is happening. As always, simply put this before all else and then post the output:

import os
import sys
from contextlib import suppress

pid = os.getpid()

def trace_calls(frame, event, arg):
    co = frame.f_code
    func_name = co.co_name
    if func_name == 'write':
        # Ignore write() calls from print statements
        return
    if event == 'return':
        # Check for 'return' before the 'call' guard, otherwise this
        # branch is unreachable.
        print('[{}] {} => {}'.format(pid, func_name, arg))
        return
    if event != 'call':
        return
    line_no = frame.f_lineno
    filename = co.co_filename

    if 'multiprocessing' not in filename:
        return

    #with suppress(Exception):
    print('[{}] Call to {} on line {} of {}'.format(pid, func_name, line_no, filename))

    for i in range(frame.f_code.co_argcount):
    #    with suppress(Exception):
        name = frame.f_code.co_varnames[i]
        if name in frame.f_locals:
            value = frame.f_locals[name]
        elif name in frame.f_globals:
            value = frame.f_globals[name]
        else:
            value = ''

        if type(value) is not str:
            with suppress(Exception):
                value = repr(value)

        if type(value) is not str:
            with suppress(Exception):
                value = str(value)

        if type(value) is not str:
            value = ''

        print("[{}]    Argument {} is {}".format(pid, name, value))


sys.settrace(trace_calls)

I bundled and ran the below code:

import os
import sys
from contextlib import suppress

pid = os.getpid()

def trace_calls(frame, event, arg):
    co = frame.f_code
    func_name = co.co_name
    if func_name == 'write':
        # Ignore write() calls from print statements
        return
    if event == 'return':
        # Check for 'return' before the 'call' guard, otherwise this
        # branch is unreachable.
        print('[{}] {} => {}'.format(pid, func_name, arg))
        return
    if event != 'call':
        return
    line_no = frame.f_lineno
    filename = co.co_filename

    if 'multiprocessing' not in filename:
        return

    #with suppress(Exception):
    print('[{}] Call to {} on line {} of {}'.format(pid, func_name, line_no, filename))

    for i in range(frame.f_code.co_argcount):
    #    with suppress(Exception):
        name = frame.f_code.co_varnames[i]
        if name in frame.f_locals:
            value = frame.f_locals[name]
        elif name in frame.f_globals:
            value = frame.f_globals[name]
        else:
            value = ''

        if type(value) is not str:
            with suppress(Exception):
                value = repr(value)

        if type(value) is not str:
            with suppress(Exception):
                value = str(value)

        if type(value) is not str:
            value = ''

        print("[{}]    Argument {} is {}".format(pid, name, value))


sys.settrace(trace_calls)




import sys, os

def get_command_line(**kwds):
    print('DEBUG: get_command_line called')
    '''
    Returns prefix of command line used for spawning a child process
    '''
    return ([sys.executable, '--multiprocessing-fork'] +
            ['%s=%r' % item for item in kwds.items()])

import multiprocessing.spawn
multiprocessing.spawn.get_command_line = get_command_line


import os
import sys

# Module multiprocessing is organized differently in Python 3.4+
# Python 3.4+
import multiprocessing.popen_spawn_posix as forking

# First define a modified version of Popen.
class _Popen(forking.Popen):
    def __init__(self, *args, **kw):
        if hasattr(sys, 'frozen'):
            # We have to set original _MEIPASS2 value from sys._MEIPASS
            # to get --onefile mode working.
            os.putenv('_MEIPASS2', sys._MEIPASS)
        try:
            super(_Popen, self).__init__(*args, **kw)
        finally:
            if hasattr(sys, 'frozen'):
                # On some platforms (e.g. AIX) 'os.unsetenv()' is not
                # available. In those cases we cannot delete the variable
                # but only set it to the empty string. The bootloader
                # can handle this case.
                if hasattr(os, 'unsetenv'):
                    os.unsetenv('_MEIPASS2')
                else:
                    os.putenv('_MEIPASS2', '')

# Second override 'Popen' class with our modified version.
forking.Popen = _Popen

import sys
from multiprocessing import freeze_support, set_start_method, Process

def test():
    print('In subprocess')

if __name__ == '__main__':
    print(sys.argv,)
    freeze_support()
    set_start_method('spawn')

    print('In main')
    proc = Process(target=test)
    proc.start()
    proc.join()

This is the output: https://gist.github.com/mkassner/b4cd2403dece04d926181f593736b5ce

I hope it gives insight!

Alright, let's try this:

import os
import sys
import traceback
from contextlib import suppress

pid = os.getpid()

def trace_calls(frame, event, arg):
    co = frame.f_code
    func_name = co.co_name
    if func_name == 'write':
        # Ignore write() calls from print statements
        return
    if event == 'return':
        # Check for 'return' before the 'call' guard, otherwise this
        # branch is unreachable.
        print('[{}] {} => {}'.format(pid, func_name, arg))
        return
    if event != 'call':
        return
    line_no = frame.f_lineno
    filename = co.co_filename

    c_frame = frame
    c_filename = filename
    c_line_no = line_no

    with suppress(Exception):
        for i in range(10):
            if not (c_line_no == line_no and c_filename == filename):
                break

            c_frame = c_frame.f_back  # walk up one caller per iteration
            c_line_no = c_frame.f_lineno
            c_filename = c_frame.f_code.co_filename
        else:
            c_line_no = 'UNKNOWN'
            c_filename = 'UNKNOWN'

    if 'multiprocessing' not in filename:
        return

    #with suppress(Exception):
    print('[{}] Call to {} on line {} of {} from {} of {}'.format(pid, func_name, line_no, filename, c_line_no, c_filename))

    for i in range(frame.f_code.co_argcount):
    #    with suppress(Exception):
        name = frame.f_code.co_varnames[i]
        if name in frame.f_locals:
            value = frame.f_locals[name]
        elif name in frame.f_globals:
            value = frame.f_globals[name]
        else:
            value = ''

        if type(value) is not str:
            with suppress(Exception):
                value = repr(value)

        if type(value) is not str:
            with suppress(Exception):
                value = str(value)

        if type(value) is not str:
            value = ''

        print("[{}]    Argument {} is {}".format(pid, name, value))


sys.settrace(trace_calls)

import sys, os

def get_command_line(**kwds):
    print('DEBUG: get_command_line called')
    '''
    Returns prefix of command line used for spawning a child process
    '''
    return ([sys.executable, '--multiprocessing-fork'] +
            ['%s=%r' % item for item in kwds.items()])

from multiprocessing.spawn import os, process, reduction, prepare

def _main(fd):
    with os.fdopen(fd, 'rb', closefd=True) as from_parent:
        process.current_process()._inheriting = True
        try:
            preparation_data = reduction.pickle.load(from_parent)
            prepare(preparation_data)
            self = reduction.pickle.load(from_parent)
        except Exception as e:
            print(traceback.format_exc())
        finally:
            del process.current_process()._inheriting
    return self._bootstrap()

import multiprocessing.spawn
multiprocessing.spawn.get_command_line = get_command_line
multiprocessing.spawn._main = _main

import os
import sys

# Module multiprocessing is organized differently in Python 3.4+
# Python 3.4+
import multiprocessing.popen_spawn_posix as forking

# First define a modified version of Popen.
class _Popen(forking.Popen):
    def __init__(self, *args, **kw):
        if hasattr(sys, 'frozen'):
            # We have to set original _MEIPASS2 value from sys._MEIPASS
            # to get --onefile mode working.
            os.putenv('_MEIPASS2', sys._MEIPASS)
        try:
            super(_Popen, self).__init__(*args, **kw)
        finally:
            if hasattr(sys, 'frozen'):
                # On some platforms (e.g. AIX) 'os.unsetenv()' is not
                # available. In those cases we cannot delete the variable
                # but only set it to the empty string. The bootloader
                # can handle this case.
                if hasattr(os, 'unsetenv'):
                    os.unsetenv('_MEIPASS2')
                else:
                    os.putenv('_MEIPASS2', '')

# Second override 'Popen' class with our modified version.
forking.Popen = _Popen

import sys
from multiprocessing import freeze_support, set_start_method, Process

def test():
    print('In subprocess')

if __name__ == '__main__':
    print(sys.argv,)
    freeze_support()
    set_start_method('spawn')

    print('In main')
    proc = Process(target=test)
    proc.start()
    proc.join()

What I am specifically suspecting is that there is an exception raised in _main.

I appear to have a preliminary solution here / here. All that remains is to boil it down and then include it in a runtime hook.

@xoviat I tried your solution briefly but still saw the same failure. Now it appears that your test repo is gone. Any ideas?

I see now that you just renamed the repo. I just ran the script as a bundle again, and I can see that we go into the subprocess, but I'm stuck in a loop. So I'm seeing endless prints of 'In subprocess' and 'In main'. Here is the log: https://gist.github.com/mkassner/a7a41d5abcd49258fe2c5bcea9324673

So I'm seeing endless prints of 'In subprocess' and 'In main'.

Would you mind providing the spec file?

So I'm seeing endless prints of 'In subprocess' and 'In main'

That doesn't happen on Travis; I assume because it's not a bundle.

I'm using Python 3.6 installed via Homebrew. This is how I'm running your script (which I named myapp.py):

pyinstaller myapp.py
./dist/myapp/myapp

Below is the spec file myapp.spec:

# -*- mode: python -*-

block_cipher = None


a = Analysis(['myapp.py'],
             pathex=['/Users/mkassner/Pupil'],
             binaries=[],
             datas=[],
             hiddenimports=[],
             hookspath=[],
             runtime_hooks=[],
             excludes=[],
             win_no_prefer_redirects=False,
             win_private_assemblies=False,
             cipher=block_cipher)
pyz = PYZ(a.pure, a.zipped_data,
             cipher=block_cipher)
exe = EXE(pyz,
          a.scripts,
          exclude_binaries=True,
          name='myapp',
          debug=False,
          strip=False,
          upx=True,
          console=True )
coll = COLLECT(exe,
               a.binaries,
               a.zipfiles,
               a.datas,
               strip=False,
               upx=True,
               name='myapp')

@xoviat - any further insight/update on this issue?

@htgoebel A bit of help please. Here is the test that I am using in tests/functional/test_runtime.py:

from pyinstaller.utils.tests import skipif_notosx

@skipif_notosx
def test_issue_2322(pyi_builder, capfd):
    pyi_builder.test_source(
        """
        from multiprocessing import set_start_method, Process
        from multiprocessing.spawn import freeze_support
        def test():
            print('In subprocess')
        if __name__ == '__main__':
            freeze_support()
            set_start_method('spawn')
            print('In main')
            proc = Process(target=test)
            proc.start()
            proc.join()
        """)

    out, err = capfd.readouterr()
    assert "In main\nIn subprocess" in out

And here is the collection failure:

=========================== short test summary info ===========================
ERROR tests/functional/test_runtime.py
=================================== ERRORS ====================================
______________ ERROR collecting tests/functional/test_runtime.py ______________
tests\functional\test_runtime.py:13: in <module>
    from pyinstaller.utils.tests import skipif_notosx
pyinstaller.py:15: in <module>
    run()
PyInstaller\__main__.py:70: in run
    args = parser.parse_args(pyi_args)
c:\python36-x64\lib\argparse.py:1733: in parse_args
    self.error(msg % ' '.join(argv))
c:\python36-x64\lib\argparse.py:2389: in error
    self.exit(2, _('%(prog)s: error: %(message)s\n') % args)
c:\python36-x64\lib\argparse.py:2376: in exit
    _sys.exit(status)
E SystemExit: 2
------------------------------- Captured stderr -------------------------------
usage: py.test [-h] [-v] [-D] [-F] [--specpath DIR] [-n NAME]
               [--add-data <SRC;DEST or SRC:DEST>]
               [--add-binary <SRC;DEST or SRC:DEST>] [-p DIR]
               [--hidden-import MODULENAME] [--additional-hooks-dir HOOKSPATH]
               [--runtime-hook RUNTIME_HOOKS] [--exclude-module EXCLUDES]
               [--key KEY] [-d] [-s] [--noupx] [-c] [-w]
               [-i <FILE.ico or FILE.exe,ID or FILE.icns>] [--version-file FILE]
               [-m <FILE or XML>] [-r RESOURCE] [--uac-admin] [--uac-uiaccess]
               [--win-private-assemblies] [--win-no-prefer-redirects]
               [--osx-bundle-identifier BUNDLE_IDENTIFIER] [--distpath DIR]
               [--workpath WORKPATH] [-y] [--upx-dir UPX_DIR] [-a] [--clean]
               [--log-level LEVEL]
               scriptname [scriptname ...]
py.test: error: unrecognized arguments: --maxfail --durations=10 --junitxml=junit-results.xml tests/unit tests/functional -k not tests/functional/test_libraries.py

What am I doing wrong here?

@xoviat Looks like pytest does not like the arguments (but could be more precise). Did you pip install -r tests/requirements-tools.txt? Or maybe the "not …" needs to be quoted.

The failure occurs in an AppVeyor job, and the decorator was copied.

Update: I'm an idiot. Python is case sensitive.

I'm still getting the same error. Any idea why?

@mkassner Unfortunately, it looks like a few tests fail on Python 3.6; see Travis CI.

@willpatera See #2428.

@mkassner Currently waiting on #2508 to be merged, so that I can validate the test.

@mkassner You can see the test that I am using on the associated PR. If you are still having issues, then you will need to submit a new PR with a test case that will fail on travis mac. You can checkout my 'osx' branch to get mac testing enabled, as it will be merged.

@xoviat I have just installed your osx branch and tested with the following simple myapp.py example from @mkassner's comment above, with Python 3.6 installed on macOS 10.12.3:

import sys
from multiprocessing import freeze_support, set_start_method, Process

def test():
    print('In subprocess')

if __name__ == '__main__':
    print(sys.argv,)
    freeze_support()
    set_start_method('spawn')

    print('In main')
    proc = Process(target=test)
    proc.start()
    proc.join()

And created the app with:

pyinstaller myapp.py

When trying to execute the app, I get the same error as @mkassner posted.

@willpatera When you run the test, does it fail?

Hi @xoviat, I ran py.test from your branch with the following results:

No, the command is:

py.test tests/functional/test_runtime.py::test_issue_2322

Hi @xoviat, I just ran the test above, and it passed with the following output:

platform darwin -- Python 3.6.0, pytest-3.0.7, py-1.4.33, pluggy-0.4.0 -- /Users/wrp/.virtualenvs/pyinstaller/bin/python3.6
cachedir: .cache
rootdir: /Users/wrp/Desktop/Sandbox/pyinstaller_test/pyinstaller, inifile: setup.cfg
collected 2 items 

tests/functional/test_regression.py::test_issue_2492 PASSED

@willpatera I revised my comment shortly after posting it.

@xoviat

py.test tests/functional/test_runtime.py::test_issue_2322

results in

platform darwin -- Python 3.6.0, pytest-3.0.7, py-1.4.33, pluggy-0.4.0 -- /Users/wrp/.virtualenvs/pyinstaller/bin/python3.6
cachedir: .cache
rootdir: /Users/wrp/Desktop/Sandbox/pyinstaller_test/pyinstaller, inifile: setup.cfg
collecting 2 items
=================================== no tests ran in 0.01 seconds ===================================
ERROR: not found: /Users/wrp/Desktop/Sandbox/pyinstaller_test/pyinstaller/tests/functional/test_runtime.py::test_issue_2322
(no name '/Users/wrp/Desktop/Sandbox/pyinstaller_test/pyinstaller/tests/functional/test_runtime.py::test_issue_2322' in any of [<Module 'tests/functional/test_runtime.py'>])

@xoviat test_issue_2322 passes on the rth_mac_spawn branch (note that test_issue_2322 does not exist on the osx branch).

When using the rth_mac_spawn branch with the my_app.py example from above, there is an infinite loop.

@willpatera The test that I am using appears to be almost identical to the example you gave:

@skipif_notosx
 @xfail_py2
 def test_issue_2322(pyi_builder, capfd):
     pyi_builder.test_source(
         """
         from multiprocessing import set_start_method, Process
         from multiprocessing.spawn import freeze_support
         from multiprocessing.util import log_to_stderr

         def test():
             print('In subprocess')

         if __name__ == '__main__':
             log_to_stderr()
             freeze_support()
             set_start_method('spawn')

             print('In main')
             proc = Process(target=test)
             proc.start()
             proc.join()
         """)

     out, err = capfd.readouterr()

     # Print the captured output and error so that it will show up in the test output.
     sys.stderr.write(err)
     sys.stdout.write(out)

     expected = ["In main", "In subprocess"]

     assert "\n".join(expected) in out
     for substring in expected:
         assert out.count(substring) == 1

I'm sorry, but I have no other way of knowing whether the problem is fixed. You need to write some kind of test that demonstrates the problem on Travis macOS so that we can all be on the same page, or figure out why my test passes but your program fails. Note that this supposedly creates a program in a tempdir that you should be able to run (use a Process Monitor equivalent on Mac to see where the program is created).

note test_issue_2322 does not exist on the osx branch

That's because the osx branch has nothing to do with this problem. If you look at the pull request, it's about testing on the main repository rather than the pyinstaller_osx_tests repository. This is #2505.

Oh, maybe it's that you have:

from multiprocessing import freeze_support, set_start_method, Process

And I have:

from multiprocessing.spawn import freeze_support

I could alias that in the runtime hook.

@xoviat - Thanks for this update; your import statement works 👍

from multiprocessing.spawn import freeze_support

@htgoebel What do you think about overriding the set_start_method with a custom function that calls the correct freeze_support?

@xoviat Worth a discussion.

Alright, this should be repaired with the update. The new method is:

         from multiprocessing import set_start_method, Process, freeze_support

         def test():
             print('In subprocess')

         if __name__ == '__main__':
             freeze_support()
             set_start_method('spawn')

             print('In main')
             proc = Process(target=test)
             proc.start()
             proc.join()

@xoviat I just tried your current branch https://github.com/xoviat/pyinstaller/commits/rth_mac_spawn; in one of the commits from Apr 5, 2017 there is a regression.

The error I'm getting in the bootstrap phase is something like this:

File "main.py", line 251, in <module>
  File "multiprocessing/spawn.py", line 74, in freeze_support
  File "multiprocessing/spawn.py", line 105, in spawn_main
  File "multiprocessing/spawn.py", line 114, in _main
  File "multiprocessing/spawn.py", line 225, in prepare
  File "multiprocessing/spawn.py", line 277, in _fixup_main_from_path
  File "runpy.py", line 261, in run_path
  File "runpy.py", line 231, in _get_code_from_file
FileNotFoundError: [Errno 2] No such file or directory: '/Users/mkassner/Pupil/pupil_code/deployment/deploy_capture/main.py'
Failed to execute script main

When I check out c586d17f1a87f31d080a15b375bf76345dc29bd0 on your branch, the error is gone.

Thanks. This is on hold until #2508 is merged.

This is also an issue with spawn mode on Linux (and other POSIX systems, I assume). I can verify that #2505 fixes my issue when I modify the runtime hook to run on all POSIX OSes.

Additionally, this runtime hook incorporates most (all?) of the strategy used in the Multiprocessing Recipe. I have yet to verify that this also supports spawn mode on Windows; I will do so shortly. If so, I could imagine modifying the hook to run on all the OSes using spawn mode, importing the proper forking implementation but otherwise leaving it unmodified.

Another data point: this breaks semaphores on Linux, and likely on macOS as well (I will check when I get home and have a Mac to test with). The test case in #2505 works fine, but the following throws an error:

from multiprocessing import set_start_method, Process
from multiprocessing import freeze_support
from multiprocessing.util import log_to_stderr
import multiprocessing as mp

def test(s):
    s.acquire()
    print('In subprocess')
    s.release()

if __name__ == '__main__':
    log_to_stderr()
    freeze_support()
    set_start_method('spawn')

    print('In main')
    s = mp.Semaphore()
    s.acquire()
    proc = Process(target=test, args = [s])
    proc.start()
    s.release()
    proc.join()
brian@tanet:~/src/cytoflow/mp_test$ dist/test_semaphore/test_semaphore 
In main
In subprocess
Traceback (most recent call last):
  File "multiprocessing/util.py", line 252, in _run_finalizers
  File "multiprocessing/util.py", line 185, in __call__
  File "multiprocessing/synchronize.py", line 89, in _cleanup
  File "multiprocessing/semaphore_tracker.py", line 73, in unregister
  File "multiprocessing/semaphore_tracker.py", line 82, in _send
BrokenPipeError: [Errno 32] Broken pipe

It seems like the SemaphoreTracker process isn't around, which is where the BrokenPipeError comes from.

So I've been poking at this all evening, and it turns out the semaphore tracker process is actually the culprit behind the original issue.

Background
When in spawn mode on a POSIX OS (Mac, Linux), Python 3.4+ uses a separate process, the semaphore tracker, to keep track of all the named semaphores that the program creates. This is because named semaphores are a finite resource, and if they aren't freed by the program, they won't be freed until the next reboot. So we don't want to leak them.

Underlying issue
Unfortunately, Python is stupid about how it creates this background process: it basically spawns another copy of the executable with -c from multiprocessing.semaphore_tracker import main;main(%d) on the command line. (The %d is interpolated with the file descriptor of the pipe over which the process receives the names of all of the named semaphores. The call also includes some other flags, which is what @xoviat's proposed fix was looking for.) This is fine when the executable is the Python interpreter, but in a frozen process it creates another copy of the frozen process.
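To make this concrete, the tracker launch roughly amounts to the following sketch (a paraphrase of how CPython 3.5-3.7 builds the command, not the verbatim source; the fd value here is hypothetical):

```python
import sys

# Hypothetical read end of the pipe the tracker will listen on.
fd = 7

# CPython builds a -c payload that re-enters the tracker's main loop,
# then launches sys.executable with it plus some interpreter flags.
cmd = 'from multiprocessing.semaphore_tracker import main;main(%d)' % fd
args = [sys.executable, '-B', '-s', '-S', '-E', '-c', cmd]

# Under a normal interpreter, sys.executable is python3 and this is fine.
# In a frozen app, sys.executable is the bundled binary itself, so running
# `args` re-launches the whole application -- hence the infinite loop.
```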

Usually in spawn mode, newly spawned processes have --multiprocessing-fork on the command line, as well as a handle to a pipe over which they receive the pickled Process they are to run. This magic gets caught by the call to freeze_support(), which should be at the beginning of __main__. Unfortunately, the code that spawns the semaphore tracker doesn't use this magic. Thus the infinite loop of new processes.
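The check that freeze_support() relies on is simple; the real implementation is multiprocessing.spawn.is_forking(), which behaves roughly like this sketch (the argv values are taken from the logs at the top of this issue):

```python
def is_forking(argv):
    # Mirrors multiprocessing.spawn.is_forking(): a spawned worker is
    # identified purely by its second command-line argument.
    return len(argv) >= 2 and argv[1] == '--multiprocessing-fork'

# A worker spawned the "blessed" way -- freeze_support() intercepts this:
worker_argv = ['/path/to/app', '--multiprocessing-fork',
               'tracker_fd=8', 'pipe_handle=57']

# The semaphore tracker launch -- freeze_support() does NOT recognize this,
# so the frozen app simply runs __main__ again:
tracker_argv = ['/path/to/app', '-B', '-s', '-S', '-E', '-c',
                'from multiprocessing.semaphore_tracker import main;main(7)']
```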

Proposed fix
I can think of two. The first is to override freeze_support on POSIX platforms to look for the particular content of the -c flag that we know the spawn code will use to try to start the semaphore tracker, and start the tracker ourselves.

The second is to override SemaphoreTracker.ensure_running, which is the function that actually starts the tracker, to use the --multiprocessing-fork magic (on POSIX OSes in spawn or forkserver mode). These approaches may not be 100% mutually exclusive.
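A minimal sketch of the first approach — every name here is hypothetical, and the regex assumes the exact -c payload shown in the logs above (Python 3.4-3.7; 3.8 renamed the module to resource_tracker):

```python
import re
import sys

# Matches the -c payload CPython uses to start the tracker and captures
# the pipe file descriptor it passes to main().
_TRACKER_RE = re.compile(
    r'^from multiprocessing\.semaphore_tracker import main;main\((\d+)\)$')

def tracker_fd(argv):
    """Return the tracker's pipe fd if argv is a semaphore-tracker
    invocation of the frozen executable, else None."""
    for arg in argv:
        m = _TRACKER_RE.match(arg)
        if m:
            return int(m.group(1))
    return None

def maybe_run_tracker():
    # Hypothetical runtime-hook entry point: if this process was launched
    # to be the tracker, serve the tracker loop and exit instead of
    # re-running the whole application.
    fd = tracker_fd(sys.argv)
    if fd is not None:
        from multiprocessing.semaphore_tracker import main
        main(fd)
        sys.exit(0)
```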

I've been going at this all afternoon and evening. This is a blocker for me too; hopefully I'll get back to it this weekend or next week. @xoviat, you are also more than welcome to have a go.

I can confirm the semaphore tracker issue on MacOS as well.

@bpteague were you able to devise a solution on these two issues? We use pyinstaller for deployment but Im not familiar enough with the internals to fix the issue.

@mkassner no, I'm still working on it. With a good handle on the underlying issue, a fix shouldn't be too far away...

(Another note: PyInstaller is kind of a bystander here. The real issue is with the Python multiprocessing library: spawn mode is broken in all frozen executables on POSIX OSes.)

I think I've fixed the hook. I haven't tested it yet except in the short example above. I will test it this evening in my larger program on more platforms (mac, windows, linux).

Here's a gist:
https://gist.github.com/bpteague/bc574e4d92c9d9904ceba895fb694fc8

Edit: hold off on using this gist. Updates coming soon.

I've updated the gist and confirmed that it works for me on Linux and MacOS. I have yet to try it with fork and forkserver mode. YMMV.

I can also confirm that the gist above works for me on Windows. I still haven't tried fork or forkserver mode on POSIX OSes.

I can confirm that this runtime hook works on both macOS and Linux, Python 3.6.

This saved me so much hassle. Thank you @bpteague.
@xoviat I recommend using this fix!

@bpteague Can you submit a PR?

I will close my PR on this issue as soon as a replacement is submitted.

#2505, which says it closes this issue, is merged.

I just ran across this issue myself, it looks like #2759 did not fix the issue for all cases.

OS: MacOS Mojave (v10.14.1)
Python: Anaconda 3.7.1 (packaged by conda-forge, also happens with 3.6)
PyInstaller: v3.4

My program only uses multiprocessing through scikit-learn; it does not use it directly.

My main.py file:

from multiprocessing import set_start_method, Process
from multiprocessing import freeze_support
import my_package  # this package uses `scikit-learn`
import gooey

gui = gooey.Gooey()(my_package.main)

if __name__ == '__main__':
    freeze_support()
    set_start_method('fork')
    proc = Process(target=gui)
    proc.start()
    proc.join()

And my main.spec file:

# -*- mode: python -*-
import gooey
import os
gooey_root = os.path.dirname(gooey.__file__)
gooey_languages = Tree(os.path.join(gooey_root, 'languages'), prefix = 'gooey/languages')
gooey_images = Tree(os.path.join(gooey_root, 'images'), prefix = 'gooey/images')

a = Analysis(
  ['main.py'],
  hiddenimports=[
    'cython',
    'sklearn',
    'sklearn.neighbors.typedefs',
    'sklearn.neighbors.quad_tree',
    'sklearn.tree._utils'
  ],
  hookspath=None,
  runtime_hooks=None,
)

pyz = PYZ(
  a.pure
)

options = [('u', None, 'OPTION')]

exe = EXE(
  pyz,
  a.scripts,
  a.binaries,
  a.zipfiles,
  a.datas,
  options,
  gooey_languages,
  gooey_images,
  app_images,
  name='my_gui',
  debug=False,
  strip=None,
  upx=True,
  console=False
)

Just to make sure the pyi_rth_multiprocessing.py code is running, I placed a print(sys.argv) on line 32 and it produced:

['/path/to/dist/my_gui', '-B', '-s', '-S', '-E', '-c', 'from multiprocessing.semaphore_tracker import main;main(5)']

So the if statement is catching it correctly, but it is still spawning infinite processes.

Am I missing something in my code to make the rthook work as expected?

Edit:
Turning on debug mode and running the program results in:

[20795] PyInstaller Bootloader 3.x
[20795] LOADER: executable is /path/to/dist/my_gui
[20795] LOADER: homepath is /path/to/dist
[20795] LOADER: _MEIPASS2 is NULL
[20795] LOADER: archivename is /path/to/dist/my_gui
[20795] LOADER: Extracting binaries
[20795] LOADER: Executing self as child
[20795] LOADER: set _MEIPASS2 to /var/folders/7h/bqv27y4n3qv3y3c8_l_9944r0000gp/T/_MEIMtpkeG
[20795] LOADER [ARGV_EMU]: AppleEvent - processing...
[20795] LOADER [ARGV_EMU]: AppleEvent - installed handler.
[20795] LOADER [ARGV_EMU]: AppleEvent - calling ReceiveNextEvent
[20795] LOADER [ARGV_EMU]: ReceiveNextEvent got an event
[20795] LOADER [ARGV_EMU]: processing events failed
[20795] LOADER [ARGV_EMU]: Out of the event loop.
[20795] LOADER: Registering signal handlers
[20800] PyInstaller Bootloader 3.x
[20800] LOADER: executable is /path/to/dist/my_gui
[20800] LOADER: homepath is /path/to/dist/
[20800] LOADER: _MEIPASS2 is /var/folders/7h/bqv27y4n3qv3y3c8_l_9944r0000gp/T/_MEIMtpkeG
[20800] LOADER: archivename is /path/to/dist/my_gui
[20800] LOADER: Already in the child - running user's code.
[20800] LOADER: Python library: /var/folders/7h/bqv27y4n3qv3y3c8_l_9944r0000gp/T/_MEIMtpkeG/libpython3.7m.dylib
[20800] LOADER: Loaded functions from Python library.
[20800] LOADER: Manipulating environment (sys.path, sys.prefix)
[20800] LOADER: sys.prefix is /var/folders/7h/bqv27y4n3qv3y3c8_l_9944r0000gp/T/_MEIMtpkeG
[20800] LOADER: Pre-init sys.path is /var/folders/7h/bqv27y4n3qv3y3c8_l_9944r0000gp/T/_MEIMtpkeG/base_library.zip:/var/folders/7h/bqv27y4n3qv3y3c8_l_9944r0000gp/T/_MEIMtpkeG
[20800] LOADER: Setting runtime options
[20800] LOADER: Runtime option: u
[20800] LOADER: Initializing python
[20800] LOADER: Overriding Python's sys.path
[20800] LOADER: Post-init sys.path is /var/folders/7h/bqv27y4n3qv3y3c8_l_9944r0000gp/T/_MEIMtpkeG/base_library.zip:/var/folders/7h/bqv27y4n3qv3y3c8_l_9944r0000gp/T/_MEIMtpkeG
[20800] LOADER: Setting sys.argv
[20800] LOADER: setting sys._MEIPASS
[20800] LOADER: importing modules from CArchive
[20800] LOADER: extracted struct
[20800] LOADER: callfunction returned...
[20800] LOADER: extracted pyimod01_os_path
[20800] LOADER: callfunction returned...
[20800] LOADER: extracted pyimod02_archive
[20800] LOADER: callfunction returned...
[20800] LOADER: extracted pyimod03_importers
[20800] LOADER: callfunction returned...
[20800] LOADER: Installing PYZ archive with Python modules.
[20800] LOADER: PYZ archive: PYZ-00.pyz
[20800] LOADER: Running pyiboot01_bootstrap.py
[20800] LOADER: Running pyi_rth_pkgres.py
[20800] LOADER: Running pyi_rth_multiprocessing.py
[20800] LOADER: Running main.py

repeated over and over (once for each of the processes spawned).

I am on the same specs as @CKrawczyk, and also run into the same problem. Will try to boil it down to a simple example.

import sklearn
import subprocess

print('start')
proc = subprocess.Popen(['sleep','1'])
proc.wait()
print('stop')

This script, when frozen with PyInstaller, will continuously relaunch itself. Commenting out import sklearn fixes it. What could this be?

Yep, same here. sklearn 0.20 seems to have a problem :(

Same here, sklearn 0.19.2 and 0.20.2

Just call multiprocessing.freeze_support() first, near the top of your program.

@kingkastle To be on the safe side, the call to freeze_support() must be done prior to all other imports. Move it as far to the top as possible.

Hey @htgoebel, according to the Python docs it should be the first thing after if __name__ == '__main__':, which I think is what I have done.

@kingkastle The docs clearly say "For example:".

OK @htgoebel I deleted my comment with the main.py suggestion, so you could provide the example that other folks should use.

I hope this thread is still active because I have the same problem.

I have a Python3/PyQT5 app, frozen using PyInstaller. On macOS my Python3 code runs fine from a CLI. The version frozen using PyInstaller also runs successfully but creates new copies of itself every couple of seconds in what looks like an infinite loop.

By way of background I have successfully frozen this app on Ubuntu 18.05 and Windows 10 without seeing this problem.

I am running Python 3.7.2 with PyInstaller 3.4 and PyQT5 5.12, on macOS Mojave 10.14.3.

Given the Python code runs perfectly from the CLI and from PyCharm, and the frozen package runs (but with multiple copies of itself) it looks like the problem is somewhere in the freezing process. I am not directly using multiprocessing but am using scikit-learn that does, I believe, use multiprocessing. I have tried calling freeze_support() at the top, and various other places but none fix the problem.

Like CKrawczyk my .spec file does have sklearn as a hidden import. I know that freeze_support() should be called before sklearn so I wonder if having sklearn as a hidden import in the .spec file is somehow causing the problem? Other than that I have hit a brick wall. Can anyone suggest how we can get to the bottom of this problem please?

I'm in the same boat as @CliffMitchell. My PyQt5 app (I use this unofficial conversion for Python 2.7, https://github.com/pyqt/python-qt5, as the core library of the project, platformio, does not support Python 3+) runs into an infinite loop and just keeps spawning new copies of the main window forever.

I have checked experimentally that the cause of the issue is a call to subprocess.call inside the platformio library. I have of course used the infamous multiprocessing.freeze_support() call without any success. I have also gone through the PyInstaller recipes in the repo for the subprocess and multiprocessing libraries.

At this point, I really don't know what else to do. The app works perfectly fine in PyCharm and this problem only arises once called from the .exe generated by Pyinstaller (happens also with --onedir option outputting a folder version of the program)

Any help would be appreciated.

Many thanks

I solved it!

mherrmann/fbs#87 (comment)

Thanks! This actually worked for me:

Before:

multiprocessing.set_start_method('spawn')

After:

multiprocessing.freeze_support()
multiprocessing.set_start_method('spawn')
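For reference, a minimal entry-point pattern with that ordering (a sketch only; the worker function and queue are illustrative, not from the original report):

```python
import multiprocessing as mp

def worker(q):
    # Runs in the child process; report back through the queue.
    q.put('In subprocess')

if __name__ == '__main__':
    # freeze_support() must come before any other multiprocessing call so
    # that a frozen child is intercepted before the app re-launches itself.
    mp.freeze_support()
    mp.set_start_method('spawn')

    q = mp.Queue()
    p = mp.Process(target=worker, args=(q,))
    p.start()
    print(q.get())
    p.join()
```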