Pymc3: Multiprocessing fails when sampling multiple chains using multiple cores

Created on 7 Aug 2018 · 27 Comments · Source: pymc-devs/pymc3

Since PR https://github.com/pymc-devs/pymc3/pull/3011 I have been having trouble sampling multiple chains with multiple cores. In a Jupyter notebook I get _random_ kernel shutdowns, so I haven't managed to pinpoint the problem (it seems that the more complicated the model, the higher the crash rate). However, I found a systematic issue when using the plain Python interpreter (not the Jupyter kernel): if I sample more than one chain using more than one core (say, 2 chains and 2 cores), Python crashes. Sampling multiple chains with 1 core, or 1 chain with multiple cores, is fine. The minimal example below does not cause any problems in a Jupyter notebook.

The minimal example is attached (please run it as a script, and not on a Jupyter kernel):

import numpy as np
import pandas as pd

import theano
import pymc3 as pm

print('*** Start script ***')
print(f'{pm.__name__}: v. {pm.__version__}')
print(f'{theano.__name__}: v. {theano.__version__}')

SEED = 20180730
np.random.seed(SEED)

# Generate data
mu_real = 0
sd_real = 1
n_samples = 1000
y = np.random.normal(loc=mu_real, scale=sd_real, size=n_samples)

# Bayesian modelling
with pm.Model() as model:

    mu = pm.Normal('mu', mu=0, sd=10)
    sd = pm.HalfNormal('sd', sd=10)

    # Likelihood
    likelihood = pm.Normal('likelihood', mu=mu, sd=sd, observed=y)    
    trace = pm.sample(chains=2, cores=2, random_seed=SEED)

print('Done!')

Running with chains=2 and cores=2 throws the error:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Miniconda3\envs\bayes\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "C:\Miniconda3\envs\bayes\lib\multiprocessing\spawn.py", line 114, in _main
    prepare(preparation_data)
  File "C:\Miniconda3\envs\bayes\lib\multiprocessing\spawn.py", line 225, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "C:\Miniconda3\envs\bayes\lib\multiprocessing\spawn.py", line 277, in _fixup_main_from_path
Traceback (most recent call last):
  File "test_multicore_multichain.py", line 28, in <module>
    run_name="__mp_main__")
trace = pm.sample(chains=2, cores=2, random_seed=SEED)  File "C:\Miniconda3\envs\bayes\lib\runpy.py", line 263, in run_path

  File "d:\dev\pymc3\pymc3\sampling.py", line 451, in sample
    pkg_name=pkg_name, script_name=fname)
  File "C:\Miniconda3\envs\bayes\lib\runpy.py", line 96, in _run_module_code
    trace = _mp_sample(**sample_args)
      File "d:\dev\pymc3\pymc3\sampling.py", line 998, in _mp_sample
mod_name, mod_spec, pkg_name, script_name)
  File "C:\Miniconda3\envs\bayes\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Users\moran\Desktop\test_multicore_multichain.py", line 28, in <module>
    chain, progressbar)
trace = pm.sample(chains=2, cores=2, random_seed=SEED)  File "d:\dev\pymc3\pymc3\parallel_sampling.py", line 275, in __init__

  File "d:\dev\pymc3\pymc3\sampling.py", line 451, in sample
    for chain, seed, start in zip(range(chains), seeds, start_points)
  File "d:\dev\pymc3\pymc3\parallel_sampling.py", line 275, in <listcomp>
    trace = _mp_sample(**sample_args)
for chain, seed, start in zip(range(chains), seeds, start_points)  File "d:\dev\pymc3\pymc3\sampling.py", line 998, in _mp_sample

  File "d:\dev\pymc3\pymc3\parallel_sampling.py", line 182, in __init__
    self._process.start()
  File "C:\Miniconda3\envs\bayes\lib\multiprocessing\process.py", line 105, in start
    chain, progressbar)
self._popen = self._Popen(self)  File "d:\dev\pymc3\pymc3\parallel_sampling.py", line 275, in __init__

  File "C:\Miniconda3\envs\bayes\lib\multiprocessing\context.py", line 223, in _Popen
    for chain, seed, start in zip(range(chains), seeds, start_points)
      File "d:\dev\pymc3\pymc3\parallel_sampling.py", line 275, in <listcomp>
return _default_context.get_context().Process._Popen(process_obj)
  File "C:\Miniconda3\envs\bayes\lib\multiprocessing\context.py", line 322, in _Popen
    for chain, seed, start in zip(range(chains), seeds, start_points)
      File "d:\dev\pymc3\pymc3\parallel_sampling.py", line 182, in __init__
return Popen(process_obj)
  File "C:\Miniconda3\envs\bayes\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
    self._process.start()
reduction.dump(process_obj, to_child)  File "C:\Miniconda3\envs\bayes\lib\multiprocessing\process.py", line 105, in start

  File "C:\Miniconda3\envs\bayes\lib\multiprocessing\reduction.py", line 60, in dump
    self._popen = self._Popen(self)
ForkingPickler(file, protocol).dump(obj)  File "C:\Miniconda3\envs\bayes\lib\multiprocessing\context.py", line 223, in _Popen

BrokenPipeError:     [Errno 32] Broken pipereturn _default_context.get_context().Process._Popen(process_obj)

  File "C:\Miniconda3\envs\bayes\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "C:\Miniconda3\envs\bayes\lib\multiprocessing\popen_spawn_win32.py", line 33, in __init__
    prep_data = spawn.get_preparation_data(process_obj._name)
  File "C:\Miniconda3\envs\bayes\lib\multiprocessing\spawn.py", line 143, in get_preparation_data
    _check_not_importing_main()
  File "C:\Miniconda3\envs\bayes\lib\multiprocessing\spawn.py", line 136, in _check_not_importing_main
    is not going to be frozen to produce an executable.''')
RuntimeError:
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.

The interesting thing is that the print statements in the script are duplicated (which does not happen when chains=2 and cores=1, or chains=1 and cores=2):

*** Start script ***
pymc3: v. 3.5
theano: v. 1.0.2
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [sd, mu]
*** Start script ***
pymc3: v. 3.5
theano: v. 1.0.2
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [sd, mu]

I am on master on both PyMC3 and Theano.

  • PyMC3 Version: 3.5
  • Theano Version: 1.0.2
  • Python Version: 3.6.6
  • Operating system: Windows 10
winOS

Most helpful comment

@Jeff-Winchell have you tried running the suggestion by @aseyboldt? These are all valid suggestions; the productive thing to do is to try them first. Also, name-dropping is not a valid way to have a productive conversation.

We do not appreciate these hostile attitudes towards our developers/users. If you keep doing this (either privately or publicly), I will have to block and report you according to our community guidelines.

All 27 comments

Possibly Windows related... @aseyboldt

Yes, this looks like an issue with multiprocessing on Windows.

Can you try this:

import numpy as np
import pandas as pd

import theano
import pymc3 as pm

print('*** Start script ***')
print(f'{pm.__name__}: v. {pm.__version__}')
print(f'{theano.__name__}: v. {theano.__version__}')

if __name__ == '__main__':
    SEED = 20180730
    np.random.seed(SEED)

    # Generate data
    mu_real = 0
    sd_real = 1
    n_samples = 1000
    y = np.random.normal(loc=mu_real, scale=sd_real, size=n_samples)

    # Bayesian modelling
    with pm.Model() as model:

        mu = pm.Normal('mu', mu=0, sd=10)
        sd = pm.HalfNormal('sd', sd=10)

        # Likelihood
        likelihood = pm.Normal('likelihood', mu=mu, sd=sd, observed=y)    
        trace = pm.sample(chains=2, cores=2, random_seed=SEED)

    print('Done!')

But I don't really understand why it has trouble in the notebook. Can you post the versions of pyzmq, jupyter and ipython?

If I use the _if_ statement then the sampling works. Still, the print statements are executed multiple times:

*** Start script ***
pymc3: v. 3.5
theano: v. 1.0.2
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [sd, mu]
*** Start script ***
pymc3: v. 3.5
theano: v. 1.0.2
*** Start script ***
pymc3: v. 3.5
theano: v. 1.0.2
Sampling 2 chains: 100%|████████████████████| 2000/2000 [00:02<00:00, 724.48draws/s]
Done!

  • pyzmq: 17.0.0
  • jupyter: 1.0.0
  • ipython: 6.4.0
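
A note on the repeated print statements: with the "spawn" start method (the only one available on Windows), each child process re-imports the main script, so any module-level code runs again in every worker, while code under the if __name__ == '__main__': guard runs only in the parent. The following is a minimal sketch using only the standard library, not PyMC3; the script contents and the names run_demo and work are made up for illustration.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for a user script: one module-level print,
# and process creation kept behind the __main__ guard.
SCRIPT = '''
import multiprocessing as mp

print("module level", flush=True)

def work(i):
    pass

if __name__ == "__main__":
    ctx = mp.get_context("spawn")
    procs = [ctx.Process(target=work, args=(i,)) for i in range(2)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    print("guarded: done", flush=True)
'''


def run_demo():
    """Run the stand-in script in a fresh interpreter and return its stdout."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "spawn_demo.py"
        path.write_text(SCRIPT)
        result = subprocess.run(
            [sys.executable, str(path)],
            stdout=subprocess.PIPE,
            universal_newlines=True,
            timeout=120,
        )
    return result.stdout


if __name__ == "__main__":
    print(run_demo())
```

The module-level line is printed once by the parent and once by each spawned worker, whereas the guarded line appears a single time. This matches the pattern in the output above: the header lines repeat, but "Done!" does not.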

Comment on the Jupyter notebook.

This particular script runs fine in a Jupyter notebook (it crashed only once in several attempts).
In general, however, sampling with multiple cores has become very unreliable. I have some more complicated models that won't run with multiple cores (in a freshly installed environment). For example, one notebook I am working on now (a softmax regression) crashes continuously when using multiple cores:

Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 4 jobs)
NUTS: [beta_]
Sampling 2 chains:   0%|                                                                               | 0/8000 [00:00<?, ?draws/s]

forrtl: error (200): program aborting due to control-C event
Image              PC                Routine            Line        Source
libifcoremd.dll    00007FFC9C7F94C4  Unknown               Unknown  Unknown
KERNELBASE.dll     00007FFCD18B56FD  Unknown               Unknown  Unknown
KERNEL32.DLL       00007FFCD38E3034  Unknown               Unknown  Unknown
ntdll.dll          00007FFCD4A11431  Unknown               Unknown  Unknown
forrtl: error (200): program aborting due to control-C event
Image              PC                Routine            Line        Source
libifcoremd.dll    00007FFC9C7F94C4  Unknown               Unknown  Unknown
KERNELBASE.dll     00007FFCD18B56FD  Unknown               Unknown  Unknown
KERNEL32.DLL       00007FFCD38E3034  Unknown               Unknown  Unknown
ntdll.dll          00007FFCD4A11431  Unknown               Unknown  Unknown
forrtl: error (200): program aborting due to control-C event
Image              PC                Routine            Line        Source
libifcoremd.dll    00007FFC9C7F94C4  Unknown               Unknown  Unknown
KERNELBASE.dll     00007FFCD18B56FD  Unknown               Unknown  Unknown
KERNEL32.DLL       00007FFCD38E3034  Unknown               Unknown  Unknown
ntdll.dll          00007FFCD4A11431  Unknown               Unknown  Unknown
forrtl: error (200): program aborting due to control-C event
Image              PC                Routine            Line        Source
libifcoremd.dll    00007FFC9C7F94C4  Unknown               Unknown  Unknown
KERNELBASE.dll     00007FFCD18B56FD  Unknown               Unknown  Unknown
KERNEL32.DLL       00007FFCD38E3034  Unknown               Unknown  Unknown
ntdll.dll          00007FFCD4A11431  Unknown               Unknown  Unknown
[I 14:26:40.033 NotebookApp] Interrupted...
[I 14:26:40.033 NotebookApp] Shutting down 2 kernels
[I 14:26:40.135 NotebookApp] Kernel shutdown: eaa60eb4-6bae-4c91-82bf-6bd5648ddf35
[I 14:26:40.135 NotebookApp] Kernel shutdown: e41f13f3-e731-4812-8130-97a7a6220fd7

If I run the softmax regression script as a Python script (without the if __name__ == '__main__':) I get the error:

Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 4 jobs)
NUTS: [beta_]
3.5
1.0.2
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 4 jobs)
NUTS: [beta_]
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Miniconda3\envs\bayes\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "C:\Miniconda3\envs\bayes\lib\multiprocessing\spawn.py", line 114, in _main
    prepare(preparation_data)
  File "C:\Miniconda3\envs\bayes\lib\multiprocessing\spawn.py", line 225, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "C:\Miniconda3\envs\bayes\lib\multiprocessing\spawn.py", line 277, in _fixup_main_from_path
Traceback (most recent call last):
  File "test_softmax_multicore.py", line 38, in <module>
    run_name="__mp_main__")
      File "C:\Miniconda3\envs\bayes\lib\runpy.py", line 263, in run_path
trace = pm.sample(draws=3000, tune=1000, chains=2, cores=4, random_seed=SEED)
  File "d:\dev\pymc3\pymc3\sampling.py", line 451, in sample
    pkg_name=pkg_name, script_name=fname)
  File "C:\Miniconda3\envs\bayes\lib\runpy.py", line 96, in _run_module_code
    trace = _mp_sample(**sample_args)
      File "d:\dev\pymc3\pymc3\sampling.py", line 998, in _mp_sample
mod_name, mod_spec, pkg_name, script_name)
  File "C:\Miniconda3\envs\bayes\lib\runpy.py", line 85, in _run_code
        exec(code, run_globals)chain, progressbar)

  File "D:\dev\GLM_with_PyMC3\notebooks\test_softmax_multicore.py", line 38, in <module>
  File "d:\dev\pymc3\pymc3\parallel_sampling.py", line 275, in __init__
    trace = pm.sample(draws=3000, tune=1000, chains=2, cores=4, random_seed=SEED)
  File "d:\dev\pymc3\pymc3\sampling.py", line 451, in sample
    for chain, seed, start in zip(range(chains), seeds, start_points)
  File "d:\dev\pymc3\pymc3\parallel_sampling.py", line 275, in <listcomp>
    trace = _mp_sample(**sample_args)
  File "d:\dev\pymc3\pymc3\sampling.py", line 998, in _mp_sample
    for chain, seed, start in zip(range(chains), seeds, start_points)
  File "d:\dev\pymc3\pymc3\parallel_sampling.py", line 182, in __init__
        self._process.start()chain, progressbar)

  File "C:\Miniconda3\envs\bayes\lib\multiprocessing\process.py", line 105, in start
  File "d:\dev\pymc3\pymc3\parallel_sampling.py", line 275, in __init__
    self._popen = self._Popen(self)
for chain, seed, start in zip(range(chains), seeds, start_points)  File "C:\Miniconda3\envs\bayes\lib\multiprocessing\context.py", line 223, in _Popen

  File "d:\dev\pymc3\pymc3\parallel_sampling.py", line 275, in <listcomp>
    return _default_context.get_context().Process._Popen(process_obj)
      File "C:\Miniconda3\envs\bayes\lib\multiprocessing\context.py", line 322, in _Popen
for chain, seed, start in zip(range(chains), seeds, start_points)
  File "d:\dev\pymc3\pymc3\parallel_sampling.py", line 182, in __init__
    return Popen(process_obj)
      File "C:\Miniconda3\envs\bayes\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
self._process.start()
  File "C:\Miniconda3\envs\bayes\lib\multiprocessing\process.py", line 105, in start
    reduction.dump(process_obj, to_child)
      File "C:\Miniconda3\envs\bayes\lib\multiprocessing\reduction.py", line 60, in dump
self._popen = self._Popen(self)
      File "C:\Miniconda3\envs\bayes\lib\multiprocessing\context.py", line 223, in _Popen
ForkingPickler(file, protocol).dump(obj)
    BrokenPipeErrorreturn _default_context.get_context().Process._Popen(process_obj):
[Errno 32] Broken pipe  File "C:\Miniconda3\envs\bayes\lib\multiprocessing\context.py", line 322, in _Popen

    return Popen(process_obj)
  File "C:\Miniconda3\envs\bayes\lib\multiprocessing\popen_spawn_win32.py", line 33, in __init__
    prep_data = spawn.get_preparation_data(process_obj._name)
  File "C:\Miniconda3\envs\bayes\lib\multiprocessing\spawn.py", line 143, in get_preparation_data
    _check_not_importing_main()
  File "C:\Miniconda3\envs\bayes\lib\multiprocessing\spawn.py", line 136, in _check_not_importing_main
    is not going to be frozen to produce an executable.''')
RuntimeError:
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.

If I wrap the script in an if __name__ == '__main__': block, I get the error:

Sampling 2 chains:   0%|                                                                               | 0/8000 [00:00<?, ?draws/s]
You can find the C code in this temporary file: C:\Users\moran\AppData\Local\Temp\theano_compilation_error__a0g2s_m
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Miniconda3\envs\bayes\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "C:\Miniconda3\envs\bayes\lib\multiprocessing\spawn.py", line 115, in _main
    self = reduction.pickle.load(from_parent)
  File "C:\Miniconda3\envs\bayes\lib\site-packages\theano\compile\function_module.py", line 1082, in _constructor_Function
    f = maker.create(input_storage, trustme=True)
  File "C:\Miniconda3\envs\bayes\lib\site-packages\theano\compile\function_module.py", line 1715, in create
    input_storage=input_storage_lists, storage_map=storage_map)
  File "C:\Miniconda3\envs\bayes\lib\site-packages\theano\gof\link.py", line 699, in make_thunk
    storage_map=storage_map)[:3]
  File "C:\Miniconda3\envs\bayes\lib\site-packages\theano\gof\vm.py", line 1091, in make_all
    impl=impl))
  File "C:\Miniconda3\envs\bayes\lib\site-packages\theano\gof\op.py", line 955, in make_thunk
    no_recycling)
  File "C:\Miniconda3\envs\bayes\lib\site-packages\theano\gof\op.py", line 858, in make_c_thunk
    output_storage=node_output_storage)
  File "C:\Miniconda3\envs\bayes\lib\site-packages\theano\gof\cc.py", line 1217, in make_thunk
    keep_lock=keep_lock)
  File "C:\Miniconda3\envs\bayes\lib\site-packages\theano\gof\cc.py", line 1157, in __compile__
    keep_lock=keep_lock)
  File "C:\Miniconda3\envs\bayes\lib\site-packages\theano\gof\cc.py", line 1620, in cthunk_factory
    key=key, lnk=self, keep_lock=keep_lock)
  File "C:\Miniconda3\envs\bayes\lib\site-packages\theano\gof\cmodule.py", line 1181, in module_from_key
    module = lnk.compile_cmodule(location)
  File "C:\Miniconda3\envs\bayes\lib\site-packages\theano\gof\cc.py", line 1523, in compile_cmodule
    preargs=preargs)
  File "C:\Miniconda3\envs\bayes\lib\site-packages\theano\gof\cmodule.py", line 2388, in compile_str
    (status, compile_stderr.replace('\n', '. ')))
Exception: ('The following error happened while compiling the node', Softmax(Dot22.0), '\n', 'Compilation failed (return status=3): ', '[Softmax(<TensorType(float64, matrix)>)]')
forrtl: error (200): program aborting due to control-C event
Image              PC                Routine            Line        Source
libifcoremd.dll    00007FFC98B294C4  Unknown               Unknown  Unknown
KERNELBASE.dll     00007FFCD18B56FD  Unknown               Unknown  Unknown
KERNEL32.DLL       00007FFCD38E3034  Unknown               Unknown  Unknown
ntdll.dll          00007FFCD4A11431  Unknown               Unknown  Unknown
forrtl: error (200): program aborting due to control-C event
Image              PC                Routine            Line        Source
libifcoremd.dll    00007FFC98B294C4  Unknown               Unknown  Unknown
KERNELBASE.dll     00007FFCD18B56FD  Unknown               Unknown  Unknown
KERNEL32.DLL       00007FFCD38E3034  Unknown               Unknown  Unknown
ntdll.dll          00007FFCD4A11431  Unknown               Unknown  Unknown

So it seems that there are two issues here:

  • The new multiprocessing backend requires an if __name__ == '__main__' guard in stand-alone scripts. While writing the new backend, I somehow missed that this affects users. I'm not really sure how bad this is right now. It is a backward-incompatible change, and at the very least it should be mentioned in the documentation. Maybe we should even disable the new backend on Windows for now. But on the upside, at least I understand why this is happening.
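
For scripts that cannot easily be restructured, one conservative user-side fallback is to request a single core on Windows, since the report above notes that multiple chains on 1 core work fine. This is only a sketch: pick_cores is a hypothetical helper, not part of PyMC3, and it gives up parallel sampling on Windows entirely.

```python
import sys


def pick_cores(requested, platform=None):
    """Return 1 on Windows, otherwise the requested number of cores.

    `platform` defaults to sys.platform; it is a parameter only so the
    behaviour can be checked on any operating system.
    """
    platform = sys.platform if platform is None else platform
    return 1 if platform == "win32" else requested


# Intended use (assumes a model context like the one in the report):
#     trace = pm.sample(chains=2, cores=pick_cores(2), random_seed=SEED)
```

With the guard in place this fallback should not be needed for the minimal example, so it is best seen as a stopgap rather than a fix.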

I'm trying to reproduce this locally, can you send me an example that fails with the second error?
What is the output of np.__config__.show()?

I have some vague ideas where this might be coming from, and if my hunch is right, setting one of OMP_NUM_THREADS=1, MKL_THREADING_LAYER=sequential or MKL_THREADING_LAYER=GNU might help. To do that, execute

import os
# Set only one of the following (try them one at a time):
os.environ['MKL_THREADING_LAYER'] = 'sequential'
# os.environ['OMP_NUM_THREADS'] = '1'
# os.environ['MKL_THREADING_LAYER'] = 'GNU'

before you import anything else.
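
A likely reason for the "before you import anything else" advice is that threading libraries usually read such variables when they are first loaded, and processes started later inherit the parent's environment. The sketch below only checks the second part, that a value set through os.environ reaches a newly started interpreter; set_and_check is a hypothetical helper, and it says nothing about whether MKL honours the setting.

```python
import os
import subprocess
import sys


def set_and_check(name, value):
    """Set an environment variable, then return what a fresh child process sees."""
    os.environ[name] = value
    # The child prints the variable named in its first argument.
    probe = "import os, sys; sys.stdout.write(os.environ.get(sys.argv[1], ''))"
    result = subprocess.run(
        [sys.executable, "-c", probe, name],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    print(set_and_check("OMP_NUM_THREADS", "1"))
```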

And thank you for reporting this :-)

The np.__config__.show() outputs:

mkl_info:
    libraries = ['mkl_core_dll', 'mkl_intel_lp64_dll', 'mkl_intel_thread_dll']
    library_dirs = ['C:/Miniconda3/envs/bayes\\Library\\lib']
    define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
    include_dirs = ['C:\\Program Files (x86)\\IntelSWTools\\compilers_and_libraries_2016.4.246\\windows\\mkl', 'C:\\Program Files (x86)\\IntelSWTools\\compilers_and_libraries_2016.4.246\\windows\\mkl\\include', 'C:\\Program Files (x86)\\IntelSWTools\\compilers_and_libraries_2016.4.246\\windows\\mkl\\lib', 'C:/Miniconda3/envs/bayes\\Library\\include']
blas_mkl_info:
    libraries = ['mkl_core_dll', 'mkl_intel_lp64_dll', 'mkl_intel_thread_dll']
    library_dirs = ['C:/Miniconda3/envs/bayes\\Library\\lib']
    define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
    include_dirs = ['C:\\Program Files (x86)\\IntelSWTools\\compilers_and_libraries_2016.4.246\\windows\\mkl', 'C:\\Program Files (x86)\\IntelSWTools\\compilers_and_libraries_2016.4.246\\windows\\mkl\\include', 'C:\\Program Files (x86)\\IntelSWTools\\compilers_and_libraries_2016.4.246\\windows\\mkl\\lib', 'C:/Miniconda3/envs/bayes\\Library\\include']
blas_opt_info:
    libraries = ['mkl_core_dll', 'mkl_intel_lp64_dll', 'mkl_intel_thread_dll']
    library_dirs = ['C:/Miniconda3/envs/bayes\\Library\\lib']
    define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
    include_dirs = ['C:\\Program Files (x86)\\IntelSWTools\\compilers_and_libraries_2016.4.246\\windows\\mkl', 'C:\\Program Files (x86)\\IntelSWTools\\compilers_and_libraries_2016.4.246\\windows\\mkl\\include', 'C:\\Program Files (x86)\\IntelSWTools\\compilers_and_libraries_2016.4.246\\windows\\mkl\\lib', 'C:/Miniconda3/envs/bayes\\Library\\include']
lapack_mkl_info:
    libraries = ['mkl_core_dll', 'mkl_intel_lp64_dll', 'mkl_intel_thread_dll']
    library_dirs = ['C:/Miniconda3/envs/bayes\\Library\\lib']
    define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
    include_dirs = ['C:\\Program Files (x86)\\IntelSWTools\\compilers_and_libraries_2016.4.246\\windows\\mkl', 'C:\\Program Files (x86)\\IntelSWTools\\compilers_and_libraries_2016.4.246\\windows\\mkl\\include', 'C:\\Program Files (x86)\\IntelSWTools\\compilers_and_libraries_2016.4.246\\windows\\mkl\\lib', 'C:/Miniconda3/envs/bayes\\Library\\include']
lapack_opt_info:
    libraries = ['mkl_core_dll', 'mkl_intel_lp64_dll', 'mkl_intel_thread_dll']
    library_dirs = ['C:/Miniconda3/envs/bayes\\Library\\lib']
    define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
    include_dirs = ['C:\\Program Files (x86)\\IntelSWTools\\compilers_and_libraries_2016.4.246\\windows\\mkl', 'C:\\Program Files (x86)\\IntelSWTools\\compilers_and_libraries_2016.4.246\\windows\\mkl\\include', 'C:\\Program Files (x86)\\IntelSWTools\\compilers_and_libraries_2016.4.246\\windows\\mkl\\lib', 'C:/Miniconda3/envs/bayes\\Library\\include']

I tried to set the environment variables but it does not solve the issue, unfortunately.
I have attached the Jupyter notebook (+data) that keeps crashing on my side. It is based on the softmax regression in the DBDA2 book.
test_softmax_multicore.zip

Thank _you_ for looking into this.

It works for me, but I have a different blas installed. How did you install python/numpy/pymc3?

Can you maybe also post the output of pip freeze and conda list?
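
When only a handful of versions are needed, they can also be gathered from within Python. The following is a rough sketch that assumes Python 3.8 or newer (for importlib.metadata), which is newer than the 3.6.6 used in this report; collect_versions is a hypothetical helper and does not replace the full pip freeze / conda list output.

```python
import platform
from importlib.metadata import PackageNotFoundError, version


def collect_versions(packages):
    """Return a dict with the Python/platform info and each package's version."""
    report = {
        "python": platform.python_version(),
        "platform": platform.platform(),
    }
    for name in packages:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            report[name] = "not installed"
    return report


if __name__ == "__main__":
    wanted = ["pymc3", "Theano", "numpy", "pyzmq", "jupyter", "ipython"]
    for key, value in collect_versions(wanted).items():
        print(f"{key}: {value}")
```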

I installed numpy (and scipy and all the PyMC3 dependencies) via conda (because it links the packages to the MKL library). Then I installed Theano and PyMC3 via pip from git.

Conda list

conda list
# packages in environment at C:\Miniconda3\envs\bayes:
#
# Name                    Version                   Build  Channel
backcall                  0.1.0                    py36_0
blas                      1.0                         mkl
bleach                    2.1.3                    py36_0
ca-certificates           2018.03.07                    0
certifi                   2018.4.16                py36_0
colorama                  0.3.9            py36h029ae33_0
cycler                    0.10.0           py36h009560c_0
cython                    0.28.3           py36hfa6e2cd_0
decorator                 4.3.0                    py36_0
entrypoints               0.2.3            py36hfd66bb0_2
freetype                  2.8                  h51f8f2c_1
h5py                      2.8.0                     <pip>
html5lib                  1.0.1            py36h047fa9f_0
icc_rt                    2017.0.4             h97af966_0
icu                       58.2                 ha66f8fd_1
intel-openmp              2018.0.3                      0
ipykernel                 4.8.2                    py36_0
ipython                   6.4.0                    py36_0
ipython_genutils          0.2.0            py36h3c5d0ee_0
ipywidgets                7.2.1                    py36_0
jedi                      0.12.0                   py36_1
jinja2                    2.10             py36h292fed1_0
joblib                    0.12.0                    <pip>
jpeg                      9b                   hb83a4c4_2
jsonschema                2.6.0            py36h7636477_0
jupyter                   1.0.0                    py36_4
jupyter_client            5.2.3                    py36_0
jupyter_console           5.2.0            py36h6d89b47_1
jupyter_core              4.4.0            py36h56e9d50_0
kiwisolver                1.0.1            py36h12c3424_0
libpng                    1.6.34               h79bbb47_0
libpython                 2.1                      py36_0
libsodium                 1.0.16               h9d3ae62_0
m2w64-binutils            2.25.1                        5    msys2
m2w64-bzip2               1.0.6                         6    msys2
m2w64-crt-git             5.0.0.4636.2595836               2    msys2
m2w64-gcc                 5.3.0                         6    msys2
m2w64-gcc-ada             5.3.0                         6    msys2
m2w64-gcc-fortran         5.3.0                         6    msys2
m2w64-gcc-libgfortran     5.3.0                         6    msys2
m2w64-gcc-libs            5.3.0                         7    msys2
m2w64-gcc-libs-core       5.3.0                         7    msys2
m2w64-gcc-objc            5.3.0                         6    msys2
m2w64-gmp                 6.1.0                         2    msys2
m2w64-headers-git         5.0.0.4636.c0ad18a               2    msys2
m2w64-isl                 0.16.1                        2    msys2
m2w64-libiconv            1.14                          6    msys2
m2w64-libmangle-git       5.0.0.4509.2e5a9a2               2    msys2
m2w64-libwinpthread-git   5.0.0.4634.697f757               2    msys2
m2w64-make                4.1.2351.a80a8b8               2    msys2
m2w64-mpc                 1.0.3                         3    msys2
m2w64-mpfr                3.1.4                         4    msys2
m2w64-pkg-config          0.29.1                        2    msys2
m2w64-toolchain           5.3.0                         7    msys2
m2w64-tools-git           5.0.0.4592.90b8472               2    msys2
m2w64-windows-default-manifest 6.4                           3    msys2
m2w64-winpthreads-git     5.0.0.4634.697f757               2    msys2
m2w64-zlib                1.2.8                        10    msys2
markupsafe                1.0              py36h0e26971_1
matplotlib                2.2.2            py36h153e9ff_1
mistune                   0.8.3            py36hfa6e2cd_1
mkl                       2018.0.3                      1
mkl-service               1.1.2            py36h57e144c_4
mkl_fft                   1.0.1            py36h452e1ab_0
mkl_random                1.0.1            py36h9258bd6_0
msys2-conda-epoch         20160418                      1    msys2
nbconvert                 5.3.1            py36h8dc0fde_0
nbformat                  4.4.0            py36h3a5bc1b_0
notebook                  5.5.0                    py36_0
numpy                     1.14.3           py36h9fa60d3_2
numpy-base                1.14.3           py36h5c71026_2
openssl                   1.0.2o               h8ea7d77_0
pandas                    0.23.1           py36h830ac7b_0
pandoc                    2.2.1                h1a437c5_0
pandocfilters             1.4.2            py36h3ef6317_1
parso                     0.2.1                    py36_0
patsy                     0.5.0                    py36_0
pickleshare               0.7.4            py36h9de030f_0
pip                       10.0.1                   py36_0
prompt_toolkit            1.0.15           py36h60b8f86_0
pygments                  2.2.0            py36hb010967_0
pymc3                     3.4.1                     <pip>
pyparsing                 2.2.0            py36h785a196_1
pyqt                      5.9.2            py36h1aa27d4_0
python                    3.6.6                hea74fb7_0
python-dateutil           2.7.3                    py36_0
pytz                      2018.4                   py36_0
pywinpty                  0.5.4                    py36_0
pyzmq                     17.0.0           py36hfa6e2cd_1
qt                        5.9.6            vc14h62aca36_0  [vc14]
qtconsole                 4.3.1            py36h99a29a9_0
scipy                     1.1.0            py36h672f292_0
seaborn                   0.8.1            py36h9b69545_0
send2trash                1.5.0                    py36_0
setuptools                39.2.0                   py36_0
simplegeneric             0.8.1                    py36_2
sip                       4.19.8           py36h6538335_0
six                       1.11.0           py36h4db2310_1
sqlite                    3.24.0               h7602738_0
statsmodels               0.9.0            py36h452e1ab_0
terminado                 0.8.1                    py36_1
testpath                  0.3.1            py36h2698cfe_0
Theano                    1.0.2+26.gd0420e3d9           <pip>
tornado                   5.0.2                    py36_0
tqdm                      4.23.4                    <pip>
traitlets                 4.3.2            py36h096827d_0
vc                        14                   h0510ff6_3
vs2015_runtime            14.0.25123                    3
wcwidth                   0.1.7            py36h3d5aa90_0
webencodings              0.5.1            py36h67c50ae_1
wheel                     0.31.1                   py36_0
widgetsnbextension        3.2.1                    py36_0
wincertstore              0.2              py36h7fe50ca_0
winpty                    0.4.3                         4
zeromq                    4.2.5                hc6251cf_0
zlib                      1.2.11               h8395fce_2

pip freeze

backcall==0.1.0
bleach==2.1.3
certifi==2018.4.16
colorama==0.3.9
cycler==0.10.0
Cython==0.28.3
decorator==4.3.0
entrypoints==0.2.3
h5py==2.8.0
html5lib==1.0.1
ipykernel==4.8.2
ipython==6.4.0
ipython-genutils==0.2.0
ipywidgets==7.2.1
jedi==0.12.0
Jinja2==2.10
joblib==0.12.0
jsonschema==2.6.0
jupyter==1.0.0
jupyter-client==5.2.3
jupyter-console==5.2.0
jupyter-core==4.4.0
kiwisolver==1.0.1
MarkupSafe==1.0
matplotlib==2.2.2
mistune==0.8.3
mkl-fft==1.0.0
mkl-random==1.0.1
nbconvert==5.3.1
nbformat==4.4.0
notebook==5.5.0
numpy==1.14.3
pandas==0.23.1
pandocfilters==1.4.2
parso==0.2.1
patsy==0.5.0
pickleshare==0.7.4
prompt-toolkit==1.0.15
Pygments==2.2.0
-e git+https://github.com/JackCaster/pymc3.git@98545be7ddad700b5fb02be2893d2fedae22c110#egg=pymc3
pyparsing==2.2.0
python-dateutil==2.7.3
pytz==2018.4
pywinpty==0.5.4
pyzmq==17.0.0
qtconsole==4.3.1
scipy==1.1.0
seaborn==0.8.1
Send2Trash==1.5.0
simplegeneric==0.8.1
six==1.11.0
statsmodels==0.9.0
terminado==0.8.1
testpath==0.3.1
Theano==1.0.2+26.gd0420e3d9
tornado==5.0.2
tqdm==4.23.4
traitlets==4.3.2
wcwidth==0.1.7
webencodings==0.5.1
widgetsnbextension==3.2.1
wincertstore==0.2

I did some digging. The error forrtl: error (200): program aborting due to control-C event, which is what makes the kernel crash, turns out not to be unusual (see here). In the comments there, they suggest setting the environment variable FOR_DISABLE_CONSOLE_CTRL_HANDLER to "1" or "T". I did so, and when the notebook crashes (because it still does ;( ), the traceback is:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Miniconda3\envs\bayes\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "C:\Miniconda3\envs\bayes\lib\multiprocessing\spawn.py", line 115, in _main
    self = reduction.pickle.load(from_parent)
  File "C:\Miniconda3\envs\bayes\lib\site-packages\theano\compile\function_module.py", line 1082, in _constructor_Function
    f = maker.create(input_storage, trustme=True)
  File "C:\Miniconda3\envs\bayes\lib\site-packages\theano\compile\function_module.py", line 1715, in create
Traceback (most recent call last):
  File "<string>", line 1, in <module>
    input_storage=input_storage_lists, storage_map=storage_map)
  File "C:\Miniconda3\envs\bayes\lib\site-packages\theano\gof\link.py", line 699, in make_thunk
  File "C:\Miniconda3\envs\bayes\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
      File "C:\Miniconda3\envs\bayes\lib\multiprocessing\spawn.py", line 115, in _main
storage_map=storage_map)[:3]    self = reduction.pickle.load(from_parent)

  File "C:\Miniconda3\envs\bayes\lib\site-packages\theano\gof\vm.py", line 1091, in make_all
  File "C:\Miniconda3\envs\bayes\lib\site-packages\theano\compile\function_module.py", line 1082, in _constructor_Function
    impl=impl))
f = maker.create(input_storage, trustme=True)  File "C:\Miniconda3\envs\bayes\lib\site-packages\theano\gof\op.py", line 955, in make_thunk

  File "C:\Miniconda3\envs\bayes\lib\site-packages\theano\compile\function_module.py", line 1715, in create
    no_recycling)
      File "C:\Miniconda3\envs\bayes\lib\site-packages\theano\gof\op.py", line 858, in make_c_thunk
input_storage=input_storage_lists, storage_map=storage_map)
  File "C:\Miniconda3\envs\bayes\lib\site-packages\theano\gof\link.py", line 699, in make_thunk
        output_storage=node_output_storage)storage_map=storage_map)[:3]

  File "C:\Miniconda3\envs\bayes\lib\site-packages\theano\gof\cc.py", line 1217, in make_thunk
  File "C:\Miniconda3\envs\bayes\lib\site-packages\theano\gof\vm.py", line 1091, in make_all
    impl=impl))
  File "C:\Miniconda3\envs\bayes\lib\site-packages\theano\gof\op.py", line 955, in make_thunk
    keep_lock=keep_lock)
  File "C:\Miniconda3\envs\bayes\lib\site-packages\theano\gof\cc.py", line 1157, in __compile__
    no_recycling)
  File "C:\Miniconda3\envs\bayes\lib\site-packages\theano\gof\op.py", line 858, in make_c_thunk
    keep_lock=keep_lock)
      File "C:\Miniconda3\envs\bayes\lib\site-packages\theano\gof\cc.py", line 1620, in cthunk_factory
output_storage=node_output_storage)
  File "C:\Miniconda3\envs\bayes\lib\site-packages\theano\gof\cc.py", line 1217, in make_thunk
    key=key, lnk=self, keep_lock=keep_lock)
  File "C:\Miniconda3\envs\bayes\lib\site-packages\theano\gof\cmodule.py", line 1151, in module_from_key
    keep_lock=keep_lock)
  File "C:\Miniconda3\envs\bayes\lib\site-packages\theano\gof\cc.py", line 1157, in __compile__
    with compilelock.lock_ctx(keep_lock=keep_lock):
  File "C:\Miniconda3\envs\bayes\lib\contextlib.py", line 81, in __enter__
        keep_lock=keep_lock)return next(self.gen)

  File "C:\Miniconda3\envs\bayes\lib\site-packages\theano\gof\cc.py", line 1620, in cthunk_factory
  File "C:\Miniconda3\envs\bayes\lib\site-packages\theano\gof\compilelock.py", line 40, in lock_ctx
    get_lock(lock_dir=lock_dir, **kw)
  File "C:\Miniconda3\envs\bayes\lib\site-packages\theano\gof\compilelock.py", line 86, in _get_lock
    lock(get_lock.lock_dir, **kw)
key=key, lnk=self, keep_lock=keep_lock)  File "C:\Miniconda3\envs\bayes\lib\site-packages\theano\gof\compilelock.py", line 273, in lock

  File "C:\Miniconda3\envs\bayes\lib\site-packages\theano\gof\cmodule.py", line 1181, in module_from_key
    time.sleep(random.uniform(min_wait, max_wait))
KeyboardInterrupt
    module = lnk.compile_cmodule(location)
  File "C:\Miniconda3\envs\bayes\lib\site-packages\theano\gof\cc.py", line 1523, in compile_cmodule
    preargs=preargs)
  File "C:\Miniconda3\envs\bayes\lib\site-packages\theano\gof\cmodule.py", line 2343, in compile_str
    p_out = output_subprocess_Popen(cmd)
  File "C:\Miniconda3\envs\bayes\lib\site-packages\theano\misc\windows.py", line 80, in output_subprocess_Popen
    out = p.communicate()
  File "C:\Miniconda3\envs\bayes\lib\subprocess.py", line 843, in communicate
    stdout, stderr = self._communicate(input, endtime, timeout)
  File "C:\Miniconda3\envs\bayes\lib\subprocess.py", line 1092, in _communicate
    self.stdout_thread.join(self._remaining_time(endtime))
  File "C:\Miniconda3\envs\bayes\lib\threading.py", line 1056, in join
    self._wait_for_tstate_lock()
  File "C:\Miniconda3\envs\bayes\lib\threading.py", line 1072, in _wait_for_tstate_lock
    elif lock.acquire(block, timeout):
KeyboardInterrupt
[I 11:43:03.371 NotebookApp] Interrupted...
[I 11:43:03.371 NotebookApp] Shutting down 1 kernel
[I 11:43:08.431 NotebookApp] Kernel shutdown: cb25a99e-15f2-4f7f-b3c0-9706ab711a70

I hope this helps to shed light on the issue.
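
For reference, the environment-variable workaround mentioned above can also be applied from inside the script. This is only a minimal sketch: it assumes the variable has to be set before numpy/scipy/theano are imported (on the assumption that the Fortran runtime reads it when its DLL is loaded), and, as reported above, it did not stop the crash here; it only changed what was printed.

```python
import os

# Assumption: this must run before importing numpy, scipy, theano or pymc3,
# because the Intel Fortran runtime is loaded as a side effect of those imports.
# setdefault keeps any value already set in the shell ("1" or "T").
os.environ.setdefault("FOR_DISABLE_CONSOLE_CTRL_HANDLER", "1")

print("FOR_DISABLE_CONSOLE_CTRL_HANDLER =",
      os.environ["FOR_DISABLE_CONSOLE_CTRL_HANDLER"])

# Heavy imports would go below this line, e.g.:
# import numpy as np
# import pymc3 as pm
```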

I have a similar error (Windows 2012 + pymc3 3.5 (master) + theano 1.0.3 (master)).
Here are the ways I can work around it:

  • setting n_jobs=1
  • removing the init='advi' option
  • setting mode = FAST_COMPILE in .theanorc
  • downgrading pymc3 to 3.4.1

Still, I want to use multiprocess sampling with ADVI initialization at full speed with the latest version of pymc3 =)
Can someone suggest ways to debug this error? It seems that stack traces from multiple processes are mixed here.
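
The FAST_COMPILE workaround from the list above can also be tried per script, without editing .theanorc, through the THEANO_FLAGS environment variable that Theano reads at import time. The sketch below is an illustration only; add_theano_flag is a hypothetical helper, not part of Theano or pymc3.

```python
import os


def add_theano_flag(flag, env=os.environ):
    """Append `flag` (e.g. 'mode=FAST_COMPILE') to THEANO_FLAGS if it is absent.

    Hypothetical helper for illustration; it only edits the environment mapping.
    """
    current = [f for f in env.get("THEANO_FLAGS", "").split(",") if f]
    if flag not in current:
        current.append(flag)
    env["THEANO_FLAGS"] = ",".join(current)
    return env["THEANO_FLAGS"]


# Must be called before `import theano` / `import pymc3` to have any effect.
add_theano_flag("mode=FAST_COMPILE")
print(os.environ["THEANO_FLAGS"])
```

This trades sampling speed for a lighter compilation step, so it is a diagnostic aid rather than a fix.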

I also have a short program that blows up (with a "broken pipe" message) as soon as I set chains > 1. I have a multicore machine (but then who doesn't). The code:

from theano import shared
from numpy import ones, array
from pymc3 import Model, Normal, Deterministic, Binomial, Metropolis, sample
from pymc3.math import invlogit

log_dosage = shared(array([-.86, -.3, -.05, .73]))
sample_size = shared(5 * ones(4, dtype=int))
deaths = array([0, 1, 3, 5])

with Model() as bioassay_model:
    alpha = Normal('alpha', 0, sd=100)
    beta = Normal('beta', 0, sd=100)
    theta = Deterministic("theta", invlogit(alpha + beta * log_dosage))
    observed_deaths = Binomial('observed_deaths', n=sample_size, p=theta, observed=deaths)
    trace = sample(draws=10000, start={"alpha":0.5}, step=Metropolis(), chains=2)

I have a GeForce GTX 1050 GPU running CUDA 8.0, CUDNN 7.1.3, theano 1.0.3, pymc3 3.5, python 3.6.6
My .theanorc:

[global]
device = cuda
force_device=True
optimizer = fast_run
optimizer_including=cudnn
mode=FAST_RUN

[nvcc]
fastmath = True
allow_gc=True

[lib]
cnmem = 0.8

[gpuarray]
preallocate=0.7

[scan]
allow_gc=True
allow_output_prealloc=True

The error message:

BrokenPipeError                           Traceback (most recent call last)
<ipython-input-1-fb96fbe5f1ac> in <module>
     13     theta = Deterministic("theta", invlogit(alpha + beta * log_dosage))
     14     observed_deaths = Binomial('observed_deaths', n=sample_size, p=theta, observed=deaths)
---> 15     trace = sample(draws=10000, start={"alpha":0.5}, step=Metropolis(), chains=2)

~\AppData\Local\conda\conda\envs\pymc3\lib\site-packages\pymc3\sampling.py in sample(draws, step, init, n_init, start, trace, chain_idx, chains, cores, tune, nuts_kwargs, step_kwargs, progressbar, model, random_seed, live_plot, discard_tuned_samples, live_plot_kwargs, compute_convergence_checks, use_mmap, **kwargs)
    447             _print_step_hierarchy(step)
    448             try:
--> 449                 trace = _mp_sample(**sample_args)
    450             except pickle.PickleError:
    451                 _log.warning("Could not pickle model, sampling singlethreaded.")

~\AppData\Local\conda\conda\envs\pymc3\lib\site-packages\pymc3\sampling.py in _mp_sample(draws, tune, step, chains, cores, chain, random_seed, start, progressbar, trace, model, use_mmap, **kwargs)
    994         sampler = ps.ParallelSampler(
    995             draws, tune, chains, cores, random_seed, start, step,
--> 996             chain, progressbar)
    997         try:
    998             with sampler:

~\AppData\Local\conda\conda\envs\pymc3\lib\site-packages\pymc3\parallel_sampling.py in __init__(self, draws, tune, chains, cores, seeds, start_points, step_method, start_chain_num, progressbar)
    273             ProcessAdapter(draws, tune, step_method,
    274                            chain + start_chain_num, seed, start)
--> 275             for chain, seed, start in zip(range(chains), seeds, start_points)
    276         ]
    277 

~\AppData\Local\conda\conda\envs\pymc3\lib\site-packages\pymc3\parallel_sampling.py in <listcomp>(.0)
    273             ProcessAdapter(draws, tune, step_method,
    274                            chain + start_chain_num, seed, start)
--> 275             for chain, seed, start in zip(range(chains), seeds, start_points)
    276         ]
    277 

~\AppData\Local\conda\conda\envs\pymc3\lib\site-packages\pymc3\parallel_sampling.py in __init__(self, draws, tune, step_method, chain, seed, start)
    180             draws, tune, seed)
    181         # We fork right away, so that the main process can start tqdm threads
--> 182         self._process.start()
    183 
    184     @property

~\AppData\Local\conda\conda\envs\pymc3\lib\multiprocessing\process.py in start(self)
    103                'daemonic processes are not allowed to have children'
    104         _cleanup()
--> 105         self._popen = self._Popen(self)
    106         self._sentinel = self._popen.sentinel
    107         # Avoid a refcycle if the target function holds an indirect

~\AppData\Local\conda\conda\envs\pymc3\lib\multiprocessing\context.py in _Popen(process_obj)
    221     @staticmethod
    222     def _Popen(process_obj):
--> 223         return _default_context.get_context().Process._Popen(process_obj)
    224 
    225 class DefaultContext(BaseContext):

~\AppData\Local\conda\conda\envs\pymc3\lib\multiprocessing\context.py in _Popen(process_obj)
    320         def _Popen(process_obj):
    321             from .popen_spawn_win32 import Popen
--> 322             return Popen(process_obj)
    323 
    324     class SpawnContext(BaseContext):

~\AppData\Local\conda\conda\envs\pymc3\lib\multiprocessing\popen_spawn_win32.py in __init__(self, process_obj)
     63             try:
     64                 reduction.dump(prep_data, to_child)
---> 65                 reduction.dump(process_obj, to_child)
     66             finally:
     67                 set_spawning_popen(None)

~\AppData\Local\conda\conda\envs\pymc3\lib\multiprocessing\reduction.py in dump(obj, file, protocol)
     58 def dump(obj, file, protocol=None):
     59     '''Replacement for pickle.dump() using ForkingPickler.'''
---> 60     ForkingPickler(file, protocol).dump(obj)
     61 
     62 #

BrokenPipeError: [Errno 32] Broken pipe

I'm pretty sure that is not the same issue as the original (which is Windows-related). About the original bug: I've been trying to reproduce it on my own machine for some time, but so far I haven't managed to. This makes it rather hard to fix.

@Jeff-Winchell About your problem:
I'd guess that this might be gpu related. Does it also happen if you use the cpu? Using a gpu for a problem like that doesn't make the slightest bit of sense by the way.
As a general note: Before starting to write strange emails when you don't get a reply to a bug report, it could help to do a bit more work yourself first:

  • Is that the right place to post the bug report? (This should have been a separate issue, it is probably unrelated)
  • What are the circumstances that make that bug appear? I.e., does it also happen if you don't use the GPU? Does it depend on the operating system (you didn't even say which one you are using)?
  • Does the problem go away if you change the model somehow?
  • Is the problem sampler specific? Does it also happen if you use NUTS? Does it depend on the start argument?

Have you run the very short code example I gave and replicated the bug? If you have, it's not clear why most of your post was written. If you haven't run it, it's unclear why any of your post was written.

I was frankly taken aback by your post, but maybe you don't see why. I'm a software engineer, not a hacker. My teachers (and LinkedIn connections) include Ward Cunningham, Bertrand Meyer, Meilir Page-Jones, Gerry Weinberg, James Bach, Andy Hunt. I don't ship code to production with known bugs in it. Ever.

If you can't replicate the bug I'd be happy to help come up with ideas why not. Otherwise, it's unproductive.

@Jeff-Winchell have you tried running the suggestions by @aseyboldt? These are all valid suggestions; what would be productive is for you to try them first. Also, name-dropping is not a valid way to have a productive conversation.

We do not appreciate these hostile attitudes towards our developers/users, if you keep doing this (either privately or publicly) I will have to block and report you according to our community guidelines.

The first message to me was more hostile than my response was. Different people have different ideas about name dropping. So I guess you can ban me for saying my address is [email protected].

What else was hostile besides making it clear that I know a lot more than the first poster assumed I did when asking me to do a bunch of things that aren't useful?

FYI, none of those names I mentioned would even DREAM of banning someone for posting the message I did.

So go ahead and block me. The mere threat you made about doing so, so frivolously makes me want to challenge bullies publicly, just like they challenge me.

I looked at that thread. If I move ONLY the pymc3.sample call into an if __name__=='__main__' block AND make sure my GPU is globally turned off, then it doesn't crash. I ran into the same problem with some other code that uses the NUTS sampler, and the same workaround fixes that too.

However, disabling the GPU globally is not a great solution, so the GPU problem needs to be fixed, and I don't know how more complex code can be managed with the if __name__ workaround. The real solution is to change the pymc3/theano/whatever code so that it runs under both Linux and Windows, instead of only worrying about Linux and ignoring the most widely used OS from the company with the largest market capitalization in the world.
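
For larger scripts, one common way to apply the if __name__ workaround is to put everything that triggers sampling behind a single entry-point function. On Windows, multiprocessing starts workers with the spawn method, which re-imports the main script in every child, so anything left at module level runs again there. The sketch below shows only the layout; run_sampling is a hypothetical stand-in for building the model and calling pm.sample, and no real sampling happens.

```python
import multiprocessing as mp


def run_sampling(chains=2, cores=2):
    # Hypothetical stand-in: a real script would build the model here and
    # call pm.sample(chains=chains, cores=cores) inside the model context.
    return {"chains": chains, "cores": cores}


def main():
    result = run_sampling()
    print("Done:", result)
    return result


if __name__ == "__main__":
    # Only the parent process enters this block. Spawned children re-import
    # the module but skip it, so they do not try to start sampling themselves.
    mp.freeze_support()  # has an effect only in frozen Windows executables
    main()
```

Imports, data loading and helper definitions can stay at module level; only the code that starts worker processes needs to sit behind the guard.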

The main problem is that the broken pipe error is not helpful for debugging. We have seen that the broken pipe is raised by the main process. When it tries to spawn the worker processes that should do the sampling, the workers raise exceptions before they have finished starting up, so they never manage to report their failure to the main process; once the main process tries to communicate with them, it finds the communication pipe broken. What we are focusing on fixing first is capturing the exceptions raised while the workers are being spawned. These exceptions are the key to debugging the sources of the failures. Some of them were caused by the lack of the if __name__ == '__main__' block, and others by functions that were not picklable. Once we sort that out, we will be better able to help with whatever is happening because of the GPU.
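
Of the two causes named above, the pickling one can be checked in the main process before any worker is started. This is a minimal sketch, assuming plain pickle is a close enough proxy for the ForkingPickler that appears in the traceback earlier in the thread; is_picklable is a hypothetical helper, not part of pymc3.

```python
import math
import pickle


def is_picklable(obj):
    """Return True if `obj` survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


# A function importable by name can be pickled by reference...
print("math.sqrt picklable:", is_picklable(math.sqrt))
# ...whereas a lambda has no importable name and cannot.
print("lambda picklable:", is_picklable(lambda x: x))
```

Running such a check on the step method or on custom functions used in the model would fail in the parent, with a readable traceback, instead of inside a child whose error is lost.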

Following commit 98fd63e18179ffb28734c73c459ccdaf04121b92, I re-ran the script that kept failing under Windows. The script under test is:

import pymc3 as pm
print(pm.__version__)

import theano.tensor as tt
import theano
print(theano.__version__)

import patsy
import pandas as pd
import numpy as np

SEED = 20180727

df = pd.read_csv(r'https://gist.githubusercontent.com/JackCaster/d74b36a66c172e80d1bdcee61d6975bf/raw/a2aab8690af7cebbe39ec5e5b425fe9a9b9a674d/data.csv', 
                 dtype={'Y':'category'})

_, X = patsy.dmatrices('Y ~ 1 + X1 + X2', data=df)

# Number of categories
n_cat = df.Y.cat.categories.size
# Number of predictors
n_pred = X.shape[1]

with pm.Model() as model:

    ## `p`--quantity that I want to model--needs to have size (n_obs, n_cat). 
    ## Because `X` has size (n_obs, n_pred), then `beta` needs to have size (n_pred, n_cat)

    # priors for categories 1-2, excluding reference category 0 which is set to zero below (see DBDA2 p. 651 for explanation).   
    beta_ = pm.Normal('beta_', mu=0, sd=50, shape=(n_pred, n_cat-1))
    # add prior values zero for reference category 0. (add a column)  
    beta = pm.Deterministic('beta', tt.concatenate([tt.zeros((n_pred, 1)), beta_], axis=1))

    # The softmax function will squash the values in the range 0-1
    p = tt.nnet.softmax(tt.dot(np.asarray(X), beta))

    likelihood = pm.Categorical('likelihood', p=p, observed=df.Y.cat.codes.values)

    trace = pm.sample(chains=2, cores=2)

    print('DONE')

Unfortunately, the sampling still fails with cores > 1 (pymc3 v. 3.6, theano v. 1.0.3). The jupyter kernel shuts down as soon as the sampling begins:

Multiprocess sampling (2 chains in 2 jobs)
NUTS: [beta_]
Sampling 2 chains:   0%|                                                                   | 0/2000 [00:00<?, ?draws/s]

The traceback, which points to a compilation error, was:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Miniconda3\envs\intro_to_pymc3\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "C:\Miniconda3\envs\intro_to_pymc3\lib\multiprocessing\spawn.py", line 115, in _main
    self = reduction.pickle.load(from_parent)
  File "C:\Miniconda3\envs\intro_to_pymc3\lib\site-packages\theano\compile\function_module.py", line 1082, in _constructor_Function
    f = maker.create(input_storage, trustme=True)
  File "C:\Miniconda3\envs\intro_to_pymc3\lib\site-packages\theano\compile\function_module.py", line 1715, in create
    input_storage=input_storage_lists, storage_map=storage_map)
  File "C:\Miniconda3\envs\intro_to_pymc3\lib\site-packages\theano\gof\link.py", line 699, in make_thunk
    storage_map=storage_map)[:3]
  File "C:\Miniconda3\envs\intro_to_pymc3\lib\site-packages\theano\gof\vm.py", line 1091, in make_all
    impl=impl))
  File "C:\Miniconda3\envs\intro_to_pymc3\lib\site-packages\theano\gof\op.py", line 955, in make_thunk
    no_recycling)
  File "C:\Miniconda3\envs\intro_to_pymc3\lib\site-packages\theano\gof\op.py", line 858, in make_c_thunk
    output_storage=node_output_storage)
  File "C:\Miniconda3\envs\intro_to_pymc3\lib\site-packages\theano\gof\cc.py", line 1217, in make_thunk
    keep_lock=keep_lock)
  File "C:\Miniconda3\envs\intro_to_pymc3\lib\site-packages\theano\gof\cc.py", line 1157, in __compile__
    keep_lock=keep_lock)
  File "C:\Miniconda3\envs\intro_to_pymc3\lib\site-packages\theano\gof\cc.py", line 1620, in cthunk_factory
    key=key, lnk=self, keep_lock=keep_lock)
  File "C:\Miniconda3\envs\intro_to_pymc3\lib\site-packages\theano\gof\cmodule.py", line 1181, in module_from_key
    module = lnk.compile_cmodule(location)
  File "C:\Miniconda3\envs\intro_to_pymc3\lib\site-packages\theano\gof\cc.py", line 1523, in compile_cmodule
    preargs=preargs)
  File "C:\Miniconda3\envs\intro_to_pymc3\lib\site-packages\theano\gof\cmodule.py", line 2391, in compile_str
    (status, compile_stderr.replace('\n', '. ')))
Exception: ('The following error happened while compiling the node', InplaceDimShuffle{1,0}(Softmax.0), '\n', 'Compilation failed (return status=3): ', '[InplaceDimShuffle{1,0}(<TensorType(float64, matrix)>)]')
forrtl: error (200): program aborting due to control-C event
Image              PC                Routine            Line        Source
libifcoremd.dll    00007FFE414794C4  Unknown               Unknown  Unknown
KERNELBASE.dll     00007FFE79672763  Unknown               Unknown  Unknown
KERNEL32.DLL       00007FFE7ABD7E94  Unknown               Unknown  Unknown
ntdll.dll          00007FFE7D2CA251  Unknown               Unknown  Unknown
forrtl: error (200): program aborting due to control-C event
Image              PC                Routine            Line        Source
libifcoremd.dll    00007FFE414794C4  Unknown               Unknown  Unknown
KERNELBASE.dll     00007FFE79672763  Unknown               Unknown  Unknown
KERNEL32.DLL       00007FFE7ABD7E94  Unknown               Unknown  Unknown
ntdll.dll          00007FFE7D2CA251  Unknown               Unknown  Unknown
forrtl: error (200): program aborting due to control-C event
Image              PC                Routine            Line        Source
libifcoremd.dll    00007FFE414794C4  Unknown               Unknown  Unknown
KERNELBASE.dll     00007FFE79672763  Unknown               Unknown  Unknown
KERNEL32.DLL       00007FFE7ABD7E94  Unknown               Unknown  Unknown
ntdll.dll          00007FFE7D2CA251  Unknown               Unknown  Unknown
[I 18:43:13.302 NotebookApp] Interrupted...
[I 18:43:13.303 NotebookApp] Shutting down 1 kernel
[I 18:43:13.403 NotebookApp] Kernel shutdown: f6d274f4-ffbf-428a-a996-751cd821bd4a

The temporary, compiled C code reports in the last line

Problem occurred during compilation with the command line below:
"C:\Miniconda3\envs\intro_to_pymc3\Library\mingw-w64\bin\g++.exe" -shared -g -O3 -fno-math-errno -Wno-unused-label -Wno-unused-variable -Wno-write-strings -march=haswell -mmmx -mno-3dnow -msse -msse2 -msse3 -mssse3 -mno-sse4a -mcx16 -msahf -mmovbe -maes -mno-sha -mpclmul -mpopcnt -mabm -mno-lwp -mfma -mno-fma4 -mno-xop -mbmi -mbmi2 -mno-tbm -mavx -mavx2 -msse4.2 -msse4.1 -mlzcnt -mno-rtm -mno-hle -mrdrnd -mf16c -mfsgsbase -mno-rdseed -mno-prfchw -mno-adx -mfxsr -mxsave -mxsaveopt -mno-avx512f -mno-avx512er -mno-avx512cd -mno-avx512pf -mno-prefetchwt1 -mno-clflushopt -mno-xsavec -mno-xsaves -mno-avx512dq -mno-avx512bw -mno-avx512vl -mno-avx512ifma -mno-avx512vbmi -mno-clwb -mno-pcommit -mno-mwaitx --param l1-cache-size=32 --param l1-cache-line-size=64 --param l2-cache-size=4096 -mtune=haswell -DNPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION -m64 -DMS_WIN64 -I"C:\Miniconda3\envs\intro_to_pymc3\lib\site-packages\numpy\core\include" -I"C:\Miniconda3\envs\intro_to_pymc3\include" -I"C:\Miniconda3\envs\intro_to_pymc3\lib\site-packages\theano\gof\c_code" -L"C:\Miniconda3\envs\intro_to_pymc3\libs" -L"C:\Miniconda3\envs\intro_to_pymc3" -o "C:\Users\moran\AppData\Local\Theano\compiledir_Windows-10-10.0.17763-SP0-Intel64_Family_6_Model_69_Stepping_1_GenuineIntel-3.6.6-64\tmpujapb2d5\m885ff006a95d626dac547a7bdfdb471bbf058622ece2b4435e42316c4012ea56.pyd" "C:\Users\moran\AppData\Local\Theano\compiledir_Windows-10-10.0.17763-SP0-Intel64_Family_6_Model_69_Stepping_1_GenuineIntel-3.6.6-64\tmpujapb2d5\mod.cpp" -lpython36

Does this shed more light on this matter?

EDIT: I also confirmed (as suggested by @elfwired) that setting theano.config.mode = 'FAST_COMPILE' lets the sampler run successfully, but sampling becomes very slow. I tried fiddling with theano.config.mode, theano.config.optimizer, and theano.config.linker without much success.

This looks like a Theano problem, can you open an issue there? It looks very archaic to me.

Done, let's see.

EDIT: Just a note. When there is a compilation error, the traceback points to the temporary C code. At the end of that code there is a line reporting the failing compiler invocation; it is the same g++ command line already quoted in full above.

I tried to run the command post-mortem, but the temp folder ...\tmpujapb2d5\... does not exist (although a bunch of others do). I wonder whether there is a problem with how the multiprocessing pool is instantiated.

I got a similar error for this snippet in an MCMC application:

with pm.Model() as sleep_model:

    # Create the alpha and beta parameters
    alpha = pm.Normal('alpha', mu=0.0, tau=0.01, testval=0.0)
    beta = pm.Normal('beta', mu=0.0, tau=0.01, testval=0.0)

    # Create the probability from the logistic function
    p = pm.Deterministic('p', 1. / (1. + tt.exp(beta * time + alpha)))

    # Create the Bernoulli variable, which uses the observed data
    observed = pm.Bernoulli('obs', p, observed=sleep_obs)

    # Starting values are found through maximum a posteriori (MAP) estimation
    # start = pm.find_MAP()

    # Using Metropolis Hastings Sampling
    step = pm.Metropolis()

    # Sample from the posterior using the sampling method
    #sleep_trace = pm.sample(N_SAMPLES, step=step, njobs=2);
    sleep_trace = pm.sample(N_SAMPLES, step=step);

Error message:

Multiprocess sampling (4 chains in 4 jobs)
CompoundStep
>Metropolis: [beta]
>Metropolis: [alpha]

BrokenPipeError                           Traceback (most recent call last)
C:\ProgramData\Anaconda3\lib\site-packages\pymc3\parallel_sampling.py in __init__(self, draws, tune, step_method, chain, seed, start)
    241         try:
--> 242             self._process.start()
    243         except IOError as e:

C:\ProgramData\Anaconda3\lib\multiprocessing\process.py in start(self)
    111         _cleanup()
--> 112         self._popen = self._Popen(self)
    113         self._sentinel = self._popen.sentinel

C:\ProgramData\Anaconda3\lib\multiprocessing\context.py in _Popen(process_obj)
    222     def _Popen(process_obj):
--> 223         return _default_context.get_context().Process._Popen(process_obj)
    224 

C:\ProgramData\Anaconda3\lib\multiprocessing\context.py in _Popen(process_obj)
    321             from .popen_spawn_win32 import Popen
--> 322             return Popen(process_obj)
    323 

C:\ProgramData\Anaconda3\lib\multiprocessing\popen_spawn_win32.py in __init__(self, process_obj)
     88                 reduction.dump(prep_data, to_child)
---> 89                 reduction.dump(process_obj, to_child)
     90             finally:

C:\ProgramData\Anaconda3\lib\multiprocessing\reduction.py in dump(obj, file, protocol)
     59     '''Replacement for pickle.dump() using ForkingPickler.'''
---> 60     ForkingPickler(file, protocol).dump(obj)
     61 

BrokenPipeError: [Errno 32] Broken pipe

During handling of the above exception, another exception occurred:

RuntimeError                              Traceback (most recent call last)
<ipython-input-26-4ad3b5446758> in <module>
     18     # Sample from the posterior using the sampling method
     19     #sleep_trace = pm.sample(N_SAMPLES, step=step, njobs=2);
---> 20     sleep_trace = pm.sample(N_SAMPLES, step=step);

C:\ProgramData\Anaconda3\lib\site-packages\pymc3\sampling.py in sample(draws, step, init, n_init, start, trace, chain_idx, chains, cores, tune, progressbar, model, random_seed, discard_tuned_samples, compute_convergence_checks, **kwargs)
    435             _print_step_hierarchy(step)
    436             try:
--> 437                 trace = _mp_sample(**sample_args)
    438             except pickle.PickleError:
    439                 _log.warning("Could not pickle model, sampling singlethreaded.")

C:\ProgramData\Anaconda3\lib\site-packages\pymc3\sampling.py in _mp_sample(draws, tune, step, chains, cores, chain, random_seed, start, progressbar, trace, model, **kwargs)
    963     sampler = ps.ParallelSampler(
    964         draws, tune, chains, cores, random_seed, start, step,
--> 965         chain, progressbar)
    966     try:
    967         try:

C:\ProgramData\Anaconda3\lib\site-packages\pymc3\parallel_sampling.py in __init__(self, draws, tune, chains, cores, seeds, start_points, step_method, start_chain_num, progressbar)
    359                 draws, tune, step_method, chain + start_chain_num, seed, start
    360             )
--> 361             for chain, seed, start in zip(range(chains), seeds, start_points)
    362         ]
    363 

C:\ProgramData\Anaconda3\lib\site-packages\pymc3\parallel_sampling.py in <listcomp>(.0)
    359                 draws, tune, step_method, chain + start_chain_num, seed, start
    360             )
--> 361             for chain, seed, start in zip(range(chains), seeds, start_points)
    362         ]
    363 

C:\ProgramData\Anaconda3\lib\site-packages\pymc3\parallel_sampling.py in __init__(self, draws, tune, step_method, chain, seed, start)
    249                     # all its error message
    250                     time.sleep(0.2)
--> 251                     raise exc
    252             raise
    253 

RuntimeError: The communication pipe between the main process and its spawned children is broken.
In Windows OS, this usually means that the child process raised an exception while it was being spawned, before it was setup to communicate to the main process.
The exceptions raised by the child process while spawning cannot be caught or handled from the main process, and when running from an IPython or jupyter notebook interactive kernel, the child's exception and traceback appears to be lost.
A known way to see the child's error, and try to fix or handle it, is to run the problematic code as a batch script from a system's Command Prompt. The child's exception will be printed to the Command Promt's stderr, and it should be visible above this error and traceback.
Note that if running a jupyter notebook that was invoked from a Command Prompt, the child's exception should have been printed to the Command Prompt on which the notebook is running.

Running on Windows 10 with latest packages of everything.

Same thing for me (Windows 10, Spyder, installed through Anaconda).
Setting cores=1 in pm.sample() runs fine.

Multiprocess sampling (4 chains in 4 jobs)
BinaryGibbsMetropolis: [rain, sprinkler]
Traceback (most recent call last):

File "", line 18, in
trace = pm.sample(20000, step=[pm.BinaryGibbsMetropolis([rain, sprinkler])], tune=tune, random_seed=124)

File "C:\Users\butle\Anaconda3\lib\site-packages\pymc3\sampling.py", line 437, in sample
trace = _mp_sample(**sample_args)

File "C:\Users\butle\Anaconda3\lib\site-packages\pymc3\sampling.py", line 965, in _mp_sample
chain, progressbar)

File "C:\Users\butle\Anaconda3\lib\site-packages\pymc3\parallel_sampling.py", line 361, in __init__
for chain, seed, start in zip(range(chains), seeds, start_points)

File "C:\Users\butle\Anaconda3\lib\site-packages\pymc3\parallel_sampling.py", line 361, in <listcomp>
for chain, seed, start in zip(range(chains), seeds, start_points)

File "C:\Users\butle\Anaconda3\lib\site-packages\pymc3\parallel_sampling.py", line 251, in __init__
raise exc

RuntimeError: The communication pipe between the main process and its spawned children is broken.
In Windows OS, this usually means that the child process raised an exception while it was being spawned, before it was setup to communicate to the main process.
The exceptions raised by the child process while spawning cannot be caught or handled from the main process, and when running from an IPython or jupyter notebook interactive kernel, the child's exception and traceback appears to be lost.
A known way to see the child's error, and try to fix or handle it, is to run the problematic code as a batch script from a system's Command Prompt. The child's exception will be printed to the Command Promt's stderr, and it should be visible above this error and traceback.
Note that if running a jupyter notebook that was invoked from a Command Prompt, the child's exception should have been printed to the Command Prompt on which the notebook is running.

Same for me on Windows 10: cores=1 works fine. I am using Theano with CUDA.
I wrote the code in VS Code but ran it from cmd.

I am just getting into PyMC and was following along with the code in Osvaldo Martin's book.
This is the code I tried:

import numpy as np 
from scipy import stats
import pymc3 as pm 

np.random.seed(123)

if __name__ == "__main__":
    trials = 4 

    theta_real = 0.35 
    data = stats.bernoulli.rvs(p=theta_real, size=trials)

    with pm.Model() as our_first_model:
        theta = pm.Beta("theta", alpha=1., beta=1.)
        y = pm.Bernoulli("y", p=theta, observed=data)
        trace = pm.sample(1000, random_seed=123)

The following is the traceback:


Traceback (most recent call last):
  File "test.py", line 16, in <module>
    trace = pm.sample(1000, random_seed=123)
  File "C:\Anaconda\lib\site-packages\pymc3\sampling.py", line 437, in sample
    trace = _mp_sample(**sample_args)
  File "C:\Anaconda\lib\site-packages\pymc3\sampling.py", line 965, in _mp_sample
    chain, progressbar)
  File "C:\Anaconda\lib\site-packages\pymc3\parallel_sampling.py", line 361, in __init__
    for chain, seed, start in zip(range(chains), seeds, start_points)
  File "C:\Anaconda\lib\site-packages\pymc3\parallel_sampling.py", line 361, in <listcomp>
    for chain, seed, start in zip(range(chains), seeds, start_points)
  File "C:\Anaconda\lib\site-packages\pymc3\parallel_sampling.py", line 251, in __init__
    raise exc
RuntimeError: The communication pipe between the main process and its spawned children is broken.
In Windows OS, this usually means that the child process raised an exception while it was being spawned, before it was setup to communicate to the main process.
The exceptions raised by the child process while spawning cannot be caught or handled from the main process, and when running from an IPython or jupyter notebook interactive kernel, the child's exception and traceback appears to be lost.
A known way to see the child's error, and try to fix or handle it, is to run the problematic code as a batch script from a system's Command Prompt. The child's exception will be printed to the Command Promt's stderr, and it should be visible above this error and traceback.
Note that if running a jupyter notebook that was invoked from a Command Prompt, the child's exception should have been printed to the Command Prompt on which the notebook is running.

I am facing the same issue on Debian machines, specifically the default Google Dataproc image, 1.5-debian (https://cloud.google.com/compute/docs/images#debian).

Setting one of:

os.environ['MKL_THREADING_LAYER'] = 'sequential'
os.environ['OMP_NUM_THREADS'] = '1'

allowed me to get it running, but I suspect this prevents me from scaling up: each chain appears to use just one CPU. Is this a known issue for certain Linux distributions? Is there a Linux distro where multiprocessing is known to work well?
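
Thread-limiting variables like the two quoted above are typically read when the numerical libraries are first loaded, so they generally need to be set before NumPy, Theano or PyMC3 is imported. A minimal sketch of that ordering; the helper name `apply_thread_limits` is hypothetical, and the sketch sets both variables although the comment above reports that either one was enough.

```python
import os

# The two variables quoted in the comment above.
THREAD_LIMITS = {
    "MKL_THREADING_LAYER": "sequential",
    "OMP_NUM_THREADS": "1",
}


def apply_thread_limits(env=os.environ):
    """Set the thread-limiting variables unless they are already defined.

    Hypothetical helper: values already present in `env` are kept, so a
    user can still override them from the shell.
    """
    for name, value in THREAD_LIMITS.items():
        env.setdefault(name, value)
    return {name: env[name] for name in THREAD_LIMITS}


apply_thread_limits()

# Import the numerical stack only after the environment is configured:
#     import numpy as np
#     import pymc3 as pm
```

Using `setdefault` rather than plain assignment is a design choice: it lets the limits be relaxed from outside the script when testing whether multi-threaded BLAS is really the cause.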
