Distributed: Cannot start LocalCluster

Created on 18 Dec 2018 · 3 Comments · Source: dask/distributed

Whenever I try to start a LocalCluster, startup fails and I get a _lot_ of traceback. I can reproduce this on Linux (RHEL7) and macOS, in many recent versions of dask and distributed, and on Python 2 and 3 (although on Python 2 it just hangs rather than producing lots of traceback).

I'm going to try and dig into this a bit.

It's possible this relates to #2376 (the traceback looks similar).

For example (setting up a LocalCluster with just the one worker as the stacktrace below gets duplicated per worker):

from distributed import Client, LocalCluster

lc = LocalCluster(n_workers=1)
/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/distributed/bokeh/core.py:57: UserWarning: 
Port 8787 is already in use. 
Perhaps you already have a cluster running?
Hosting the diagnostics dashboard on a random port instead.
  warnings.warn('\n' + msg)
Traceback (most recent call last):
  File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/multiprocessing/forkserver.py", line 261, in main
    old_handlers)
  File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/multiprocessing/forkserver.py", line 297, in _serve_one
    code = spawn._main(child_r)
  File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/multiprocessing/spawn.py", line 114, in _main
    prepare(preparation_data)
  File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/multiprocessing/spawn.py", line 225, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/multiprocessing/spawn.py", line 277, in _fixup_main_from_path
    run_name="__mp_main__")
  File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/runpy.py", line 263, in run_path
    pkg_name=pkg_name, script_name=fname)
  File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/Users/DPeterK/data/random_data_cubes/distributed_bag_load_save.py", line 12, in <module>
    lc = LocalCluster(n_workers=1)
  File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/distributed/deploy/local.py", line 141, in __init__
    self.start(ip=ip, n_workers=n_workers)
  File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/distributed/deploy/local.py", line 171, in start
    self.sync(self._start, **kwargs)
  File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/distributed/deploy/local.py", line 164, in sync
    return sync(self.loop, func, *args, **kwargs)
  File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/distributed/utils.py", line 277, in sync
    six.reraise(*error[0])
  File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/six.py", line 693, in reraise
    raise value
  File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/distributed/utils.py", line 262, in f
    result[0] = yield future
  File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/tornado/gen.py", line 1133, in run
    value = future.result()
  File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/tornado/gen.py", line 1141, in run
    yielded = self.gen.throw(*exc_info)
  File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/distributed/deploy/local.py", line 191, in _start
    yield [self._start_worker(**self.worker_kwargs) for i in range(n_workers)]
  File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/tornado/gen.py", line 1133, in run
    value = future.result()
  File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/tornado/gen.py", line 883, in callback
    result_list.append(f.result())
  File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/tornado/gen.py", line 1141, in run
    yielded = self.gen.throw(*exc_info)
  File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/distributed/deploy/local.py", line 208, in _start_worker
    yield w._start()
  File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/tornado/gen.py", line 1133, in run
    value = future.result()
  File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/tornado/gen.py", line 1141, in run
    yielded = self.gen.throw(*exc_info)
  File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/distributed/nanny.py", line 157, in _start
    response = yield self.instantiate()
  File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/tornado/gen.py", line 1133, in run
    value = future.result()
  File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/tornado/gen.py", line 1141, in run
    yielded = self.gen.throw(*exc_info)
  File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/distributed/nanny.py", line 226, in instantiate
    self.process.start()
  File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/tornado/gen.py", line 1133, in run
    value = future.result()
  File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/tornado/gen.py", line 1141, in run
    yielded = self.gen.throw(*exc_info)
  File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/distributed/nanny.py", line 370, in start
    yield self.process.start()
  File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/tornado/gen.py", line 1133, in run
    value = future.result()
  File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/distributed/process.py", line 35, in _call_and_set_future
    res = func(*args, **kwargs)
  File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/distributed/process.py", line 184, in _start
    process.start()
  File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/multiprocessing/process.py", line 112, in start
    self._popen = self._Popen(self)
  File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/multiprocessing/context.py", line 291, in _Popen
    return Popen(process_obj)
  File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/multiprocessing/popen_forkserver.py", line 35, in __init__
    super().__init__(process_obj)
  File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/multiprocessing/popen_fork.py", line 20, in __init__
    self._launch(process_obj)
  File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/multiprocessing/popen_forkserver.py", line 42, in _launch
    prep_data = spawn.get_preparation_data(process_obj._name)
  File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/multiprocessing/spawn.py", line 143, in get_preparation_data
    _check_not_importing_main()
  File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/multiprocessing/spawn.py", line 136, in _check_not_importing_main
    is not going to be frozen to produce an executable.''')
RuntimeError: 
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.
distributed.nanny - WARNING - Worker process 14323 exited with status 1
Traceback (most recent call last):
  File "distributed_bag_load_save.py", line 12, in <module>
    lc = LocalCluster(n_workers=1)
  File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/distributed/deploy/local.py", line 141, in __init__
    self.start(ip=ip, n_workers=n_workers)
  File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/distributed/deploy/local.py", line 171, in start
    self.sync(self._start, **kwargs)
  File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/distributed/deploy/local.py", line 164, in sync
    return sync(self.loop, func, *args, **kwargs)
  File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/distributed/utils.py", line 277, in sync
    six.reraise(*error[0])
  File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/six.py", line 693, in reraise
    raise value
  File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/distributed/utils.py", line 262, in f
    result[0] = yield future
  File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/tornado/gen.py", line 1133, in run
    value = future.result()
  File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/tornado/gen.py", line 1141, in run
    yielded = self.gen.throw(*exc_info)
  File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/distributed/deploy/local.py", line 191, in _start
    yield [self._start_worker(**self.worker_kwargs) for i in range(n_workers)]
  File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/tornado/gen.py", line 1133, in run
    value = future.result()
  File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/tornado/gen.py", line 883, in callback
    result_list.append(f.result())
  File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/tornado/gen.py", line 1147, in run
    yielded = self.gen.send(value)
  File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/distributed/deploy/local.py", line 217, in _start_worker
    raise gen.TimeoutError("Worker failed to start")
tornado.util.TimeoutError: Worker failed to start
/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/multiprocessing/semaphore_tracker.py:144: UserWarning: semaphore_tracker: There appear to be 6 leaked semaphores to clean up at shutdown
  len(cache))

Here's the contents of the conda env that produced this error:

$ conda list
# packages in environment at /Users/DPeterK/miniconda3/envs/dask:
#
blas                      1.1                    openblas    conda-forge
bokeh                     1.0.2                 py37_1000    conda-forge
ca-certificates           2018.11.29           ha4d7672_0    conda-forge
certifi                   2018.11.29            py37_1000    conda-forge
click                     7.0                        py_0    conda-forge
cloudpickle               0.6.1                      py_0    conda-forge
cytoolz                   0.9.0.1          py37h470a237_1    conda-forge
dask                      1.0.0                      py_0    conda-forge
dask-core                 1.0.0                      py_0    conda-forge
distributed               1.25.1                py37_1000    conda-forge
freetype                  2.9.1                h6debe1e_4    conda-forge
heapdict                  1.0.0                 py37_1000    conda-forge
jinja2                    2.10                       py_1    conda-forge
jpeg                      9c                   h470a237_1    conda-forge
libffi                    3.2.1                hfc679d8_5    conda-forge
libgfortran               3.0.0                         1    conda-forge
libpng                    1.6.36               ha92aebf_0    conda-forge
libtiff                   4.0.10               he6b73bb_1    conda-forge
locket                    0.2.0                      py_2    conda-forge
markupsafe                1.1.0            py37h470a237_0    conda-forge
msgpack-python            0.6.0            py37h2d50403_0    conda-forge
ncurses                   6.1                  hfc679d8_2    conda-forge
numpy                     1.15.4          py37_blas_openblashb06ca3d_0  [blas_openblas]  conda-forge
olefile                   0.46                       py_0    conda-forge
openblas                  0.3.3                ha44fe06_1    conda-forge
openssl                   1.0.2p               h470a237_1    conda-forge
packaging                 18.0                       py_0    conda-forge
pandas                    0.23.4           py37hf8a1672_0    conda-forge
partd                     0.3.9                      py_0    conda-forge
pillow                    5.3.0            py37hc736899_0    conda-forge
pip                       18.1                  py37_1000    conda-forge
psutil                    5.4.8            py37h470a237_0    conda-forge
pyparsing                 2.3.0                      py_0    conda-forge
python                    3.7.1                h46c1a51_0    conda-forge
python-dateutil           2.7.5                      py_0    conda-forge
pytz                      2018.7                     py_0    conda-forge
pyyaml                    3.13             py37h470a237_1    conda-forge
readline                  7.0                  haf1bffa_1    conda-forge
setuptools                40.6.3                   py37_0    conda-forge
six                       1.12.0                py37_1000    conda-forge
sortedcontainers          2.1.0                      py_0    conda-forge
sqlite                    3.26.0               hb1c47c0_0    conda-forge
tblib                     1.3.2                      py_1    conda-forge
tk                        8.6.9                ha92aebf_0    conda-forge
toolz                     0.9.0                      py_1    conda-forge
tornado                   5.1.1            py37h470a237_0    conda-forge
wheel                     0.32.3                   py37_0    conda-forge
xz                        5.2.4                h470a237_1    conda-forge
yaml                      0.1.7                h470a237_1    conda-forge
zict                      0.1.3                      py_0    conda-forge
zlib                      1.2.11               h470a237_3    conda-forge

All 3 comments

This error usually occurs when a Python script starts a LocalCluster at
module level instead of inside an if __name__ == '__main__': block. The
same thing happens if you create a multiprocessing.Pool outside of that block.
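
To make the reason concrete: the traceback in the report shows multiprocessing re-running the script in the child via runpy with run_name="__mp_main__". Below is a minimal sketch of that effect using only the standard library. The stand-in script and the helper name `run_script_as` are hypothetical, and no cluster or worker process is actually started; the list just records which parts of the file execute under each name.

```python
import os
import runpy
import tempfile

# Hypothetical stand-in for the reporter's script. Nothing is launched here;
# the list only records which parts of the file were executed.
SCRIPT = '''
events = ["module level"]      # an unguarded LocalCluster(...) would run here
if __name__ == "__main__":
    events.append("guarded")   # LocalCluster(n_workers=1) belongs here
'''


def run_script_as(run_name):
    """Execute SCRIPT from a temporary file with __name__ set to run_name."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo_script.py")
        with open(path, "w") as f:
            f.write(SCRIPT)
        # run_path returns the globals of the executed script.
        return runpy.run_path(path, run_name=run_name)["events"]


# Started directly (python demo_script.py): both parts run.
as_main = run_script_as("__main__")
# Re-run inside a spawn/forkserver child: only module-level code runs.
as_child = run_script_as("__mp_main__")

print(as_main)   # ['module level', 'guarded']
print(as_child)  # ['module level']
```

Anything at module level runs again in every worker process, so an unguarded LocalCluster call tries to start workers from inside a worker that is still bootstrapping. For the script in the report, that would mean moving lc = LocalCluster(n_workers=1) from line 12 of distributed_bag_load_save.py into code that is only reached under the guard.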

On Tue, Dec 18, 2018 at 5:48 AM Peter Killick notifications@github.com
wrote:

Whenever I try and start a LocalCluster the startup errors and I get a
lot of traceback. I can reproduce this on Linux (RHEL7) and macOS, in
many recent versions of dask and distributed and on Python 2 and 3
(although in Python 2 this just hangs rather than producing lots of
traceback).

I'm going to try and dig into this a bit.

It's possible this relates to #2376
https://github.com/dask/distributed/issues/2376 (the traceback looks
similar).

For example (setting up a LocalCluster with just the one worker as the
stacktrace below gets duplicated per worker):

from distributed import Client, LocalCluster

lc = LocalCluster(n_workers=1)

/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/distributed/bokeh/core.py:57: UserWarning:
Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the diagnostics dashboard on a random port instead.
warnings.warn('n' + msg)
Traceback (most recent call last):
File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/multiprocessing/forkserver.py", line 261, in main
old_handlers)
File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/multiprocessing/forkserver.py", line 297, in _serve_one
code = spawn._main(child_r)
File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/multiprocessing/spawn.py", line 114, in _main
prepare(preparation_data)
File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/multiprocessing/spawn.py", line 225, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/multiprocessing/spawn.py", line 277, in _fixup_main_from_path
run_name="__mp_main__")
File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/runpy.py", line 263, in run_path
pkg_name=pkg_name, script_name=fname)
File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/runpy.py", line 96, in _run_module_code
mod_name, mod_spec, pkg_name, script_name)
File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/Users/DPeterK/data/random_data_cubes/distributed_bag_load_save.py", line 12, in
lc = LocalCluster(n_workers=1)
File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/distributed/deploy/local.py", line 141, in __init__
self.start(ip=ip, n_workers=n_workers)
File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/distributed/deploy/local.py", line 171, in start
self.sync(self._start, kwargs)
File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/distributed/deploy/local.py", line 164, in sync
return sync(self.loop, func, args, *kwargs)
File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/distributed/utils.py", line 277, in sync
six.reraise(error[0])
File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/six.py", line 693, in reraise
raise value
File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/distributed/utils.py", line 262, in f
result[0] = yield future
File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/tornado/gen.py", line 1133, in run
value = future.result()
File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/tornado/gen.py", line 1141, in run
yielded = self.gen.throw(
exc_info)
File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/distributed/deploy/local.py", line 191, in _start
yield [self._start_worker(
self.worker_kwargs) for i in range(n_workers)]
File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/tornado/gen.py", line 1133, in run
value = future.result()
File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/tornado/gen.py", line 883, in callback
result_list.append(f.result())
File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/tornado/gen.py", line 1141, in run
yielded = self.gen.throw(exc_info)
File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/distributed/deploy/local.py", line 208, in _start_worker
yield w._start()
File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/tornado/gen.py", line 1133, in run
value = future.result()
File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/tornado/gen.py", line 1141, in run
yielded = self.gen.throw(
exc_info)
File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/distributed/nanny.py", line 157, in _start
response = yield self.instantiate()
File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/tornado/gen.py", line 1133, in run
value = future.result()
File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/tornado/gen.py", line 1141, in run
yielded = self.gen.throw(exc_info)
File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/distributed/nanny.py", line 226, in instantiate
self.process.start()
File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/tornado/gen.py", line 1133, in run
value = future.result()
File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/tornado/gen.py", line 1141, in run
yielded = self.gen.throw(
exc_info)
File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/distributed/nanny.py", line 370, in start
yield self.process.start()
File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/tornado/gen.py", line 1133, in run
value = future.result()
File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/distributed/process.py", line 35, in _call_and_set_future
res = func(args, *kwargs)
File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/distributed/process.py", line 184, in _start
process.start()
File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/multiprocessing/process.py", line 112, in start
self._popen = self._Popen(self)
File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/multiprocessing/context.py", line 291, in _Popen
return Popen(process_obj)
File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/multiprocessing/popen_forkserver.py", line 35, in __init__
super().__init__(process_obj)
File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/multiprocessing/popen_fork.py", line 20, in __init__
self._launch(process_obj)
File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/multiprocessing/popen_forkserver.py", line 42, in _launch
prep_data = spawn.get_preparation_data(process_obj._name)
File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/multiprocessing/spawn.py", line 143, in get_preparation_data
_check_not_importing_main()
File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/multiprocessing/spawn.py", line 136, in _check_not_importing_main
is not going to be frozen to produce an executable.''')
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.

    This probably means that you are not using fork to start your
    child processes and you have forgotten to use the proper idiom
    in the main module:

        if __name__ == '__main__':
            freeze_support()
            ...

    The "freeze_support()" line can be omitted if the program
    is not going to be frozen to produce an executable.

distributed.nanny - WARNING - Worker process 14323 exited with status 1
Traceback (most recent call last):
File "distributed_bag_load_save.py", line 12, in <module>
lc = LocalCluster(n_workers=1)
File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/distributed/deploy/local.py", line 141, in __init__
self.start(ip=ip, n_workers=n_workers)
File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/distributed/deploy/local.py", line 171, in start
self.sync(self._start, **kwargs)
File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/distributed/deploy/local.py", line 164, in sync
return sync(self.loop, func, *args, **kwargs)
File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/distributed/utils.py", line 277, in sync
six.reraise(error[0])
File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/six.py", line 693, in reraise
raise value
File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/distributed/utils.py", line 262, in f
result[0] = yield future
File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/tornado/gen.py", line 1133, in run
value = future.result()
File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/tornado/gen.py", line 1141, in run
yielded = self.gen.throw(*exc_info)
File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/distributed/deploy/local.py", line 191, in _start
yield [self._start_worker(**self.worker_kwargs) for i in range(n_workers)]
File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/tornado/gen.py", line 1133, in run
value = future.result()
File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/tornado/gen.py", line 883, in callback
result_list.append(f.result())
File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/tornado/gen.py", line 1147, in run
yielded = self.gen.send(value)
File "/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/site-packages/distributed/deploy/local.py", line 217, in _start_worker
raise gen.TimeoutError("Worker failed to start")
tornado.util.TimeoutError: Worker failed to start
/Users/DPeterK/miniconda3/envs/dask/lib/python3.7/multiprocessing/semaphore_tracker.py:144: UserWarning: semaphore_tracker: There appear to be 6 leaked semaphores to clean up at shutdown
len(cache))

Here's the contents of the conda env that produced this error:

$ conda list
# packages in environment at /Users/DPeterK/miniconda3/envs/dask:
#
blas 1.1 openblas conda-forge
bokeh 1.0.2 py37_1000 conda-forge
ca-certificates 2018.11.29 ha4d7672_0 conda-forge
certifi 2018.11.29 py37_1000 conda-forge
click 7.0 py_0 conda-forge
cloudpickle 0.6.1 py_0 conda-forge
cytoolz 0.9.0.1 py37h470a237_1 conda-forge
dask 1.0.0 py_0 conda-forge
dask-core 1.0.0 py_0 conda-forge
distributed 1.25.1 py37_1000 conda-forge
freetype 2.9.1 h6debe1e_4 conda-forge
heapdict 1.0.0 py37_1000 conda-forge
jinja2 2.10 py_1 conda-forge
jpeg 9c h470a237_1 conda-forge
libffi 3.2.1 hfc679d8_5 conda-forge
libgfortran 3.0.0 1 conda-forge
libpng 1.6.36 ha92aebf_0 conda-forge
libtiff 4.0.10 he6b73bb_1 conda-forge
locket 0.2.0 py_2 conda-forge
markupsafe 1.1.0 py37h470a237_0 conda-forge
msgpack-python 0.6.0 py37h2d50403_0 conda-forge
ncurses 6.1 hfc679d8_2 conda-forge
numpy 1.15.4 py37_blas_openblashb06ca3d_0 [blas_openblas] conda-forge
olefile 0.46 py_0 conda-forge
openblas 0.3.3 ha44fe06_1 conda-forge
openssl 1.0.2p h470a237_1 conda-forge
packaging 18.0 py_0 conda-forge
pandas 0.23.4 py37hf8a1672_0 conda-forge
partd 0.3.9 py_0 conda-forge
pillow 5.3.0 py37hc736899_0 conda-forge
pip 18.1 py37_1000 conda-forge
psutil 5.4.8 py37h470a237_0 conda-forge
pyparsing 2.3.0 py_0 conda-forge
python 3.7.1 h46c1a51_0 conda-forge
python-dateutil 2.7.5 py_0 conda-forge
pytz 2018.7 py_0 conda-forge
pyyaml 3.13 py37h470a237_1 conda-forge
readline 7.0 haf1bffa_1 conda-forge
setuptools 40.6.3 py37_0 conda-forge
six 1.12.0 py37_1000 conda-forge
sortedcontainers 2.1.0 py_0 conda-forge
sqlite 3.26.0 hb1c47c0_0 conda-forge
tblib 1.3.2 py_1 conda-forge
tk 8.6.9 ha92aebf_0 conda-forge
toolz 0.9.0 py_1 conda-forge
tornado 5.1.1 py37h470a237_0 conda-forge
wheel 0.32.3 py37_0 conda-forge
xz 5.2.4 h470a237_1 conda-forge
yaml 0.1.7 h470a237_1 conda-forge
zict 0.1.3 py_0 conda-forge
zlib 1.2.11 h470a237_3 conda-forge


That did it – thanks @mrocklin. Given that I had no idea about this (or had forgotten it), is there value in a note in the Local Cluster docs to this effect?

I had the same issue; I'm new to Python. I implemented everything in a Jupyter notebook, which lives in my repository and is executed periodically from there as a batch job. The notebook is converted with nbconvert into a plain Python script, and Dask stopped working from that point on (I had to use sed to insert that "if __name__ == '__main__':" block into the script to get it working). It would be nice to have that in Dask's docs.
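
For anyone hitting the same thing with a converted notebook, here is a rough sketch of that sed step in plain Python. The helper name `wrap_in_main_guard` is made up for illustration and is not part of Dask or nbconvert. It simply indents the whole script under the guard that the RuntimeError message asks for, so it assumes the script has no `from __future__` imports and no multi-line string literals whose contents would be altered by the extra indentation.

```python
import textwrap

GUARD = 'if __name__ == "__main__":'


def wrap_in_main_guard(source: str) -> str:
    """Return `source` with every non-blank line indented under a __main__ guard.

    Hypothetical helper for illustration only.
    """
    body = textwrap.indent(source.rstrip("\n"), "    ")
    if not body.strip():
        body = "    pass"  # keep the guarded block syntactically valid
    return f"{GUARD}\n{body}\n"


# The script from the original report, roughly as a converted notebook might contain it.
script = (
    "from distributed import Client, LocalCluster\n"
    "\n"
    "lc = LocalCluster(n_workers=1)\n"
)

guarded = wrap_in_main_guard(script)
print(guarded)
```

With the guard in place, a worker process that re-imports the main module (under the name `__mp_main__`, as seen in the traceback) skips the guarded body, so it no longer tries to create a second LocalCluster while it is still bootstrapping.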
