Ray: [Dashboard] OSError: [Errno 99] error while attempting to bind on address ('::1', 8265, 0, 0): cannot assign requested address

Created on 7 Feb 2020 · 14 Comments · Source: ray-project/ray

What is the problem?

I am building a Docker image with my branch and am unable to start the dashboard. Node 13.x is installed. The issue appears to be a port conflict. Perhaps there is something already listening on 8265?

$ docker logs -f rl-actor
[ray] Forcing OMP_NUM_THREADS=1 to avoid performance degradation with many workers (issue #6998). You can override this by explicitly setting OMP_NUM_THREADS.
/opt/conda/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
2020-02-07 15:50:54,893 WARNING services.py:592 -- setpgrp failed, processes may not be cleaned up properly: [Errno 1] Operation not permitted.
2020-02-07 15:50:54,894 INFO resource_spec.py:212 -- Starting Ray with 35.25 GiB memory available for workers and up to 17.64 GiB for objects. You can adjust these settings with ray.init(memory=<bytes>, object_store_memory=<bytes>).
2020-02-07 15:50:55,481 INFO services.py:1093 -- View the Ray dashboard at localhost:8265
2020-02-07 15:50:58,493 WARNING worker.py:1071 -- The dashboard on node c9ba97c06401 failed with the following error:
Traceback (most recent call last):
  File "/opt/conda/lib/python3.6/asyncio/base_events.py", line 1045, in create_server
    sock.bind(sa)
OSError: [Errno 99] Cannot assign requested address

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ray/python/ray/dashboard/dashboard.py", line 760, in <module>
    dashboard.run()
  File "/ray/python/ray/dashboard/dashboard.py", line 335, in run
    aiohttp.web.run_app(self.app, host=self.host, port=self.port)
  File "/opt/conda/lib/python3.6/site-packages/aiohttp/web.py", line 433, in run_app
    reuse_port=reuse_port))
  File "/opt/conda/lib/python3.6/asyncio/base_events.py", line 468, in run_until_complete
    return future.result()
  File "/opt/conda/lib/python3.6/site-packages/aiohttp/web.py", line 359, in _run_app
    await site.start()
  File "/opt/conda/lib/python3.6/site-packages/aiohttp/web_runner.py", line 104, in start
    reuse_port=self._reuse_port)
  File "/opt/conda/lib/python3.6/asyncio/base_events.py", line 1049, in create_server
    % (sa, err.strerror.lower()))
OSError: [Errno 99] error while attempting to bind on address ('::1', 8265, 0, 0): cannot assign requested address

Reproduction (REQUIRED)

Here is the Dockerfile I'm using, which is based off base-deps:

FROM tensorflow/tensorflow:nightly-gpu-py3
# install ray dependencies
RUN apt-get update \
    && apt-get install -y \
        curl \
        tmux \
        screen \
        rsync \
        apt-transport-https \
        zlib1g-dev \
        libgl1-mesa-dev \
        git \
        wget \
        cmake \
        build-essential \
        curl \
        unzip \
    && apt-get clean \
    && echo 'export PATH=/opt/conda/bin:$PATH' > /etc/profile.d/conda.sh \
    && wget \
        --quiet 'https://repo.continuum.io/archive/Anaconda3-5.2.0-Linux-x86_64.sh' \
        -O /tmp/anaconda.sh \
    && /bin/bash /tmp/anaconda.sh -b -p /opt/conda \
    && rm /tmp/anaconda.sh \
    && /opt/conda/bin/conda install -y \
        libgcc \
    && /opt/conda/bin/conda clean -y --all \
    && /opt/conda/bin/pip install \
        flatbuffers \
        cython==0.29.0 \
        numpy==1.15.4
ENV PATH "/opt/conda/bin:$PATH"
RUN conda remove -y --force wrapt
RUN pip install -U pip
# To avoid the following error on Jenkins:
# AttributeError: 'numpy.ufunc' object has no attribute '__module__'
RUN /opt/conda/bin/pip uninstall -y dask
ENV PATH "/opt/conda/bin:$PATH"
# For Click
ENV LC_ALL=C.UTF-8
ENV LANG=C.UTF-8
RUN pip install gym[atari]==0.10.11 opencv-python-headless lz4 pytest-timeout smart_open torch torchvision
RUN pip install --upgrade bayesian-optimization
RUN pip install --upgrade hyperopt==0.1.2
RUN pip install ConfigSpace==0.4.10
RUN pip install --upgrade sigopt nevergrad scikit-optimize hpbandster lightgbm xgboost tensorboardX
RUN pip install -U mlflow
# Quote the requirement so the shell does not treat ">=" as an output redirection.
RUN pip install -U "pytest-remotedata>=0.3.1"

# install custom ray branch
RUN git clone --single-branch --branch warmstart2 https://github.com/thavlik/ray.git
RUN ray/ci/travis/install-bazel.sh
WORKDIR /ray/python
RUN pip install -U -e . --verbose
RUN python ray/setup-dev.py --yes

# install node and build dashboard
RUN curl -sL https://deb.nodesource.com/setup_13.x | bash -
RUN apt-get install -y nodejs
RUN cd ray/dashboard/client && npm ci && npm run build

# install dependencies for my python project
RUN pip install tqdm==4.41.1 \
    tensorflow-gpu==2.1.0 \
    tensorboard==2.1.0 \
    Keras==2.3.1 \
    absl-py==0.9.0 \
    boto3==1.11.1 \
    psutil==5.6.7 \
    gym==0.15.4 \
    GPUtil==1.4.0 \
    opencv-python==4.1.2.30 \
    lz4==3.0.2 \
    setproctitle==1.1.10 \
    tensorboardX==2.0

Running any tune experiment produces the warning.

bug

thavlik

👍5

All 14 comments

I highly doubt that it is a port problem, because the dashboard increments the port number if the requested port is unavailable before it starts the dashboard process. (For example, if port 8265 is already in use, it tries 8266.) Many factors can cause OSError: [Errno 99] Cannot assign requested address, but I assume it is related to Docker. @wuisawesome Any thoughts?

rkooo567 on 9 Feb 2020

👍1
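
The port-increment behavior described above can be sketched with the standard library. This is an illustration only, not Ray's actual implementation; the function name first_free_port and the max_tries parameter are hypothetical.

```python
import socket


def first_free_port(host: str, start_port: int, max_tries: int = 100) -> int:
    """Return the first port >= start_port that can be bound on host.

    Hypothetical helper for illustration; not taken from the Ray code base.
    """
    last_port = min(start_port + max_tries, 65536)
    for port in range(start_port, last_port):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, port))
            except OSError:
                # Bind failed (typically because the port is in use); try the next one.
                continue
            return port
    raise RuntimeError(f"no free port found in [{start_port}, {last_port})")


if __name__ == "__main__":
    print(first_free_port("127.0.0.1", 8265))
```

On Linux, errno 99 is EADDRNOTAVAIL, which concerns the address rather than the port (a busy port is errno 98, EADDRINUSE). A loop like this one would therefore fail on every port if the address itself cannot be bound, which is consistent with the doubt that this is a port conflict.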

Does adding --webui-host 0.0.0.0 to ray start work to mitigate this?

wuisawesome on 9 Feb 2020

👍4

Does adding --webui-host 0.0.0.0 to ray start work to mitigate this?

I am not using ray start - this is with the tune.run API.

thavlik on 10 Feb 2020

I had the exact same error. Solved it by adding

ray.init(webui_host='127.0.0.1') at the beginning of the python file.

It seems like hostname '::1' or 'localhost' are sometimes not recognized.

semin-park on 11 Feb 2020

👍19 🎉9 👎1
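
A quick way to check whether the container can bind each loopback address at all is a small standard-library probe. This is a diagnostic sketch, assuming the failure comes from an unusable IPv6 loopback inside the container; the helper name try_bind is hypothetical and the probe does not touch Ray.

```python
import errno
import socket


def try_bind(host: str, port: int = 0):
    """Try to bind a TCP socket on (host, port).

    Returns None on success, or the errno of the OSError on failure.
    Port 0 lets the OS pick any free port, so only the address is tested.
    """
    family = socket.AF_INET6 if ":" in host else socket.AF_INET
    try:
        with socket.socket(family, socket.SOCK_STREAM) as sock:
            sock.bind((host, port))
    except OSError as exc:
        return exc.errno
    return None


if __name__ == "__main__":
    for host in ("127.0.0.1", "::1"):
        err = try_bind(host)
        status = "ok" if err is None else errno.errorcode.get(err, str(err))
        print(f"{host}: {status}")
```

If the line for ::1 reports EADDRNOTAVAIL while 127.0.0.1 reports ok, passing an explicit IPv4 host to the dashboard, as suggested above, sidesteps the failing address.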

I had the exact same error. Solved it by adding

ray.init(webui_host='127.0.0.1') at the beginning of the python file.

It seems like hostname '::1' or 'localhost' are sometimes not recognized.

This fixes the issue for me as well. Thank you.

thavlik on 11 Feb 2020

For reference, this issue appears to be described in more detail here: https://github.com/aio-libs/aiohttp/issues/4554

wuisawesome on 11 Feb 2020

Should this be the default?

mjlbach on 12 Feb 2020

I had the same issue. I'm using the ray command line API, and adding --webui-host 0.0.0.0 works for me!

LinxiFan on 11 Aug 2020

Could the next person who runs into this issue please post the output of cat /etc/hosts | grep localhost and share your OS as well? It would be very useful for understanding how widespread IPv6 loopback entries for localhost are among Ray users.

wuisawesome on 12 Aug 2020

👍1

I'm in a Docker container.

host OS: macOS Big Sur Beta (20A5343i)
container OS: Debian buster (3.8-slim-buster)

127.0.0.1   localhost
::1 localhost ip6-localhost ip6-loopback

sumanthratna on 12 Aug 2020
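
The two entries above map localhost to both the IPv4 and the IPv6 loopback address. A minimal sketch for pulling those entries out of hosts-file text follows; the function name localhost_entries is hypothetical, and the sample string is the output posted in this thread.

```python
def localhost_entries(hosts_text: str):
    """Return (address, names) pairs from hosts-file text that name localhost."""
    entries = []
    for line in hosts_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        address, *names = line.split()
        if "localhost" in names:
            entries.append((address, names))
    return entries


if __name__ == "__main__":
    sample = (
        "127.0.0.1   localhost\n"
        "::1 localhost ip6-localhost ip6-loopback\n"
    )
    for address, names in localhost_entries(sample):
        family = "IPv6" if ":" in address else "IPv4"
        print(family, address, names)
```

An IPv6 line here only shows that the name resolves to ::1. Whether that address can actually be bound depends on the container's network configuration.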

I get the same error in an Ubuntu 18.04 container running in JupyterHub on Kubernetes. The output of cat /etc/hosts | grep localhost is the same:

127.0.0.1       localhost
::1     localhost ip6-localhost ip6-loopback

The error went away when setting ray.init(dashboard_host="127.0.0.1"). I believe the argument name has changed since @semin-park's answer.

JarnoRFB on 18 Aug 2020

👍4
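
Since the keyword was renamed between Ray releases (webui_host in the earlier answer, dashboard_host here), a script that must run on both can pick the keyword by inspecting the signature of ray.init. This is a sketch under that assumption; the helper name dashboard_host_kwargs is hypothetical, and it only works if the installed ray.init declares one of the two parameters explicitly.

```python
import inspect


def dashboard_host_kwargs(init_fn, host: str = "127.0.0.1") -> dict:
    """Build the host keyword argument that init_fn actually accepts.

    Checks the two names mentioned in this thread, newest first, and
    returns an empty dict if neither is a declared parameter.
    """
    params = inspect.signature(init_fn).parameters
    for name in ("dashboard_host", "webui_host"):
        if name in params:
            return {name: host}
    return {}


# Intended usage (requires Ray to be installed):
#   import ray
#   ray.init(**dashboard_host_kwargs(ray.init))
```

Passing an empty dict leaves the default host in place, so the call never raises the TypeError for an unknown keyword.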

I had the exact same error. Solved it by adding

ray.init(webui_host='127.0.0.1') at the beginning of the python file.

It seems like hostname '::1' or 'localhost' are sometimes not recognized.

I tried to follow your steps but got the following error:

TypeError: init() got an unexpected keyword argument 'webui_host'