I am building a Docker image with my branch and am unable to start the dashboard. Node 13.x is installed. The issue appears to be a port conflict. Perhaps there is something already listening on 8265?
$ docker logs -f rl-actor
[ray] Forcing OMP_NUM_THREADS=1 to avoid performance degradation with many workers (issue #6998). You can override this by explicitly setting OMP_NUM_THREADS.
/opt/conda/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
from ._conv import register_converters as _register_converters
2020-02-07 15:50:54,893 WARNING services.py:592 -- setpgrp failed, processes may not be cleaned up properly: [Errno 1] Operation not permitted.
2020-02-07 15:50:54,894 INFO resource_spec.py:212 -- Starting Ray with 35.25 GiB memory available for workers and up to 17.64 GiB for objects. You can adjust these settings with ray.init(memory=<bytes>, object_store_memory=<bytes>).
2020-02-07 15:50:55,481 INFO services.py:1093 -- View the Ray dashboard at localhost:8265
2020-02-07 15:50:58,493 WARNING worker.py:1071 -- The dashboard on node c9ba97c06401 failed with the following error:
Traceback (most recent call last):
File "/opt/conda/lib/python3.6/asyncio/base_events.py", line 1045, in create_server
sock.bind(sa)
OSError: [Errno 99] Cannot assign requested address
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ray/python/ray/dashboard/dashboard.py", line 760, in <module>
dashboard.run()
File "/ray/python/ray/dashboard/dashboard.py", line 335, in run
aiohttp.web.run_app(self.app, host=self.host, port=self.port)
File "/opt/conda/lib/python3.6/site-packages/aiohttp/web.py", line 433, in run_app
reuse_port=reuse_port))
File "/opt/conda/lib/python3.6/asyncio/base_events.py", line 468, in run_until_complete
return future.result()
File "/opt/conda/lib/python3.6/site-packages/aiohttp/web.py", line 359, in _run_app
await site.start()
File "/opt/conda/lib/python3.6/site-packages/aiohttp/web_runner.py", line 104, in start
reuse_port=self._reuse_port)
File "/opt/conda/lib/python3.6/asyncio/base_events.py", line 1049, in create_server
% (sa, err.strerror.lower()))
OSError: [Errno 99] error while attempting to bind on address ('::1', 8265, 0, 0): cannot assign requested address
Here is the Dockerfile I'm using, which is based off base-deps:
FROM tensorflow/tensorflow:nightly-gpu-py3
# install ray dependencies
RUN apt-get update \
&& apt-get install -y \
curl \
tmux \
screen \
rsync \
apt-transport-https \
zlib1g-dev \
libgl1-mesa-dev \
git \
wget \
cmake \
build-essential \
curl \
unzip \
&& apt-get clean \
&& echo 'export PATH=/opt/conda/bin:$PATH' > /etc/profile.d/conda.sh \
&& wget \
--quiet 'https://repo.continuum.io/archive/Anaconda3-5.2.0-Linux-x86_64.sh' \
-O /tmp/anaconda.sh \
&& /bin/bash /tmp/anaconda.sh -b -p /opt/conda \
&& rm /tmp/anaconda.sh \
&& /opt/conda/bin/conda install -y \
libgcc \
&& /opt/conda/bin/conda clean -y --all \
&& /opt/conda/bin/pip install \
flatbuffers \
cython==0.29.0 \
numpy==1.15.4
ENV PATH "/opt/conda/bin:$PATH"
RUN conda remove -y --force wrapt
RUN pip install -U pip
# To avoid the following error on Jenkins:
# AttributeError: 'numpy.ufunc' object has no attribute '__module__'
RUN /opt/conda/bin/pip uninstall -y dask
ENV PATH "/opt/conda/bin:$PATH"
# For Click
ENV LC_ALL=C.UTF-8
ENV LANG=C.UTF-8
RUN pip install gym[atari]==0.10.11 opencv-python-headless lz4 pytest-timeout smart_open torch torchvision
RUN pip install --upgrade bayesian-optimization
RUN pip install --upgrade hyperopt==0.1.2
RUN pip install ConfigSpace==0.4.10
RUN pip install --upgrade sigopt nevergrad scikit-optimize hpbandster lightgbm xgboost tensorboardX
RUN pip install -U mlflow
# Quote the requirement so the shell does not treat ">=0.3.1" as an output redirect
RUN pip install -U 'pytest-remotedata>=0.3.1'
# install custom ray branch
RUN git clone --single-branch --branch warmstart2 https://github.com/thavlik/ray.git
RUN ray/ci/travis/install-bazel.sh
WORKDIR /ray/python
RUN pip install -U -e . --verbose
RUN python ray/setup-dev.py --yes
# install node and build dashboard
RUN curl -sL https://deb.nodesource.com/setup_13.x | bash -
RUN apt-get install -y nodejs
RUN cd ray/dashboard/client && npm ci && npm run build
# install dependencies for my python project
RUN pip install tqdm==4.41.1 \
tensorflow-gpu==2.1.0 \
tensorboard==2.1.0 \
Keras==2.3.1 \
absl-py==0.9.0 \
boto3==1.11.1 \
psutil==5.6.7 \
gym==0.15.4 \
GPUtil==1.4.0 \
opencv-python==4.1.2.30 \
lz4==3.0.2 \
setproctitle==1.1.10 \
tensorboardX==2.0
Running any tune experiment produces the warning.
I highly doubt that it is a port problem, because the dashboard increments the port number if the requested one is unavailable before it starts the dashboard process (for example, if 8265 is already in use, it tries 8266). Many things can cause OSError: [Errno 99] Cannot assign requested address, but I assume it is related to Docker. @wuisawesome Any thoughts?
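To illustrate why incrementing the port would not help here, below is a minimal sketch of a port-probing loop. It is not Ray's actual implementation; `first_free_port` and its parameters are hypothetical names chosen for this example. The loop only skips ports that fail with EADDRINUSE (errno 98 on Linux). The traceback above reports errno 99 (EADDRNOTAVAIL), which concerns the address rather than the port, so a loop like this would re-raise it instead of moving on to 8266.

```python
import errno
import socket


def first_free_port(start, host="127.0.0.1", max_tries=20):
    """Return the first port >= start that can be bound on host.

    Hypothetical helper for illustration only, not Ray's code.
    """
    for port in range(start, start + max_tries):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, port))
            except OSError as exc:
                if exc.errno == errno.EADDRINUSE:
                    # Port is taken: try the next one.
                    continue
                # Any other failure (for example EADDRNOTAVAIL) is about
                # the address itself, so another port would not fix it.
                raise
            return port
    raise RuntimeError("no free port found in the probed range")
```

Calling `first_free_port(8265)` returns 8265 when nothing is bound there, and a higher port otherwise.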
Does adding --webui-host 0.0.0.0 to ray start work to mitigate this?
I am not using ray start - this is with the tune.run API.
I had the exact same error. Solved it by adding
ray.init(webui_host='127.0.0.1') at the beginning of the python file.
It seems like hostname '::1' or 'localhost' are sometimes not recognized.
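A quick way to check whether this is what is happening in a given container is to try binding each candidate host directly. The sketch below is only a diagnostic aid under stated assumptions: `can_bind` is a hypothetical helper, not part of Ray or aiohttp. It returns False if any address the hostname resolves to cannot be bound, which mirrors the failure in the traceback, where 'localhost' resolved to '::1' and the bind on that address failed.

```python
import socket


def can_bind(host, port=0):
    """Return True if every address that host resolves to can be bound.

    Hypothetical diagnostic helper. Port 0 lets the OS pick a free port,
    so a failure points at the address rather than at a port conflict.
    """
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        # The hostname does not resolve at all.
        return False
    for family, socktype, proto, _canonname, sockaddr in infos:
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.bind(sockaddr)
        except OSError:
            # Covers "Cannot assign requested address" and an
            # unsupported address family (e.g. IPv6 disabled).
            return False
    return True
```

Comparing `can_bind("127.0.0.1")`, `can_bind("::1")`, and `can_bind("localhost")` inside the container should show whether the IPv6 loopback is the address that fails.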
This fixes the issue for me as well. Thank you.
For reference, this issue appears to be described in more detail here: https://github.com/aio-libs/aiohttp/issues/4554
Should this be the default?
I had the same issue. I'm using the ray command line API, and adding --webui-host 0.0.0.0 works for me!
Could the next person who runs into this issue please post the output of cat /etc/hosts | grep localhost and share your OS as well? It would be very useful for understanding how widespread link local ipv6 addresses are for Ray users.
I'm in a Docker container.
127.0.0.1 localhost
::1 localhost ip6-localhost ip6-loopback
I get the same error in an Ubuntu 18.04 container running in JupyterHub on Kubernetes. The output of cat /etc/hosts | grep localhost is also
127.0.0.1 localhost
::1 localhost ip6-localhost ip6-loopback
The error went away when setting ray.init(dashboard_host="127.0.0.1"). I believe the argument name was changed since @semin-park's answer.
I had the exact same error. Solved it by adding
ray.init(webui_host='127.0.0.1') at the beginning of the python file.
It seems like hostname '::1' or 'localhost' are sometimes not recognized.
I tried to follow your steps but got the following errors
TypeError: init() got an unexpected keyword argument 'webui_host'
The argument name was changed to ray.init(dashboard_host="127.0.0.1")
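For scripts that need to run against both older and newer Ray versions, one option is to pick the keyword by inspecting the signature of the init function. This is a minimal sketch, not an official Ray utility: `dashboard_host_kwargs` is a hypothetical helper, and it assumes the installed version exposes either `dashboard_host` or `webui_host` as a named parameter that `inspect.signature` can see.

```python
import inspect


def dashboard_host_kwargs(init_fn, host="127.0.0.1"):
    """Return the keyword argument that init_fn accepts for the dashboard host.

    Hypothetical helper. Prefers the newer name (dashboard_host) and falls
    back to the older one (webui_host). Returns an empty dict if neither
    appears in the signature.
    """
    params = inspect.signature(init_fn).parameters
    for name in ("dashboard_host", "webui_host"):
        if name in params:
            return {name: host}
    return {}
```

It would be used as `ray.init(**dashboard_host_kwargs(ray.init))` after `import ray`. If neither keyword is found, the call falls through to a plain `ray.init()`.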
it works. thanks.