Ray: Unable to start ray dashboard on red hat

Created on 21 Aug 2020 · 32 Comments · Source: ray-project/ray

OS:

Red Hat Enterprise Linux Workstation release 7.6 (Maipo)

pip list gives us this

Package Version


aiohttp 3.6.2
aioredis 1.3.1
alabaster 0.7.12
async-timeout 3.0.1
asyncio 3.4.3
atomicwrites 1.4.0
attrs 19.3.0
autograd 1.3
Babel 2.8.0
beautifulsoup4 4.9.1
blessings 1.7
cachetools 4.1.1
certifi 2020.6.20
chardet 3.0.4
click 7.1.2
cma 2.7.0
colorama 0.4.3
colorful 0.5.4
contextvars 2.4
cycler 0.10.0
dill 0.3.2
docutils 0.16
filelock 3.0.12
future 0.18.2
google 3.0.0
google-api-core 1.22.1
google-auth 1.20.1
googleapis-common-protos 1.52.0
gpustat 0.6.0
grpcio 1.30.0
hiredis 1.1.0
idna 2.10
idna-ssl 1.1.0
imagesize 1.2.0
immutables 0.14
importlib-metadata 1.7.0
iniconfig 1.0.1
Jinja2 2.11.2
joblib 0.16.0
jsonschema 3.2.0
kiwisolver 1.2.0
MarkupSafe 1.1.1
matplotlib 3.3.0
more-itertools 8.4.0
msgpack 1.0.0
multidict 4.7.6
numpy 1.19.1
numpydoc 1.1.0
nvidia-ml-py3 7.352.0
opencensus 0.7.10
opencensus-context 0.1.1
packaging 20.4
pandas 1.1.0
Pillow 7.2.0
pip 20.2.2
pip2pi 0.8.1
pluggy 0.13.1
POAP 0.1.26
prometheus-client 0.8.0
protobuf 3.12.4
psutil 5.7.2
py 1.9.0
py-spy 0.3.3
pyasn1 0.4.8
pyasn1-modules 0.2.8
pyDOE2 1.3.0
Pygments 2.6.1
pymoo 0.4.1
pyparsing 2.4.7
pyrsistent 0.16.0
pySOT 0.3.3
pytest 6.0.1
python-dateutil 2.8.1
pytz 2020.1
PyYAML 5.3.1
ray 0.8.6
redis 3.4.1
requests 2.24.0
rsa 4.6
SALib 1.3.11
scikit-learn 0.23.1
scipy 1.5.2
setuptools 49.6.0
six 1.15.0
snowballstemmer 2.0.0
soupsieve 2.0.1
Sphinx 3.1.2
sphinxcontrib-applehelp 1.0.2
sphinxcontrib-devhelp 1.0.2
sphinxcontrib-htmlhelp 1.0.3
sphinxcontrib-jsmath 1.0.1
sphinxcontrib-qthelp 1.0.3
sphinxcontrib-serializinghtml 1.1.4
threadpoolctl 2.1.0
toml 0.10.1
typing-extensions 3.7.4.2
urllib3 1.25.10
wheel 0.35.1
yarl 1.5.1
zipp 3.1.0

What is your question?

### Question 1
What is the standard Linux OS / Python version that everyone uses and that can run Ray with just pip install and ray start --head? Please provide the details of a working setup that I can replicate at my end so that things work.

### We are unable to start the Ray dashboard. On checking the logs

dashboard.err contains the following, which shows that the dashboard is not able to bind to an address:

I0821 15:41:40.269484 70819 70819 service_based_accessor.cc:1248] Reestablishing subscription for object locations.
I0821 15:41:40.269490 70819 70819 service_based_accessor.cc:1368] Reestablishing subscription for worker failures.
I0821 15:41:40.269497 70819 70819 service_based_gcs_client.cc:86] ServiceBasedGcsClient Connected.
2020-08-21 15:41:40,336 INFO node_stats.py:172 -- NodeStats: subscribed to RAY_REPORTER.*
2020-08-21 15:41:40,336 INFO node_stats.py:176 -- NodeStats: subscribed to RAY_LOG_CHANNEL
2020-08-21 15:41:40,336 INFO node_stats.py:180 -- NodeStats: subscribed to 9
2020-08-21 15:41:40,337 INFO node_stats.py:184 -- NodeStats: subscribed to b'ACTOR:*'
Traceback (most recent call last):
File "/grid/common/pkgs/python/v3.6.8/lib/python3.6/asyncio/base_events.py", line 1062, in create_server
sock.bind(sa)
OSError: getsockaddrarg: bad family

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/manishagarwal/RAYBASIC/lib/python3.6/site-packages/ray/dashboard/dashboard.py", line 974, in
raise e
File "/home/manishagarwal/RAYBASIC/lib/python3.6/site-packages/ray/dashboard/dashboard.py", line 961, in
dashboard.run()
File "/home/manishagarwal/RAYBASIC/lib/python3.6/site-packages/ray/dashboard/dashboard.py", line 576, in run
aiohttp.web.run_app(self.app, host=self.host, port=self.port)
File "/home/manishagarwal/RAYBASIC/lib/python3.6/site-packages/aiohttp/web.py", line 433, in run_app
reuse_port=reuse_port))
File "/grid/common/pkgs/python/v3.6.8/lib/python3.6/asyncio/base_events.py", line 484, in run_until_complete
return future.result()
File "/home/manishagarwal/RAYBASIC/lib/python3.6/site-packages/aiohttp/web.py", line 359, in _run_app
await site.start()
File "/home/manishagarwal/RAYBASIC/lib/python3.6/site-packages/aiohttp/web_runner.py", line 104, in start
reuse_port=self._reuse_port)
File "/grid/common/pkgs/python/v3.6.8/lib/python3.6/asyncio/base_events.py", line 1066, in create_server
% (sa, err.strerror.lower()))
AttributeError: 'NoneType' object has no attribute 'lower'


All 32 comments

@robertnishihara and others, please help me, as I am stuck getting this started. I fear Red Hat may not be an appropriate OS. Please help me with the right configuration so that I can get started.

https://github.com/majek/puka/issues/27

Can you check if this issue is related?

@rkooo567 I do not see puka being used on my machine. I did pip install puka, but I don't think the version I have uses it.

Could you kindly share your machine details, such as the OS and pip list output?

I don't see any reason why it wouldn't work on Red Hat, but here is the Linux version information for a machine where the dashboard works.

cat /etc/os-release
NAME="Ubuntu"
VERSION="20.04.1 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.1 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal

Ray officially supports Python 3.6, 3.7, and 3.8.

I personally don't think the issue is related to the versions of your Python dependencies. It seems like the issue is that the dashboard process cannot bind to the localhost:8265 address.

The related issue says that the reporter's Python had IPv6 disabled, and that was the cause of the problem. I'd like you to check whether that is the case for you too.
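One way to check the IPv6 question above is to ask Python directly. The sketch below is not from the thread: it uses only the standard `socket` module, and the helper name `resolve_families` is made up for illustration. It lists the address families a host name resolves to and reports whether the interpreter was built with IPv6 support.

```python
import socket


def resolve_families(host, port):
    """Return (family name, address) pairs that `host` resolves to for TCP."""
    results = []
    for family, _type, _proto, _canon, sockaddr in socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM
    ):
        # family is normally an AddressFamily enum; fall back to str() if not.
        results.append((getattr(family, "name", str(family)), sockaddr[0]))
    return results


if __name__ == "__main__":
    # socket.has_ipv6 reports whether this Python build supports IPv6 at all.
    print("Python built with IPv6 support:", socket.has_ipv6)
    try:
        # 8265 is the dashboard port mentioned in this thread.
        for family_name, address in resolve_families("localhost", 8265):
            print(family_name, address)
    except socket.gaierror as exc:
        print("localhost did not resolve:", exc)
```

If `localhost` resolves to an `AF_INET6` address first while `socket.has_ipv6` is `False`, that would be consistent with the `bad family` error in the log. Treat it as a diagnostic hint, not a confirmed root cause.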

@rkooo567 1) I have changed the port number as well, but it still does not help. 2) I am on a Red Hat virtual machine that does not have an IPv6 address, so even if IPv6 is disabled I don't think that should be an issue. 3) I tried on a full machine as well, but the dashboard did not come up there either.

@rkooo567 Can you provide your "pip list" output? Since Red Hat is not a suspect, maybe I will take your pip list and install all of those packages in a Python virtual environment.

Sure. Just note that I am running it with pip install -e ., but it should have dependencies that work. It might have unrelated deps too.

Package                       Version     Location
----------------------------- ----------- ---------------------------------
absl-py                       0.9.0
aiohttp                       3.6.2
aiohttp-cors                  0.7.0
aioredis                      1.3.1
alabaster                     0.7.12
amqp                          2.5.2
appdirs                       1.4.3
appnope                       0.1.0
astor                         0.8.1
astunparse                    1.6.3
async-timeout                 3.0.1
asyncpg                       0.20.1
atari-py                      0.2.6
attrs                         19.3.0
awscli                        1.18.6
Babel                         2.8.0
backcall                      0.1.0
beautifulsoup4                4.8.2
billiard                      3.6.3.0
black                         19.10b0
bleach                        3.1.5
blessings                     1.7
blist                         1.3.6
boto3                         1.12.6
botocore                      1.15.6
cachetools                    4.0.0
celery                        4.4.2
certifi                       2019.11.28
chardet                       3.0.4
Click                         7.0
cloudpickle                   1.3.0
colorama                      0.4.3
colorful                      0.5.4
commonmark                    0.8.1
coverage                      5.1
dataclasses                   0.6
decorator                     4.4.1
Deprecated                    1.2.9
dm-tree                       0.1.5
docker                        4.2.0
docutils                      0.14
entrypoints                   0.3
fastapi                       0.54.1
filelock                      3.0.12
flake8                        3.7.7
flake8-quotes                 2.1.1
Flask                         1.1.1
flatbuffers                   1.12
flynt                         0.52
funcsigs                      1.0.2
future                        0.18.2
gast                          0.3.3
gitdb                         4.0.2
GitPython                     3.1.0
google                        2.0.3
google-api-core               1.21.0
google-auth                   1.19.0
google-auth-oauthlib          0.4.1
google-pasta                  0.2.0
googleapis-common-protos      1.52.0
gpustat                       0.6.0
grpcio                        1.31.0
gym                           0.17.1
h11                           0.9.0
h2                            3.2.0
h5py                          2.10.0
hiredis                       1.1.0
hpack                         3.0.0
hstspreload                   2020.3.25
httptools                     0.1.1
httpx                         0.12.1
hyperframe                    5.2.0
hyperopt                      0.2.3
idna                          2.9
imagesize                     1.2.0
importlib-metadata            1.5.0
ipython                       7.12.0
ipython-genutils              0.2.0
itsdangerous                  1.1.0
jedi                          0.16.0
Jinja2                        2.11.1
jmespath                      0.9.5
joblib                        0.16.0
jsonpatch                     1.25
jsonpointer                   2.0
jsonschema                    3.2.0
Keras-Preprocessing           1.1.2
keyring                       21.2.1
kombu                         4.6.8
lz4                           3.1.0
Markdown                      3.2.1
MarkupSafe                    1.1.1
mccabe                        0.6.1
mock                          1.0.1
more-itertools                8.2.0
mpmath                        1.1.0
msgpack                       1.0.0
multidict                     4.7.5
mypy                          0.782
mypy-extensions               0.4.3
networkx                      2.2
numpy                         1.18.1
nvidia-ml-py3                 7.352.0
oauthlib                      3.1.0
opencensus                    0.7.10
opencensus-context            0.1.1
opencensus-ext-prometheus     0.2.1
opencv-python                 4.2.0.34
opencv-python-headless        4.2.0.34
opentelemetry-api             0.10b0
opentelemetry-ext-prometheus  0.10b0
opentelemetry-sdk             0.10b0
opt-einsum                    3.2.1
packaging                     20.1
pandas                        1.0.1
parameterized                 0.7.4
parso                         0.6.1
pathspec                      0.7.0
pbr                           5.4.5
pexpect                       4.8.0
pickle5                       0.0.10
pickleshare                   0.7.5
Pillow                        5.4.1
pip                           20.2.2
pkginfo                       1.5.0.1
pluggy                        0.13.1
prometheus-client             0.8.0
prompt-toolkit                3.0.3
protobuf                      3.12.0
psutil                        5.7.0
psycopg2-binary               2.8.5
ptyprocess                    0.6.0
py                            1.8.1
py-spy                        0.3.3
pyaml                         20.4.0
pyarrow                       0.17.0
pyasn1                        0.4.8
pyasn1-modules                0.2.8
pycodestyle                   2.5.0
pydantic                      1.4
pyflakes                      2.1.1
PyGithub                      1.51
pyglet                        1.5.0
Pygments                      2.6.1
PyJWT                         1.7.1
pyparsing                     2.4.6
pyrsistent                    0.15.7
pytest                        5.3.5
pytest-aiohttp                0.3.0
pytest-asyncio                0.10.0
pytest-cov                    2.8.1
pytest-mock                   3.0.0
pytest-timeout                1.3.4
python-dateutil               2.8.1
pytz                          2019.3
PyYAML                        5.2
ray                           0.9.0.dev0  /Users/sangbincho/work/ray/python
readme-renderer               26.0
readthedocs-sphinx-ext        1.0.4
recommonmark                  0.5.0
redis                         3.4.1
regex                         2020.2.20
requests                      2.23.0
requests-oauthlib             1.3.0
requests-toolbelt             0.9.1
rfc3986                       1.3.2
rsa                           3.4.2
s3transfer                    0.3.3
scikit-learn                  0.23.1
scikit-optimize               0.7.4
scipy                         1.4.1
setuptools                    41.0.1
six                           1.14.0
sklearn                       0.0
smmap                         3.0.1
sniffio                       1.1.0
snowballstemmer               2.0.0
soupsieve                     2.0
Sphinx                        1.8.5
sphinx-click                  2.3.2
sphinx-copybutton             0.2.11
sphinx-gallery                0.7.0
sphinx-jsonschema             1.15
sphinx-rtd-theme              0.4.3
sphinx-tabs                   1.1.13
sphinx-version-warning        1.1.2
sphinxcontrib-applehelp       1.0.2
sphinxcontrib-devhelp         1.0.2
sphinxcontrib-htmlhelp        1.0.3
sphinxcontrib-jsmath          1.0.1
sphinxcontrib-qthelp          1.0.3
sphinxcontrib-serializinghtml 1.1.4
sphinxcontrib-websupport      1.2.3
sphinxemoji                   0.1.6
SQLAlchemy                    1.3.16
sqlalchemy-stubs              0.3
starlette                     0.13.2
style                         1.1.0
tabulate                      0.8.6
tensorboard                   2.2.0
tensorboard-plugin-wit        1.6.0.post2
tensorboardX                  2.1
tensorflow                    2.2.0
tensorflow-estimator          2.2.0
termcolor                     1.1.0
threadpoolctl                 2.1.0
toml                          0.10.0
torch                         1.4.0
torchvision                   0.5.0
tqdm                          4.38.0
traitlets                     4.3.3
tune-sklearn                  0.0.5
twine                         3.1.1
typed-ast                     1.4.1
typing-extensions             3.7.4.2
update                        0.0.1
urllib3                       1.25.8
uvicorn                       0.11.3
uvloop                        0.14.0
vine                          1.3.0
wcwidth                       0.1.8
webencodings                  0.5.1
websocket-client              0.57.0
websockets                    8.1
Werkzeug                      1.0.0
wheel                         0.34.2
wrapt                         1.12.1
yapf                          0.23.0
yarl                          1.4.2
zipp                          3.0.0

Would you mind checking whether you can start a plain aiohttp server on localhost:8265? That is all the dashboard process does at startup.

Try with this doc (https://docs.aiohttp.org/en/stable/web_quickstart.html)

@rkooo567 python -m aiohttp.web -H localhost -P 8265 package.module:init_func
aiohttp.web: error: unable to import package.module: No module named 'package'

Where is this init_func, or any other function that determines where the / request goes, defined?

Can you just try this?

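from aiohttp import web

# Minimal handler that answers every GET / with a plain-text body.
async def hello(request):
    return web.Response(text="Hello, world")

app = web.Application()
app.add_routes([web.get('/', hello)])

# Bind to the same host and port that the dashboard fails to bind to.
web.run_app(app, host='localhost', port=8265)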

@rkooo567 it gives the same error
[RAYVIRTUAL] /home/manishagarwal>python /home/manishagarwal/testserver.py
Traceback (most recent call last):
File "/grid/common/pkgs/python/v3.6.8/lib/python3.6/asyncio/base_events.py", line 1062, in create_server
sock.bind(sa)
OSError: getsockaddrarg: bad family

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/manishagarwal/testserver.py", line 9, in
web.run_app(app, host='localhost', port=8265)
File "/home/manishagarwal/RAYVIRTUAL/lib/python3.6/site-packages/aiohttp/web.py", line 433, in run_app
reuse_port=reuse_port))
File "/grid/common/pkgs/python/v3.6.8/lib/python3.6/asyncio/base_events.py", line 484, in run_until_complete
return future.result()
File "/home/manishagarwal/RAYVIRTUAL/lib/python3.6/site-packages/aiohttp/web.py", line 359, in _run_app
await site.start()
File "/home/manishagarwal/RAYVIRTUAL/lib/python3.6/site-packages/aiohttp/web_runner.py", line 104, in start
reuse_port=self._reuse_port)
File "/grid/common/pkgs/python/v3.6.8/lib/python3.6/asyncio/base_events.py", line 1066, in create_server
% (sa, err.strerror.lower()))
AttributeError: 'NoneType' object has no attribute 'lower'

Yeah, then it must definitely be an address-binding issue (outside Ray). aiohttp uses asyncio methods to bind the process to addresses, so it is very likely related to some machine setup, but I am not sure what the cause is.

Thanks @rkooo567. aiohttp is just an asynchronous HTTP client/server. Can I replace it with some other option (like Tornado or Node), or is there some configuration through which I can change this? It would help me get this resolved.

Unfortunately, the Ray dashboard is tightly coupled with aiohttp, and it is not possible to switch it to something else. I'd recommend figuring out why you cannot start an aiohttp server on your machine, if possible.

@rkooo567 I changed localhost to the specific IPv4 address of my Red Hat machine in the testserver.py that we created, and the small server started. It seems to me that it is trying an IPv6 address, which is absent because this is a virtual machine with no IPv6 address. Can I specify my IPv4 address in this command: /home/manishagarwal/VIRTRAY/bin/ray start --head

That's good news! Yes, it should be possible. Check out this flag (https://docs.ray.io/en/latest/package-ref.html#cmdoption-ray-start-dashboard-host). I think you should specify 0.0.0.0

(This might not be available in 0.8.6, but I am not sure.)

Usage: ray start [OPTIONS]
Try 'ray start --help' for help.

Error: no such option: --dashboard-host

From which version is this available?

Can you try https://docs.ray.io/en/latest/package-ref.html#cmdoption-ray-start-webui-host? I think that option is available from 0.8.7 (and your version seems to be 0.8.6)

@rkooo567 Thanks. --webui-host works. This helps me a lot. Thanks :) Should I close this, or would you like to keep it open to get it documented?

Late to the party, but I believe this is related to #7084. This issue seems to be very common, and we should come up with a permanent fix.

Would setting the default 0.0.0.0 work?

Yes, it should. We could also mitigate this by defaulting to the IPv4 loopback interface directly instead of localhost. I want to explore upstreaming a permanent patch (I think it's safe to bump our version of aiohttp?).

I think the issue is from asyncio, not aiohttp? We can try, but it is probably safe to just use 127.0.0.1. I will also check whether we have any other process that uses localhost.

Ah, this might be a different issue then (the other issue I mentioned is a known issue in aiohttp).

Hmm, I see. I think the linked issue is the same as this one, but I'm not 100% sure. I think the problem is that the asyncio method create_server translates localhost to an IPv6 address and stops looking there. Do you know which version of aiohttp resolves the issue?
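To illustrate the asyncio side of this, here is a minimal sketch (an assumption for illustration, not the eventual Ray patch) that starts a server through asyncio with the IPv4 loopback address and `family=socket.AF_INET`, so that no `localhost` lookup or IPv6 socket is involved. Port 0 lets the OS pick a free port; the dashboard itself used 8265. The names `handle` and `bind_ipv4_loopback` are hypothetical.

```python
import asyncio
import socket


async def handle(reader, writer):
    # Minimal connection handler: close every connection immediately.
    writer.close()


async def bind_ipv4_loopback(port=0):
    """Start a TCP server on the IPv4 loopback and return the bound (host, port)."""
    # family=socket.AF_INET restricts asyncio to IPv4 sockets only.
    server = await asyncio.start_server(
        handle, host="127.0.0.1", port=port, family=socket.AF_INET
    )
    bound = server.sockets[0].getsockname()
    server.close()
    await server.wait_closed()
    return bound


# An explicit event loop keeps the sketch usable on older Python 3 versions.
loop = asyncio.new_event_loop()
try:
    bound_host, bound_port = loop.run_until_complete(bind_ipv4_loopback())
finally:
    loop.close()
print("bound to", bound_host, bound_port)
```

If this binds cleanly on a machine where `host="localhost"` fails, that points at name resolution to IPv6 and not at the port or at aiohttp itself.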

We can probably ask @manishagarwal23 to try a higher version of aiohttp (though I am not sure if it promises a backward compatibility).

> Do you know which version of aiohttp resolves the issue?

Unfortunately it's not resolved. There's an abandoned PR to fix the issue. Hopefully we can just take it over and upstream a patch.

In that case, I think just using the IPv4 loopback address is a much cleaner solution than patching abandoned PRs. Let me create a PR soon.

@manishagarwal23 Would you mind quickly checking if webui-host="127.0.0.1" resolves the issue?

@rkooo567 @wuisawesome

1) First, I did what was discussed yesterday:
/home/manishagarwal/VIRTRAY/workArea/manishSamples>sudo /home/manishagarwal/VIRTRAY/bin/ray start --head --webui-host 172.23.88.147

I ran the above command on Linux and opened the dashboard from Windows. I was able to view the dashboard at http://172.23.88.147:8265/ on Windows, and it updated properly.

2) Second, I ran this command on Linux:
/home/manishagarwal/VIRTRAY/workArea/manishSamples>sudo /home/manishagarwal/VIRTRAY/bin/ray start --head --webui-host 127.0.0.1
It started, but when I tried to open the dashboard at http://172.23.88.147:8265/ from the Windows machine, it did not come up.

Yes, this is expected behavior: 127.0.0.1 binds to the loopback interface, which means you only listen for connections from the local host. Binding to your machine's IP address is also fine if you want to listen for connections from the network (on that interface), or more generally you could bind to 0.0.0.0.
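The distinction between the three kinds of bind address can be sketched with the standard `ipaddress` module. This is a rule-of-thumb classifier written for illustration (the function name `listening_scope` is made up), and it ignores firewalls and routing, which can still block a connection.

```python
import ipaddress


def listening_scope(bind_host):
    """Describe who can reach a server bound to the given IP address."""
    addr = ipaddress.ip_address(bind_host)
    if addr.is_loopback:
        # e.g. 127.0.0.1: only clients on the same machine can connect.
        return "local machine only"
    if addr.is_unspecified:
        # e.g. 0.0.0.0: the server listens on every interface.
        return "all interfaces"
    # A specific address: the server listens on that one interface.
    return "one interface"


for host in ("127.0.0.1", "172.23.88.147", "0.0.0.0"):
    print(host, "->", listening_scope(host))
```

This matches what was observed above: with 127.0.0.1 the dashboard was not reachable from the Windows machine, while binding to 172.23.88.147 made it reachable over the network.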
