Ray: Ray node connects to random redis port

Created on 19 Nov 2020 · 3 Comments · Source: ray-project/ray

Hello.

Python version: 3.6.8
Ray version: 1.0.1

What is the problem?

The second node can't provide workers even though it connects to the head successfully.

I have two CentOS machines: the first is the head, and the second connects to the head.

Redis runs on the head machine on port 6379.

First machine : ray start --head --port=6380 --dashboard-host=0.0.0.0
Second machine: ray start --address='161.97.xx.xxx:6380' --redis-password='5241590000000000'

The first node successfully starts with all workers.
The second node connects successfully, but it shows "0 workers / 4 cores" in the timeline and in the PID column of the dashboard:
...
W1119 08:22:22.445783 1513663 1513663 redis_context.cc:303] Failed to connect to Redis, retrying.
F1119 08:22:23.602082 1513663 1513663 redis_context.cc:298] Could not establish connection to redis 161.97.xx.xxx:12662 (context.err = 1)

I have no idea why the node tries to connect to Redis on port 12662 (it seems to be random). I should also mention that the remote Redis connection works fine, and I can successfully connect from the second node to the first node.
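
For anyone reproducing this, one way to narrow it down is to test plain TCP reachability of the ports in question from the second machine. The snippet below is only a minimal sketch using the Python standard library; `port_is_open` is a hypothetical helper name, and the host is a placeholder that should be replaced with the head node's address. It does not talk to Redis or Ray, it only checks whether a TCP connection can be opened.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration; it is not part of Ray.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    head_ip = "127.0.0.1"  # placeholder: replace with the head node's IP
    # 6380 is the --port given to the head; 12662 is the port from the log.
    for port in (6380, 12662):
        state = "reachable" if port_is_open(head_ip, port) else "NOT reachable"
        print(f"{head_ip}:{port} is {state}")
```

If the head's `--port` is reachable but the port from the error message is not, a firewall between the machines is a likely suspect.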

Can you please help me?
Thank you very much for your support.

bug triage

All 3 comments

Did you make sure to open all ports specified here? https://docs.ray.io/en/latest/configure.html#ports-configurations

Very useful hint!
I've turned off the firewall on both machines and everything works fine!

However, there is something I don't understand if I want to start the firewalld service again.
--min-worker-port: Minimum port number worker can be bound to. Default: 10000.
--max-worker-port: Maximum port number worker can be bound to. Default: 10999.
Do those ports need to be opened in the firewall across the machines because Ray uses them for workers?
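
If the worker port range does turn out to need opening (the ports configuration page linked above is the authority on that), the firewalld rules could be generated along these lines. This is a rough sketch: `firewall_cmd_lines` is a hypothetical helper, it only prints the `firewall-cmd` commands rather than running them, and the port list (the head's `--port` 6380 plus the default worker range) is an assumption taken from this thread, not a complete list of the ports Ray uses.

```python
def firewall_cmd_lines(port_specs):
    """Build firewall-cmd command lines that open the given TCP ports.

    port_specs holds single ports ("6380") or ranges ("10000-10999").
    Hypothetical helper for illustration; it only builds strings.
    """
    lines = [
        f"firewall-cmd --permanent --add-port={spec}/tcp" for spec in port_specs
    ]
    # Permanent rules only take effect after a reload.
    lines.append("firewall-cmd --reload")
    return lines


if __name__ == "__main__":
    # Assumed ports from this thread: head --port and the default worker range.
    for line in firewall_cmd_lines(["6380", "10000-10999"]):
        print(line)
```

The printed commands would be run as root on each machine. Any other ports listed on the configuration page would need the same treatment.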

What is the difference between --node-manager-port and --port on the head node?

Thank you.

Thank you very much guys, you've helped me a lot!
(@dHannasch very good explanation in your commit)
Regards,
Alex.
