Openshift-ansible: Cannot login to registry from Master

Created on 14 Jun 2016 · 26 Comments · Source: openshift/openshift-ansible

Hello all. I was looking at https://github.com/openshift/openshift-ansible/issues/632 and I am having a similar issue. My setup is one master and one node. I would like to do a docker login from the master to the registry, which runs on the node, but I keep getting this error:

Error response from daemon: invalid registry endpoint "http://172.30.58.204:5000/v0/". HTTPS attempt: unable to ping registry endpoint https://172.30.58.204:5000/v0/ v2 ping attempt failed with error: Get https://172.30.58.204:5000/v2/: dial tcp 172.30.58.204:5000: i/o timeout v1 ping attempt failed with error: Get https://172.30.58.204:5000/v1/_ping: dial tcp 172.30.58.204:5000: i/o timeout. HTTP attempt: unable to ping registry endpoint http://172.30.58.204:5000/v0/ v2 ping attempt failed with error: Get http://172.30.58.204:5000/v2/: dial tcp 172.30.58.204:5000: i/o timeout v1 ping attempt failed with error: Get http://172.30.58.204:5000/v1/_ping: dial tcp 172.30.58.204:5000: i/o timeout

I can log into the registry from the node, just not from the master.
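For anyone reproducing this, a minimal sketch of a connectivity probe that can be run on both the master and the node to compare results. It only attempts a plain TCP connection, so it distinguishes a timeout (as in the error above) from a refused connection; it does not speak the registry protocol. The function name `probe_tcp` is made up for illustration.

```python
import socket


def probe_tcp(host, port, timeout=3.0):
    """Try one TCP connection and classify the outcome.

    Returns "open", "refused", "timeout", or "error: <details>".
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        # No answer at all: packets are likely being dropped on the way.
        return "timeout"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on that port.
        return "refused"
    except OSError as exc:
        # Anything else, e.g. "no route to host".
        return f"error: {exc}"
```

Calling `probe_tcp("172.30.58.204", 5000)` on each host should show whether the service IP from the error message is reachable from there at all.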

kind/rfe priority/P2


irvingwa

Most helpful comment

You thought I was going to disappear without ever posting a solution, didn't you? How could you ever accuse me of something so terrible? I've narrowed it down to one of the following sysctl params:

net.ipv4.conf.default.accept_redirects = 0
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.default.send_redirects = 0
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.all.secure_redirects = 0
net.ipv4.conf.default.secure_redirects = 0

Setting these all to '1' fixed the problem... only have to narrow it down to one now.
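To check how a given host is currently configured, a rough sketch that reads the six parameters listed above straight from /proc/sys. It assumes a Linux host; the helper name `read_sysctls` and the `proc_sys` parameter are made up for illustration, and a missing or unreadable entry is reported as None.

```python
from pathlib import Path

# The six redirect-related parameters named in this comment.
REDIRECT_KEYS = [
    "net.ipv4.conf.default.accept_redirects",
    "net.ipv4.conf.all.accept_redirects",
    "net.ipv4.conf.default.send_redirects",
    "net.ipv4.conf.all.send_redirects",
    "net.ipv4.conf.all.secure_redirects",
    "net.ipv4.conf.default.secure_redirects",
]


def read_sysctls(keys, proc_sys="/proc/sys"):
    """Map each sysctl key to its current value as a string, or None if unreadable."""
    values = {}
    for key in keys:
        # net.ipv4.conf.all.x lives at /proc/sys/net/ipv4/conf/all/x
        path = Path(proc_sys, *key.split("."))
        try:
            values[key] = path.read_text().strip()
        except OSError:
            values[key] = None
    return values


if __name__ == "__main__":
    for key, value in read_sysctls(REDIRECT_KEYS).items():
        print(f"{key} = {value}")
```

Running it on a working host and on a locked-down host and comparing the two outputs is one way to narrow down which parameter matters.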

finerm on 19 Jul 2016

😄2 🎉1

All 26 comments

Hey @irvingwa, is SDN traffic allowed between your master and node (4789 UDP)?

abutcher on 14 Jun 2016

Yup.

From the master:

Chain OS_FIREWALL_ALLOW (1 references)
ACCEPT udp -- anywhere anywhere state NEW udp dpt:4789

From the node:

Chain OS_FIREWALL_ALLOW (1 references)
ACCEPT udp -- anywhere anywhere state NEW udp dpt:4789
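If you want to script this check across hosts, here is a small sketch that scans `iptables -L` style output for an ACCEPT rule on a given UDP port. It is a plain text match on the listing format shown above, not a full rule parser, and the function name `allows_udp_port` is hypothetical. It expects numeric ports in the listing, as in the output pasted here.

```python
def allows_udp_port(iptables_listing, port):
    """Return True if any ACCEPT line for udp carries the token dpt:<port>."""
    needle = f"dpt:{port}"
    for line in iptables_listing.splitlines():
        fields = line.split()
        # Rule lines start with the target, followed by the protocol.
        if len(fields) >= 2 and fields[0] == "ACCEPT" and fields[1] == "udp":
            # Compare whole tokens so dpt:4789 does not match dpt:47890.
            if needle in fields:
                return True
    return False


# The listing posted above, used as sample input.
SAMPLE = """Chain OS_FIREWALL_ALLOW (1 references)
ACCEPT udp -- anywhere anywhere state NEW udp dpt:4789
"""

if __name__ == "__main__":
    print(allows_udp_port(SAMPLE, 4789))
```

Note that a matching rule only shows the port is allowed by iptables on that host; it says nothing about whether the traffic actually arrives.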

irvingwa on 14 Jun 2016

@irvingwa Is there a firewall between these two systems? Are they in a cloud environment where a security group or network ACL could be interfering?

abutcher on 14 Jun 2016

No firewall running. No security group or network ACL.

irvingwa on 14 Jun 2016

Is your master also configured as a node and set to unschedulable? I'm assuming it is since 4789/udp has been added to iptables on the master.

If everything appears to be in order with networking you should walk through the SDN troubleshooting guide.

abutcher on 14 Jun 2016

The master is set to READY. I also looked at the firewalld log and am getting some errors. Does something try to add iptables rules like '/sbin/iptables -w2 -t filter -I IN_public_allow 1 -m tcp -p tcp -m limit --limit 25/minute --limit-burst 100 -j ACCEPT'? If so, that is not where my iptables lives. Mine is under /usr/sbin/iptables.
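On RHEL 7, /sbin is normally a symlink to /usr/sbin, in which case both paths point at the same binary. A quick sketch to confirm that on a given host (the helper name `same_binary` is made up for illustration):

```python
import os


def same_binary(path_a, path_b):
    """True if both paths exist and resolve, through symlinks, to the same file."""
    if not (os.path.exists(path_a) and os.path.exists(path_b)):
        return False
    return os.path.realpath(path_a) == os.path.realpath(path_b)


if __name__ == "__main__":
    print(same_binary("/sbin/iptables", "/usr/sbin/iptables"))
```

If this prints False, a caller that hard-codes /sbin/iptables would indeed fail to find the binary.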

irvingwa on 14 Jun 2016

Is there any sort of dependency on iptables forwarding for OpenShift networking that you know of?

irvingwa on 14 Jun 2016

@abutcher Sorry, one last question. Looking at that troubleshooting doc, when I run ip route I don't see any 10.128.x.x lines. I am thinking this might be the issue.
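A minimal sketch of that check, for anyone who wants to automate it: it takes the text printed by `ip route` and lists the routes that fall inside the cluster network. The cluster CIDR is a parameter because it depends on the install (10.128.0.0/14 here is taken from the 10.128.x.x lines mentioned above). The sample routing table and the function name `routes_within` are hypothetical.

```python
import ipaddress


def routes_within(ip_route_output, cluster_cidr):
    """Return the `ip route` lines whose destination lies inside cluster_cidr."""
    cluster = ipaddress.ip_network(cluster_cidr)
    matches = []
    for line in ip_route_output.splitlines():
        fields = line.split()
        if not fields or fields[0] == "default":
            continue
        try:
            dest = ipaddress.ip_network(fields[0], strict=False)
        except ValueError:
            # Lines that do not start with an address (e.g. "unreachable ...").
            continue
        if dest.version == cluster.version and dest.subnet_of(cluster):
            matches.append(line.strip())
    return matches


# Hypothetical routing table, for illustration only.
SAMPLE_ROUTES = """default via 192.168.1.1 dev eth0
10.128.0.0/14 dev tun0 proto kernel scope link
172.30.0.0/16 dev tun0 scope link
192.168.1.0/24 dev eth0 proto kernel scope link src 192.168.1.10
"""

if __name__ == "__main__":
    print(routes_within(SAMPLE_ROUTES, "10.128.0.0/14"))
```

An empty result on a node would match the symptom described here: no route into the cluster network.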

irvingwa on 14 Jun 2016

Are there any relevant errors in your node logs?

abutcher on 15 Jun 2016

There are a good number of errors in there, and I don't see any "Output of setup script:" line anywhere in the log.
Output.txt

irvingwa on 15 Jun 2016

Should the ovs-ofctl command be on my nodes?

irvingwa on 16 Jun 2016

@irvingwa Yes, ovs-ofctl will be available on nodes.

From the logs, it looks like there are some issues with DNS timeouts. Is port 53/udp open on your master? I saw one other timeout talking to etcd on port 2379/tcp.

abutcher on 16 Jun 2016

Both are open. I ran the debug script on my master and when it ran on my node with the registry it printed:

Could not find port for 10.1.0.2!

irvingwa on 16 Jun 2016

Could you post the full output from the debug script?

abutcher on 20 Jun 2016

https://www.dropbox.com/s/a1sij9rkk5iehz2/debug.tar?dl=0

Sorry, this setup is a little different: 1 master and 2 nodes. The registry is on Node 2 and Node 1 can't log in to it.

Edit: Sorry for changing the setup; it seemed that most people were doing it that way.

irvingwa on 20 Jun 2016

I am experiencing the same issue as @irvingwa. I am performing a containerized installation with the latest version of openshift-ansible. My environment is as follows:

RHEL 7.2
Docker 1.9.1-40
One non-schedulable master + node
One schedulable node
(Non-SDN) networking between the two servers is established, and there is no firewall other than iptables.
I have tried with both v1.1.6 of the origin images and the very latest, with the same results.

My registry is successfully deployed on the node and 'oc get service' shows it listening on the 172.30.0.0/16 network. On the node where the registry is deployed, I can telnet into something like 172.30.71.188:5000. I cannot do so from the master, though. traceroute seems to show the traffic just dying at the 10.1.0.1 interface on the master.

I have executed the debug.sh script as well, but like @irvingwa, I receive a "Could not find port for 10.1.1.3!" error. Regardless, the output is available at the link below:

https://www.dropbox.com/s/2aqcgsqysu8f39p/openshift-sdn-debug-2016-06-21.tgz?dl=0

finerm on 21 Jun 2016

Should docker0 be listed when I do an ip route? It seems to be missing.

irvingwa on 21 Jun 2016

"I can log into the registry from the node though just not the master."

The master doesn't automatically have access to the SDN; in order to be able to access pods, the master needs to also be made an (unschedulable) OpenShift node. The ansible install handles this automatically; how did you install this cluster?

danwinship on 22 Jun 2016

Yes, I marked the node as unschedulable.

irvingwa on 22 Jun 2016

No, in the debug output you provided, the master host is not running atomic-openshift-node.

danwinship on 22 Jun 2016

Yeah, sorry. When it was not working I switched my setup (to 1 master and 2 nodes), thinking it had something to do with the master. The registry is on node 2 and I can't log in from node 1.

irvingwa on 22 Jun 2016

I have discovered that I can only reproduce this problem using a specific RHEL VM template; one that has security lockdowns. This is likely what @irvingwa is experiencing as well (we are coworkers). I am in the process of trying to identify what the offending configuration is and I hope to post it here for posterity.

finerm on 23 Jun 2016

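You thought I was going to disappear without ever posting a solution, didn't you? How could you ever accuse me of something so terrible? I've narrowed it down to one of the following sysctl params:

net.ipv4.conf.default.accept_redirects = 0
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.default.send_redirects = 0
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.all.secure_redirects = 0
net.ipv4.conf.default.secure_redirects = 0

Setting these all to '1' fixed the problem... only have to narrow it down to one now.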

finerm on 19 Jul 2016

😄2 🎉1

When we get a pre-requisites playbook in place, we should test for the problematic sysctl(s).

detiber on 19 Jul 2016

So, this is no longer an issue for us and I wish I had a cleaner answer as to why, but I think this might remain a mystery of our wacky environment. We did the following and can no longer reproduce the issue (and no longer need to tinker with kernel params):

Updated openshift-ansible to the latest (from a version that was from a few months ago)
Installed Docker manually instead of via Puppet module on the master/node

Prior to doing the above, tinkering with kernel params got past the problem. My best guess is that something about how we were installing Docker + how we locked down our VMs was a problem.

If it somehow helps someone in the future, we were using the garethr/docker Puppet module with near-vanilla settings to install Docker plus the kernel parameters in my previous post (set to =1 instead of =0).

Thanks for your help, gents.

finerm on 19 Jul 2016

@finerm thanks for the follow up, I'll go ahead and close out this issue for now.

detiber on 19 Jul 2016
