Origin: Get https://docker-registry.default.svc:5000/v2/: net/http: TLS handshake timeout

Created on 18 Apr 2019 · 15 Comments · Source: openshift/origin

Version

oc version
oc v3.11.0+62803d0-1
kubernetes v1.11.0+d4cacc0
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://master.service.dc1.consul:443
openshift v3.11.0+5a84bad-168
kubernetes v1.11.0+d4cacc0

Steps To Reproduce

Fresh install of OpenShift 3.11
Try to deploy the default Apache template to test the cluster

Current Result

Cloning "https://github.com/openshift/httpd-ex.git " ...
--
  | Commit: 0ac6da93a1f65fe9175cb1b7838cfca7b23d5fbe (Merge pull request #15 from adambkaplan/sclorg-rename)
  | Author: Honza Horak <[email protected]>
  | Date:   Fri Aug 3 13:08:12 2018 +0200
  | pulling image error : Get https://docker-registry.default.svc:5000/v2/:  net/http: TLS handshake timeout
  | error: build error: unable to get docker-registry.default.svc:5000/openshift/httpd@sha256:d1256b39182b0ac5290c946dc44fc11055524683113a1b5e3a55d83044a185cb

Expected Result

The deployment succeeds

Additional Information

I haven't overridden the certificate. As the doc says, "you should not need to replace the certificate used by the registry service itself."

The registry pod is running

oc get pod
NAME                       READY     STATUS    RESTARTS   AGE
docker-registry-1-94xbh    1/1       Running   0          1d
registry-console-1-mn8tb   1/1       Running   0          1d
router-1-jfdtp             1/1       Running   0          1d
router-1-sr44x             1/1       Running   0          1d

But when I try to connect to the service's cluster IP, the connection is reset during the TLS handshake (errno 104):

oc get service
NAME               TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                   AGE
docker-registry    ClusterIP   172.30.95.44    <none>        5000/TCP                  1d
kubernetes         ClusterIP   172.30.0.1      <none>        443/TCP,53/UDP,53/TCP     1d
openshift-master   ClusterIP   172.30.34.165   <none>        443/TCP                   1d
registry-console   ClusterIP   172.30.141.37   <none>        9000/TCP                  1d
router             ClusterIP   172.30.32.90    <none>        80/TCP,443/TCP,1936/TCP   1d

openssl s_client -connect 172.30.95.44:5000
CONNECTED(00000003)
write:errno=104
---
no peer certificate available
---
No client certificate CA names sent
---
SSL handshake has read 0 bytes and written 289 bytes
---
New, (NONE), Cipher is (NONE)
Secure Renegotiation IS NOT supported
Compression: NONE
Expansion: NONE
No ALPN negotiated
SSL-Session:
    Protocol  : TLSv1.2
    Cipher    : 0000
    Session-ID: 
    Session-ID-ctx: 
    Master-Key: 
    Key-Arg   : None
    Krb5 Principal: None
    PSK identity: None
    PSK identity hint: None
    Start Time: 1555591482
    Timeout   : 300 (sec)
    Verify return code: 0 (ok)
---
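
The openssl output above shows the TCP connection succeeding ("CONNECTED") and the failure happening only afterwards, during the handshake. As a minimal sketch of the same check, the hypothetical helper `probe_tls` below (not part of any OpenShift tooling) reports which stage fails. Certificate verification is deliberately disabled, since the only question here is whether the handshake completes at all.

```python
import socket
import ssl


def probe_tls(host: str, port: int, timeout: float = 5.0) -> str:
    """Return 'tcp' if the TCP connect fails, 'handshake' if TLS fails, else 'ok'."""
    try:
        raw = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return "tcp"

    # Verification is switched off on purpose: this probe only checks
    # whether a handshake completes, not whether the certificate is trusted.
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE

    try:
        # wrap_socket performs the handshake using the socket's timeout.
        with ctx.wrap_socket(raw, server_hostname=host):
            return "ok"
    except OSError:
        # Covers ssl.SSLError, connection resets and handshake timeouts.
        raw.close()
        return "handshake"
```

Calling `probe_tls("172.30.95.44", 5000)` from a node would be expected to return `"handshake"` in the situation described in this issue, assuming the behaviour matches the openssl output.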

I checked that the registry certificate validates against the CA:

openssl verify -verbose -CAfile /etc/origin/master/ca.crt /etc/origin/master/registry.crt 
/etc/origin/master/registry.crt: OK

Info about the certificate:

sudo openssl x509 -in /etc/origin/master/registry.crt -text -noout
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number: 14 (0xe)
    Signature Algorithm: sha256WithRSAEncryption
        Issuer: CN=openshift-signer@1555495169
        Validity
            Not Before: Apr 17 10:08:49 2019 GMT
            Not After : Apr 16 10:08:50 2021 GMT
        Subject: CN=172.30.95.44
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
                Public-Key: (2048 bit)
                Modulus:
                    00:b4:ff:6c:a7:2d:2b:35:22:d6:21:b6:5a:45:e9:
                    f9:b5:42:f9:a8:38:60:90:48:71:10:28:bf:55:cf:
                    aa:d5:48:0a:70:62:fc:4f:97:52:de:aa:ad:d6:8d:
                    39:60:9a:64:d2:c2:20:98:91:65:01:b8:2a:e8:fb:
                    e5:6f:f8:96:c0:19:6d:62:c2:6f:74:72:43:eb:0d:
                    f8:bd:18:5e:e3:8b:83:00:f8:22:c1:96:f5:ad:74:
                    c0:18:38:99:c7:74:5c:3c:19:07:20:c5:9e:6c:fe:
                    61:36:07:1c:fa:6b:3f:da:eb:24:90:ea:19:53:34:
                    1c:4a:45:9c:b3:39:2f:f1:52:52:ed:4e:fe:35:cd:
                    b6:6d:81:4f:f5:2c:65:7a:c3:35:4a:da:03:a8:79:
                    41:fc:6a:62:63:1c:49:b4:c8:6e:90:2c:8e:ed:7e:
                    ee:81:41:ab:da:49:77:11:4a:8c:5e:c0:c1:20:89:
                    b7:9f:b3:37:56:0b:d9:2d:aa:c1:66:42:5c:3b:0a:
                    c1:da:db:79:fd:b1:d7:36:cb:a1:e7:f0:88:27:02:
                    2f:74:fd:26:81:8a:82:42:e9:73:00:02:cd:55:2d:
                    15:14:9f:d2:9c:60:fa:7f:0b:88:6b:24:79:ab:d1:
                    f6:f1:dd:a0:74:60:3f:f0:eb:e5:c0:79:d0:f7:dc:
                    b2:a5
                Exponent: 65537 (0x10001)
        X509v3 extensions:
            X509v3 Key Usage: critical
                Digital Signature, Key Encipherment
            X509v3 Extended Key Usage: 
                TLS Web Server Authentication
            X509v3 Basic Constraints: critical
                CA:FALSE
            X509v3 Subject Alternative Name: 
                DNS:docker-registry-default.gitops.kmt.orange.com, DNS:docker-registry.default.svc, DNS:docker-registry.default.svc.cluster.local, DNS:172.30.95.44, IP Address:172.30.95.44
    Signature Algorithm: sha256WithRSAEncryption
         5c:ac:6d:d8:b1:df:f8:0f:d2:3f:76:26:1b:94:97:38:ad:10:
         92:c6:2e:f1:5e:e1:fc:d1:2c:ce:59:fd:a3:0e:57:58:12:b8:
         2c:b4:ee:bc:36:86:95:4b:46:f1:7e:ff:12:a1:53:dd:85:1c:
         bc:3c:27:8b:0b:e7:ff:cc:b0:d7:7e:b1:9f:9a:c0:fc:47:4f:
         4e:e9:f0:51:ba:1e:fb:c5:76:49:7a:fa:3d:ff:36:4a:79:79:
         59:0e:8f:54:90:08:7e:f1:7a:f4:9e:96:67:72:82:95:08:c6:
         93:80:f0:f2:d6:65:cf:59:82:94:f0:13:de:a1:fc:1e:0e:f4:
         dd:15:59:4a:12:99:20:dd:6c:25:ed:af:49:ab:a0:f2:cf:f3:
         a9:be:2d:7d:3f:6b:75:d5:d9:50:9d:a9:8a:62:79:82:64:9a:
         63:36:4d:86:79:12:e1:0b:e7:ca:80:af:84:41:be:20:b5:50:
         dc:6b:1d:ac:c8:38:58:c0:35:16:10:41:59:c4:20:a5:c5:bd:
         1e:79:9b:42:8f:da:52:06:38:3a:95:8a:58:5d:84:d9:fb:08:
         e9:e8:fa:66:d2:6c:2a:1e:6c:08:d9:84:ce:e4:cc:1c:fc:c2:
         2f:95:24:c7:46:97:5b:48:2b:da:c8:e7:9c:c0:bb:bd:66:03:
         38:17:50:48
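
The Subject Alternative Name line in the dump above can be checked against the names the build actually uses. The sketch below is only an illustration: `parse_san` and `covers` are hypothetical helpers, and wildcard entries are not handled. It suggests that a name mismatch is unlikely to be the cause here.

```python
import ipaddress

# SAN line copied from the certificate dump above.
SAN_LINE = (
    "DNS:docker-registry-default.gitops.kmt.orange.com, "
    "DNS:docker-registry.default.svc, "
    "DNS:docker-registry.default.svc.cluster.local, "
    "DNS:172.30.95.44, IP Address:172.30.95.44"
)


def parse_san(san_line: str) -> dict:
    """Group SAN entries by type, e.g. {'DNS': [...], 'IP Address': [...]}."""
    entries = {}
    for item in san_line.split(","):
        kind, _, value = item.strip().partition(":")
        entries.setdefault(kind, []).append(value)
    return entries


def covers(san: dict, name: str) -> bool:
    """Exact-match check only; wildcard SAN entries are not evaluated."""
    try:
        ipaddress.ip_address(name)
        kind = "IP Address"
    except ValueError:
        kind = "DNS"
    return name in san.get(kind, [])


san = parse_san(SAN_LINE)
print(covers(san, "docker-registry.default.svc"))  # name used by the build
print(covers(san, "172.30.95.44"))                 # cluster IP of the service
```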

Any ideas?

lifecycle/rotten

Source

Sispheor

All 15 comments

What's the output of

oc logs docker-registry-1-94xbh -n default

But I think your cluster has more general issues. Do you see any errors in your cluster console (log in to the Web Console, select "Cluster Console" at the top, then Home -> Status)?
Can you pull images through the external registry URL?

niiku on 19 Apr 2019

It seems fine in this pod. I restarted it to capture the startup logs:

oc logs docker-registry-5-q7tpg
time="2019-04-19T07:13:04Z" level=warning msg="Ignoring unrecognized environment variable REGISTRY_CONSOLE_PORT" 
time="2019-04-19T07:13:04Z" level=warning msg="Ignoring unrecognized environment variable REGISTRY_CONSOLE_PORT_9000_TCP" 
time="2019-04-19T07:13:04Z" level=warning msg="Ignoring unrecognized environment variable REGISTRY_CONSOLE_PORT_9000_TCP_ADDR" 
time="2019-04-19T07:13:04Z" level=warning msg="Ignoring unrecognized environment variable REGISTRY_CONSOLE_PORT_9000_TCP_PORT" 
time="2019-04-19T07:13:04Z" level=warning msg="Ignoring unrecognized environment variable REGISTRY_CONSOLE_PORT_9000_TCP_PROTO" 
time="2019-04-19T07:13:04Z" level=warning msg="Ignoring unrecognized environment variable REGISTRY_CONSOLE_SERVICE_HOST" 
time="2019-04-19T07:13:04Z" level=warning msg="Ignoring unrecognized environment variable REGISTRY_CONSOLE_SERVICE_PORT" 
time="2019-04-19T07:13:04Z" level=warning msg="Ignoring unrecognized environment variable REGISTRY_CONSOLE_SERVICE_PORT_REGISTRY_CONSOLE" 
time="2019-04-19T07:13:04Z" level=info msg="DEPRECATED: \"OPENSHIFT_DEFAULT_REGISTRY\" is deprecated, use the 'REGISTRY_OPENSHIFT_SERVER_ADDR' instead" 
time="2019-04-19T07:13:04.349538519Z" level=info msg="start registry" distribution_version=v2.6.2+unknown go.version=go1.10.3 instance.id=2ef79f83-a7d9-4d64-b093-0677f8fdbf54 openshift_version=v3.11.0+9cca740-6 
time="2019-04-19T07:13:04.349978377Z" level=info msg="quota enforcement disabled" go.version=go1.10.3 instance.id=2ef79f83-a7d9-4d64-b093-0677f8fdbf54 
time="2019-04-19T07:13:04.351276327Z" level=info msg="redis not configured" go.version=go1.10.3 instance.id=2ef79f83-a7d9-4d64-b093-0677f8fdbf54 
time="2019-04-19T07:13:04.351427578Z" level=info msg="Starting upload purge in 20m0s" go.version=go1.10.3 instance.id=2ef79f83-a7d9-4d64-b093-0677f8fdbf54 
time="2019-04-19T07:13:04.369218263Z" level=info msg="using openshift blob descriptor cache" go.version=go1.10.3 instance.id=2ef79f83-a7d9-4d64-b093-0677f8fdbf54 
time="2019-04-19T07:13:04.370089552Z" level=info msg="Using \"docker-registry.default.svc:5000\" as Docker Registry URL" go.version=go1.10.3 instance.id=2ef79f83-a7d9-4d64-b093-0677f8fdbf54 
time="2019-04-19T07:13:04.370119225Z" level=info msg="listening on :5000, tls" go.version=go1.10.3 instance.id=2ef79f83-a7d9-4d64-b093-0677f8fdbf54 
time="2019-04-19T07:13:05.806542997Z" level=info msg=response go.version=go1.10.3 http.request.host="10.131.0.23:5000" http.request.id=2f3aab3d-74f8-4cc7-8889-ed8c0b007bde http.request.method=GET http.request.remoteaddr="10.131.0.1:46798" http.request.uri=/healthz http.request.useragent=kube-probe/1.11+ http.response.duration="68.295µs" http.response.status=200 http.response.written=0 instance.id=2ef79f83-a7d9-4d64-b093-0677f8fdbf54 
time="2019-04-19T07:13:15.799853427Z" level=info msg=response go.version=go1.10.3 http.request.host="10.131.0.23:5000" http.request.id=c2dfa0dd-d514-4ed8-952b-7e18fd6a912f http.request.method=GET http.request.remoteaddr="10.131.0.1:46874" http.request.uri=/healthz http.request.useragent=kube-probe/1.11+ http.response.duration="65.096µs" http.response.status=200 http.response.written=0 instance.id=2ef79f83-a7d9-4d64-b093-0677f8fdbf54 
time="2019-04-19T07:13:20.280488902Z" level=info msg=response go.version=go1.10.3 http.request.host="10.131.0.23:5000" http.request.id=117d61a5-d170-4a89-b044-c97d59d4a1ce http.request.method=GET http.request.remoteaddr="10.131.0.1:46908" http.request.uri=/healthz http.request.useragent=kube-probe/1.11+ http.response.duration="54.55µs" http.response.status=200 http.response.written=0 instance.id=2ef79f83-a7d9-4d64-b093-0677f8fdbf54

We can see "listening on :5000, tls", so TLS should be ok.
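
The registry log lines are in key=value ("logfmt"-style) form, so the check described above can be done programmatically. This is a rough sketch using a hypothetical `parse_logfmt` helper; it relies on `shlex` to handle the quoted values and is not a complete logfmt parser.

```python
import shlex

# Line copied from the registry log above.
LISTEN_LINE = (
    'time="2019-04-19T07:13:04.370119225Z" level=info '
    'msg="listening on :5000, tls" go.version=go1.10.3 '
    'instance.id=2ef79f83-a7d9-4d64-b093-0677f8fdbf54'
)


def parse_logfmt(line: str) -> dict:
    """Split a key=value log line into a dict, honouring double-quoted values."""
    fields = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


def listens_with_tls(lines) -> bool:
    """True if any line reports a listener whose message ends in 'tls'."""
    for line in lines:
        msg = parse_logfmt(line).get("msg", "")
        if msg.startswith("listening on") and msg.endswith("tls"):
            return True
    return False


print(listens_with_tls([LISTEN_LINE]))
```

Note that this only confirms the registry believes it is serving TLS; it says nothing about whether packets reach it across the SDN.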

Sispheor on 19 Apr 2019

I can access the registry from my workstation through the public route:

oc get route
NAME               HOST/PORT                                        PATH      SERVICES           PORT      TERMINATION   WILDCARD
docker-registry    docker-registry-default.gitops.kmt.orange.com              docker-registry    <all>     passthrough   None
openshift-master   openshift.gitops.kmt.orange.com                            openshift-master   443       passthrough   None
registry-console   registry-console-default.gitops.kmt.orange.com             registry-console   <all>     passthrough   None

docker login docker-registry-default.gitops.kmt.orange.com -u nico -p my_token
WARNING! Using --password via the CLI is insecure. Use --password-stdin.
WARNING! Your password will be stored unencrypted in /home/nico/.docker/config.json.
Configure a credential helper to remove this warning. See
https://docs.docker.com/engine/reference/commandline/login/#credentials-store

Login Succeeded

My issue seems to be the same as this one: https://bugzilla.redhat.com/show_bug.cgi?id=1641134

Sispheor on 19 Apr 2019

We experienced the same issue in our environment and it was an MTU problem. We'll probably be filing another issue for this, but basically our env uses 1450 (instead of 1500) for MTU, and in that case the OpenShift install should adjust the SDN MTU to 1400. This has worked in the past, but recently it didn't (i.e. it left everything at 1450) and we experienced this issue. After manually adjusting our nodes to use 1400, it worked fine again.

oybed on 10 May 2019

👍3

I had an OpenShift setup running on OpenStack VMs with this same problem, and adjusting the MTU fixed it. The MTU settings were 1400 for eth0, 1500 for docker0 and 1450 for tun0. Changing docker0 to 1300 and tun0 to 1350 solved my issue.
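
The numbers in the two comments above follow the same arithmetic: the SDN MTU has to leave room for the overlay encapsulation on top of the node interface MTU. The sketch below assumes a VXLAN overhead of 50 bytes, which matches the 1450 to 1400 adjustment mentioned above; the function names are hypothetical.

```python
VXLAN_OVERHEAD = 50  # bytes; assumed encapsulation overhead of the SDN overlay


def max_sdn_mtu(nic_mtu: int, overhead: int = VXLAN_OVERHEAD) -> int:
    """Largest SDN (tun0) MTU that still fits inside the node's NIC MTU."""
    return nic_mtu - overhead


def sdn_mtu_ok(nic_mtu: int, sdn_mtu: int, overhead: int = VXLAN_OVERHEAD) -> bool:
    """True if encapsulated SDN packets fit within the NIC MTU."""
    return sdn_mtu <= max_sdn_mtu(nic_mtu, overhead)


print(max_sdn_mtu(1450))       # environment with a 1450-byte NIC MTU
print(sdn_mtu_ok(1450, 1450))  # SDN MTU left at 1450 on a 1450 NIC
print(sdn_mtu_ok(1400, 1450))  # eth0 1400 with tun0 1450
print(sdn_mtu_ok(1400, 1350))  # eth0 1400 with tun0 1350
```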

Rikeard on 14 May 2019

I just got the same issue with OKD 3.11, and changing the MTU did not work. When I log in to the docker registry container, I am not able to see the registry structure registry/v2.

Any suggestions, please?

deevya2000 on 3 Jun 2019

How do I change the MTU of tun0, please?

woland7 on 4 Jun 2019

@woland7 there are multiple ways you can do this. During install, you can set the openshift_node_sdn_mtu inventory variable or set the node configmap values for MTU. Post install, you can change the configmap MTU values for your nodes (including any infra and/or master nodes) and reboot the nodes.

> oc get cm -n openshift-node node-config-compute -o yaml | grep -B1 mtu
    networkConfig:
      mtu: 1400
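
If the configmap YAML is saved to a file or captured as text, the configured value can be pulled out for comparison across node groups. This is a small sketch with a hypothetical `extract_mtu` helper; it uses a regular expression rather than a YAML parser and returns the first `mtu:` entry it finds.

```python
import re
from typing import Optional


def extract_mtu(node_config_text: str) -> Optional[int]:
    """Return the first 'mtu: <n>' value found in the text, or None."""
    match = re.search(r"^\s*mtu:\s*(\d+)\s*$", node_config_text, re.MULTILINE)
    return int(match.group(1)) if match else None


# Fragment matching the grep output shown above.
sample = "    networkConfig:\n      mtu: 1400\n"
print(extract_mtu(sample))
```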

oybed on 5 Jun 2019

👍1

Thank you. I actually managed to change it, but I still encountered the same problem. For the moment, I have worked around it by downgrading to 3.9. Anyway, why would the MTU value break the TLS handshake?
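
One commonly cited explanation, which this thread does not confirm, is that small packets (the TCP handshake and the 289-byte ClientHello seen in the openssl output) fit through the overlay, while the server's full-size reply segments, which carry the certificate, exceed the underlay MTU once encapsulated and are dropped. The sketch below illustrates that arithmetic under stated assumptions: 40 bytes of IPv4 and TCP headers without options, 50 bytes of VXLAN overhead, and no fragmentation of encapsulated packets. The function name is hypothetical.

```python
IP_TCP_HEADERS = 40  # bytes; IPv4 + TCP headers without options (assumption)
VXLAN_OVERHEAD = 50  # bytes; overlay encapsulation overhead (assumption)


def delivered(tcp_payload: int, overlay_mtu: int, underlay_mtu: int) -> bool:
    """True if a segment with this payload fits through the underlay once encapsulated."""
    inner = tcp_payload + IP_TCP_HEADERS
    if inner > overlay_mtu:
        raise ValueError("segment exceeds the overlay MTU; TCP would not send it")
    return inner + VXLAN_OVERHEAD <= underlay_mtu


# Overlay and underlay both at 1450, as in the misconfigured environment.
print(delivered(289, 1450, 1450))   # ClientHello-sized payload
print(delivered(1410, 1450, 1450))  # full-size segment of the server's reply
# Overlay lowered to 1400 on a 1450 underlay.
print(delivered(1360, 1400, 1450))  # full-size segment after the fix
```

Under these assumptions the client sends its hello, never receives the server's reply, and eventually reports a handshake timeout.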

woland7 on 5 Jun 2019

@woland7 Can you please share the inventory file settings that you used for the 3.9 installation?

deevya2000 on 5 Jun 2019

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

openshift-bot on 3 Sep 2019

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

openshift-bot on 3 Oct 2019

/close

woland7 on 3 Oct 2019

@woland7: Closing this issue.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.