Origin: Latest docker version breaks kubelet in 1.5.0-alpha.3

Created on 7 Mar 2017 · 43 Comments · Source: openshift/origin

20s       20s       1         router-1-deploy           Pod                 Warning   FailedSync         {kubelet 10.34.129.45}   Error syncing pod, skipping: failed to "StartContainer" for "POD" with RunContainerError: "Failed to check docker api version: docker: failed to parse docker version \"17.03.0-ce\": illegal zero-prefixed version component \"03\" in \"17.03.0-ce\""

@csrwng @jimmidyson
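For readers unfamiliar with why "03" is rejected: Semantic Versioning forbids leading zeros in numeric components, and Docker's date-based "17.03.0-ce" string violates that rule. The sketch below is only an illustration of that kind of strict check, not the kubelet's actual code; the function name `parse_strict` is hypothetical, and the error wording is modelled on the message quoted above.

```python
import re


def parse_strict(version: str) -> tuple:
    """Parse MAJOR.MINOR.PATCH, rejecting zero-prefixed components.

    Hypothetical helper for illustration; it mimics the SemVer rule that
    numeric identifiers must not have leading zeros.
    """
    # Drop any pre-release or build suffix such as "-ce" before splitting.
    core = re.split(r"[-+]", version, maxsplit=1)[0]
    parts = core.split(".")
    if len(parts) != 3:
        raise ValueError(f"expected MAJOR.MINOR.PATCH in {version!r}")
    for part in parts:
        if not re.fullmatch(r"[0-9]+", part):
            raise ValueError(
                f"non-numeric version component {part!r} in {version!r}"
            )
        if len(part) > 1 and part.startswith("0"):
            raise ValueError(
                f"illegal zero-prefixed version component {part!r} in {version!r}"
            )
    return tuple(int(part) for part in parts)
```

Under these assumptions, an older string such as "1.12.6" parses cleanly, while "17.03.0-ce" raises on the "03" component.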

component/kubernetes kind/bug priority/P2

All 43 comments

Running on Fedora 25

I tried with the latest origin master and I get the same error (on fedora 25). However, I don't see the error on Docker for Mac.

I am seeing the issue with Docker for Mac 17.03.0-ce, commit 60ccb22.

oc version oc v1.4.1+3f9807a (latest from Homebrew)

@justinclayton are you not able to run pods at all?

Sorry, I'm in the wrong issue. I'm trying to run oc cluster up. Looks like #13284 is what I'm looking for.

@justinclayton you'll need an oc binary from the origin master

On the latest Docker for Mac I'm experiencing issues running oc cluster up --metrics. The pods do not come up.
Any suggestions on what version of openshift cli I should use for this to work?

@zmhassan what are your versions of docker and oc, and do you see any events in the openshift-infra namespace about the metrics pods?

@csrwng

zhassan:~ zhassan$ oc cluster up --metrics
-- Checking OpenShift client ... OK
-- Checking Docker client ... OK
-- Checking Docker version ... FAIL
   Error: Minor number must not contain leading zeroes "03"
zhassan:~ zhassan$ oc version
oc v1.5.0-alpha.3+cf7e336
kubernetes v1.5.2+43a9be4
features: Basic-Auth
Unable to connect to the server: EOF
zhassan:~ zhassan$ docker version
Client:
 Version:      17.03.0-ce
 API version:  1.26
 Go version:   go1.7.5
 Git commit:   60ccb22
 Built:        Thu Feb 23 10:40:59 2017
 OS/Arch:      darwin/amd64

Server:
 Version:      17.03.0-ce
 API version:  1.26 (minimum version 1.12)
 Go version:   go1.7.5
 Git commit:   3a232c8
 Built:        Tue Feb 28 07:52:04 2017
 OS/Arch:      linux/amd64
 Experimental: true

@zmhassan you need the latest origin master. If you somehow have access to a cluster, you can build your own binary using https://github.com/csrwng/build-origin

Hi @csrwng I could build from source as I already have all the openshift source. The step you're suggesting is only going to create a binary that I can easily download from github or build myself.

So if I build the latest it should work, is that what you're saying?

yes

Nope doesn't work.

zhassan:kubernetes zhassan$ oc get pods
NAME                            READY     STATUS    RESTARTS   AGE
docker-registry-1-deploy        1/1       Running   0          54s
docker-registry-1-dpcwn         0/1       Running   0          28s
persistent-volume-setup-vh3pf   1/1       Running   0          54s
router-1-deploy                 1/1       Running   0          54s
router-1-ndsh6                  0/1       Running   0          29s
zhassan:kubernetes zhassan$ oc project openshift-infra
Now using project "openshift-infra" on server "https://127.0.0.1:8443".
zhassan:kubernetes zhassan$ oc get pods
NAME                         READY     STATUS    RESTARTS   AGE
metrics-deployer-pod-5ctbd   1/1       Running   0          15s
metrics-deployer-pod-lctjk   0/1       Error     0          1m
zhassan:kubernetes zhassan$ oc describe pod metrics-deployer-pod-lctjk
Name:           metrics-deployer-pod-lctjk
Namespace:      openshift-infra
Security Policy:    restricted
Node:           192.168.65.2/192.168.65.2
Start Time:     Wed, 08 Mar 2017 14:31:06 -0500
Labels:         controller-uid=c4d30a9d-0435-11e7-b875-c68515d70515
            job-name=metrics-deployer-pod
Status:         Failed
IP:         172.17.0.5
Controllers:        Job/metrics-deployer-pod
Containers:
  deployer:
    Container ID:   docker://046254a8829ecd5580eaa8430507bd1ecec285b2d5a9d62377e856907962720b
    Image:      openshift/origin-metrics-deployer:v1.5.0-alpha.3
    Image ID:       docker-pullable://openshift/origin-metrics-deployer@sha256:e3ae4eab3002fa532a30c3dcdfc320dd0a588e4adbc59896139cc1bff4c4f06e
    Port:
    State:      Terminated
      Reason:       Error
      Exit Code:    255
      Started:      Wed, 08 Mar 2017 14:31:47 -0500
      Finished:     Wed, 08 Mar 2017 14:31:57 -0500
    Ready:      False
    Restart Count:  0
    Volume Mounts:
      /etc/deploy from empty (rw)
      /secret from secret (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from metrics-deployer-token-cpn75 (ro)
    Environment Variables:
      PROJECT:              openshift-infra (v1:metadata.namespace)
      POD_NAME:             metrics-deployer-pod-lctjk (v1:metadata.name)
      IMAGE_PREFIX:         openshift/origin-
      IMAGE_VERSION:            v1.5.0-alpha.3
      MASTER_URL:           https://kubernetes.default.svc:443
      HAWKULAR_METRICS_HOSTNAME:    metrics-openshift-infra.127.0.0.1.nip.io
      MODE:             deploy
      REDEPLOY:             false
      IGNORE_PREFLIGHT:         false
      USE_PERSISTENT_STORAGE:       true
      CASSANDRA_NODES:          1
      CASSANDRA_PV_SIZE:        10Gi
      METRIC_DURATION:          7
      HEAPSTER_NODE_ID:         nodename
      METRIC_RESOLUTION:        10s
Conditions:
  Type      Status
  Initialized   True
  Ready     False
  PodScheduled  True
Volumes:
  empty:
    Type:   EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
  secret:
    Type:   Secret (a volume populated by a Secret)
    SecretName: metrics-deployer
  metrics-deployer-token-cpn75:
    Type:   Secret (a volume populated by a Secret)
    SecretName: metrics-deployer-token-cpn75
QoS Class:  BestEffort
Tolerations:    <none>
Events:
  FirstSeen LastSeen    Count   From            SubObjectPath   Type        Reason      Message
  --------- --------    -----   ----            -------------   --------    ------      -------
  1m        1m      1   {default-scheduler }            Normal      Scheduled   Successfully assigned metrics-deployer-pod-lctjk to 192.168.65.2
  1m        1m      1   {kubelet 192.168.65.2}          Warning     FailedSync  Error syncing pod, skipping: failed to "StartContainer" for "POD" with RunContainerError: "Failed to check docker api version: docker: failed to parse docker version \"17.03.0-ce\": illegal zero-prefixed version component \"03\" in \"17.03.0-ce\""

  1m    1m  1   {kubelet 192.168.65.2}  spec.containers{deployer}   Normal  Pulling     pulling image "openshift/origin-metrics-deployer:v1.5.0-alpha.3"
  49s   49s 1   {kubelet 192.168.65.2}  spec.containers{deployer}   Normal  Pulled      Successfully pulled image "openshift/origin-metrics-deployer:v1.5.0-alpha.3"
  48s   48s 1   {kubelet 192.168.65.2}  spec.containers{deployer}   Normal  Created     Created container with docker id 046254a8829e; Security:[seccomp=unconfined]
  47s   47s 1   {kubelet 192.168.65.2}  spec.containers{deployer}   Normal  Started     Started container with docker id 046254a8829e
  47s   47s 1   {kubelet 192.168.65.2}                  Warning FailedSync  Error syncing pod, skipping: failed to "StartContainer" for "deployer" with RunContainerError: "Failed to check docker api version: docker: failed to parse docker version \"17.03.0-ce\": illegal zero-prefixed version component \"03\" in \"17.03.0-ce\""

  35s   35s 1   {kubelet 192.168.65.2}  spec.containers{deployer}   Normal  Killing Killing container with docker id 046254a8829e: Need to kill pod.

@zmhassan what version of the images are you using ? (Are you specifying --version=blah with oc cluster up)

@zmhassan so I tried locally and I did see the errors but eventually pods ran and metrics came up.

It looks like something changed in the kubelet recently that broke things. Running cluster up with --version=v1.5.0-alpha.3 logs the message about the version parsing failing, but the pods eventually do start. With --version=v1.5.0-rc.0, the pods don't start anymore.

/cc @derekwaynecarr

When I try oc cluster up --version=v1.5.0-alpha.3 and create a project, the build fails immediately with

error: cannot connect to the server: open /var/run/secrets/kubernetes.io/serviceaccount/token: no such file or directory

I then tried the v1.5.0-alpha.3 client tools, but oc cluster up --version=v1.5.0-alpha.3 fails immediately with

-- Checking OpenShift client ... OK
-- Checking Docker client ... OK
-- Checking Docker version ... FAIL
Error: Minor number must not contain leading zeroes "03"

Looks like I'm stuck. I was trying to use origin for an internal class in a week. Is there a way to install an older docker engine on CentOS?

I'm still having issues and I'm not seeing metrics show up at all.

@mbechauf I believe the default centos yum repo has docker 1.12.x

@zmhassan are you getting any pods to start at all? Are you using --version=v1.5.0-alpha.3 to start your cluster?

I managed to get it all working. Thank you.

@csrwng Couldn't find a yum repo since Docker CE is the only published version, but was able to install a 1.13 RPM from www.dockerproject.org. So far, it's working fine with v1.5.0-rc.0.

got a little further, but the issue is now during build:
Failed sync Error syncing pod, skipping: failed to "StartContainer" for "POD" with RunContainerError: "runContainer: docker: failed to parse docker version \"17.03.0-ce\": illegal zero-prefixed version component \"03\" in \"17.03.0-ce\""

@shveik is your registry up and running?

great question, please bear with me since it's my first day playing with this, and i am following instructions here: https://github.com/openshift/origin/blob/master/docs/cluster_up_down.md#macos-with-docker-for-mac

so, looks like the registry is running:
➜ ~ oc get svc docker-registry -n default
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
docker-registry 172.30.1.1 5000/TCP 16m

however, the login is not working, since OPENSHIFT_TOKEN is null:
➜ ~ OPENSHIFT_TOKEN=$(oc whoami -t)
error: no token is currently in use for this session

i do have an insecure registry configured for 172.30.0.0/16

@shveik did you specify --version=v1.5.0-alpha.3 with 'oc cluster up'? If not, try that first

If you are logged in as system:admin, you will not have a token, since you are logged in via certificate. If you need a token, you need to be a regular user.

All my attempts (even with previous Alpha versions) to get beyond the
Docker version issues have failed. I ended up installing an old Docker 1.13
version from https://yum.dockerproject.org/. With that, v1.5.0-rc.0 has
worked just fine. I hope it stays that way. Will need that system for an
intro class to OpenShift next week ...

Best,
Michael

@csrwng that was it, thank you. here is the output when running with --version; basically, the client and server versions differ:
➜ ~ oc version
oc v1.5.0-rc.0+49a4a7a
kubernetes v1.5.2+43a9be4
features: Basic-Auth

Server https://127.0.0.1:8443
openshift v1.5.0-alpha.3+cf7e336
kubernetes v1.5.2+43a9be4

and here is without version:
➜ ~ oc version
oc v1.5.0-rc.0+49a4a7a
kubernetes v1.5.2+43a9be4
features: Basic-Auth

Server https://127.0.0.1:8443
openshift v1.5.0-rc.0+49a4a7a
kubernetes v1.5.2+43a9be4

How can this be solved when using the server binary instead of oc cluster up?

I use ./openshift start and experience the same issue. any way to specify the version there?

@mbechauf on CentOS for installing Docker, use yum install docker -- you will have 1.12 available to you in normal streams and 1.13 in @epel-testing

I see a similar issue with current master:
openshift v3.6.0-alpha.0+611176d-121
kubernetes v1.5.2+43a9be4
etcd 3.1.0

Build is pending when I try to create a new ruby-ex app. See following message in Build -> Last Build -> Events tab:

Time Severity Reason Message
2:38:40 PM Warning Failed sync Error syncing pod, skipping: failed to "StartContainer" for "POD" with RunContainerError: "runContainer: docker: failed to parse docker version \"17.03.0-ce\": illegal zero-prefixed version component \"03\" in \"17.03.0-ce\""
18 times in the last 3 minutes

The kubelet is still not working with the new Docker version string. Easiest thing to do is to downgrade Docker.

Got stuck on the first tutorial.

OS: Fedora25

Version:

oc v3.6.0-alpha.1+7044e57-29
kubernetes v1.5.2+43a9be4
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://192.168.1.100:8443
openshift v3.6.0-alpha.1+7044e57-29
kubernetes v1.5.2+43a9be4

The error:

Error syncing pod, skipping: failed to "StartContainer" for "POD" with RunContainerError: "runContainer: docker: failed to parse docker version \"17.04.0-ce\": illegal zero-prefixed version component \"04\" in \"17.04.0-ce\""

Tutorial worked on:

oc v1.4.1+3f9807a
kubernetes v1.4.0+776c994
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://192.168.1.100:8443
openshift v1.4.1+3f9807a
kubernetes v1.4.0+776c994

Which docker version should I use to make it work, while we don't have the [final] working solution?

OpenShift Origin 1.5 should use Docker 1.12.

On macOS with the latest Docker for Mac, version 17.03.1-ce-mac5 (16048), I was able to run oc cluster up with the oc from openshift-origin-client-tools-v1.5.0, but I was unable to deploy an image and saw:

"runContainer: docker: failed to parse docker version \"17.03.1-ce\": illegal zero-prefixed version component \"03\" in \"17.03.1-ce\""

changing the oc cluster up command to name the alpha version mentioned earlier in this thread solved it:

./oc cluster up --version=v1.5.0-alpha.3   \
                --use-existing-config   \
                --host-data-dir=/oc_data \
                --metrics=true

The deployment just automatically ran and succeeded on the older version.

An alternative for now is also to downgrade to Docker 1.13.1 for Mac:
https://download.docker.com/mac/stable/1.13.1.15353/Docker.dmg

We will probably need to cherry-pick https://github.com/kubernetes/kubernetes/pull/44068 to solve the problem
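As a rough illustration of how a parser can tolerate strings like "17.03.0-ce", here is a minimal sketch of lenient, generic version parsing. It makes no claim about what the upstream PR actually changes; the names `parse_lenient` and `at_least` are hypothetical, and the "1.12" minimum is taken from the docker version output quoted earlier in the thread.

```python
import re


def parse_lenient(version: str) -> tuple:
    """Parse the leading dot-separated numbers, tolerating leading zeros.

    Hypothetical helper: anything after the numeric prefix (e.g. "-ce")
    is ignored.
    """
    match = re.match(r"[0-9]+(?:\.[0-9]+)*", version)
    if match is None:
        raise ValueError(f"no numeric version found in {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def at_least(version: str, minimum: str) -> bool:
    """Return True if version >= minimum, comparing component-wise."""
    a, b = parse_lenient(version), parse_lenient(minimum)
    # Pad the shorter tuple with zeros so "1.12" and "1.12.0" compare equal.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b
```

With this approach, "17.03.0-ce" is read as (17, 3, 0) instead of being rejected, so a minimum-version comparison can still be made.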

I'll pick it up once the upstream PR merges and the rebase lands

I've cherry-picked the upstream fix in attached PR.
