kops: error updating from kops 1.5.3 to 1.6

Created on 19 May 2017 · 10 comments · Source: kubernetes/kops

Tried to update our cluster (3 masters, 3 nodes, no RBAC) running Weave from 1.5.3 to 1.6 using the drain & validate flag. We hadn't created the required ConfigMap, so the update failed after updating the first master.

Added the required ConfigMap and continued with the update. The next master seemed to complete fine, but the last master node wouldn't validate: Weave hadn't started on the node, and looking at the log there were a lot of errors like

    Unable to register node "ip-172-20-37-16.eu-west-1.compute.internal" with API server: Post https://127.0.0.1/api/v1/nodes: dial tcp 127.0.0.1:443: getsockopt: connection refused

Deleting that master resulted in the same errors when it came back up. Deleting the second master made it fail with the same error as well.


Most helpful comment

It seems the weave-net DaemonSet didn't get updated. After adding the tolerations by editing the DaemonSet, all masters became Ready. This seems to be related to #2366.

All 10 comments

We need the logs from the scheduler and the controller that are active.

The kube-scheduler log is empty; kube-controller log:

Is there anything we can try, or do we need to re-create our cluster?

It seems the weave-net DaemonSet didn't get updated. After adding the tolerations by editing the DaemonSet, all masters became Ready. This seems to be related to #2366.
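For anyone hitting the same problem, a sketch of that workaround (the toleration key and effect are taken from the DaemonSet manifest posted further down; verify they match the taints on your masters):

```shell
# Open the live DaemonSet for editing and add the missing tolerations
# under spec.template.spec (sketch; adjust key/effect to your taints):
#
#       tolerations:
#       - key: node-role.kubernetes.io/master
#         effect: NoSchedule
#
kubectl -n kube-system edit daemonset weave-net
```

Because the DaemonSet uses the `OnDelete` update strategy, existing pods will not pick up the change until they are deleted and recreated.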

Thanks for the update. Will try to reproduce.

We are having a similar issue. I wonder: could this be a race condition in how the DaemonSet is updated?

If the DaemonSet replacement request goes to a master still running 1.5, where the tolerations attribute was not yet present, wouldn't the controller simply ignore that attribute?

I believe an example of this can be seen in our "untouched" upgraded DaemonSet. The critical thing to note is that the last-applied-configuration annotation shows the tolerations are present, yet they are missing from the actual spec of the DaemonSet.
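One way to see that mismatch (a sketch; assumes `jq` is installed on the machine running kubectl) is to compare the tolerations recorded in the annotation with what the live object actually carries:

```shell
# Tolerations that `kubectl apply` intended, as recorded in the annotation:
kubectl -n kube-system get ds weave-net -o json \
  | jq -r '.metadata.annotations["kubectl.kubernetes.io/last-applied-configuration"]' \
  | jq '.spec.template.spec.tolerations'

# Tolerations actually present on the live DaemonSet spec
# (null in the broken state described above):
kubectl -n kube-system get ds weave-net -o json \
  | jq '.spec.template.spec.tolerations'
```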

Would it be possible, or prudent, to make the tolerations redundant by keeping the annotation in conjunction with the explicit tolerations?

apiVersion: v1
items:
- apiVersion: extensions/v1beta1
  kind: DaemonSet
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"extensions/v1beta1","kind":"DaemonSet","metadata":{"annotations":{},"labels":{"name":"weave-net","role.kubernetes.io/networking":"1"},"name":"weave-net","namespace":"kube-system"},"spec":{"template":{"metadata":{"labels":{"name":"weave-net","role.kubernetes.io/networking":"1"}},"spec":{"containers":[{"command":["/home/weave/launch.sh"],"image":"weaveworks/weave-kube:1.9.4","livenessProbe":{"httpGet":{"host":"127.0.0.1","path":"/status","port":6784},"initialDelaySeconds":30},"name":"weave","resources":{"limits":{"cpu":"100m","memory":"200Mi"},"requests":{"cpu":"100m","memory":"200Mi"}},"securityContext":{"privileged":true},"volumeMounts":[{"mountPath":"/weavedb","name":"weavedb"},{"mountPath":"/host/opt","name":"cni-bin"},{"mountPath":"/host/home","name":"cni-bin2"},{"mountPath":"/host/etc","name":"cni-conf"},{"mountPath":"/host/var/lib/dbus","name":"dbus"},{"mountPath":"/lib/modules","name":"lib-modules"}]},{"image":"weaveworks/weave-npc:1.9.4","name":"weave-npc","resources":{"limits":{"cpu":"100m","memory":"200Mi"},"requests":{"cpu":"100m","memory":"200Mi"}},"securityContext":{"privileged":true}}],"hostNetwork":true,"hostPID":true,"restartPolicy":"Always","securityContext":{"seLinuxOptions":{"type":"spc_t"}},"serviceAccountName":"weave-net","tolerations":[{"effect":"NoSchedule","key":"node-role.kubernetes.io/master"}],"volumes":[{"emptyDir":{},"name":"weavedb"},{"hostPath":{"path":"/opt"},"name":"cni-bin"},{"hostPath":{"path":"/home"},"name":"cni-bin2"},{"hostPath":{"path":"/etc"},"name":"cni-conf"},{"hostPath":{"path":"/var/lib/dbus"},"name":"dbus"},{"hostPath":{"path":"/lib/modules"},"name":"lib-modules"}]}}}}
    creationTimestamp: 2017-05-30T14:51:23Z
    generation: 3
    labels:
      name: weave-net
      role.kubernetes.io/networking: "1"
    name: weave-net
    namespace: kube-system
    resourceVersion: "356995"
    selfLink: /apis/extensions/v1beta1/namespaces/kube-system/daemonsets/weave-net
    uid: 735e25fb-4547-11e7-b4c9-123be1737864
  spec:
    selector:
      matchLabels:
        name: weave-net
        role.kubernetes.io/networking: "1"
    template:
      metadata:
        creationTimestamp: null
        labels:
          name: weave-net
          role.kubernetes.io/networking: "1"
      spec:
        containers:
        - command:
          - /home/weave/launch.sh
          image: weaveworks/weave-kube:1.9.4
          imagePullPolicy: IfNotPresent
          livenessProbe:
            failureThreshold: 3
            httpGet:
              host: 127.0.0.1
              path: /status
              port: 6784
              scheme: HTTP
            initialDelaySeconds: 30
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 1
          name: weave
          resources:
            limits:
              cpu: 100m
              memory: 200Mi
            requests:
              cpu: 100m
              memory: 200Mi
          securityContext:
            privileged: true
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
          - mountPath: /weavedb
            name: weavedb
          - mountPath: /host/opt
            name: cni-bin
          - mountPath: /host/home
            name: cni-bin2
          - mountPath: /host/etc
            name: cni-conf
          - mountPath: /host/var/lib/dbus
            name: dbus
          - mountPath: /lib/modules
            name: lib-modules
        - image: weaveworks/weave-npc:1.9.4
          imagePullPolicy: IfNotPresent
          name: weave-npc
          resources:
            limits:
              cpu: 100m
              memory: 200Mi
            requests:
              cpu: 100m
              memory: 200Mi
          securityContext:
            privileged: true
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
        dnsPolicy: ClusterFirst
        hostNetwork: true
        hostPID: true
        restartPolicy: Always
        schedulerName: default-scheduler
        securityContext:
          seLinuxOptions:
            type: spc_t
        serviceAccount: weave-net
        serviceAccountName: weave-net
        terminationGracePeriodSeconds: 30
        volumes:
        - emptyDir: {}
          name: weavedb
        - hostPath:
            path: /opt
          name: cni-bin
        - hostPath:
            path: /home
          name: cni-bin2
        - hostPath:
            path: /etc
          name: cni-conf
        - hostPath:
            path: /var/lib/dbus
          name: dbus
        - hostPath:
            path: /lib/modules
          name: lib-modules
    updateStrategy:
      type: OnDelete
  status:
    currentNumberScheduled: 7
    desiredNumberScheduled: 7
    numberAvailable: 7
    numberMisscheduled: 0
    numberReady: 7
    observedGeneration: 3
    updatedNumberScheduled: 2
kind: List
metadata: {}
resourceVersion: ""
selfLink: ""
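Note the `updateStrategy: type: OnDelete` near the end of the spec above, together with `updatedNumberScheduled: 2` out of 7 in the status: with that strategy the controller only rolls a pod forward when it is deleted, so even after the template is fixed the old pods keep running without the tolerations. A sketch of forcing the rollout (assumes the `name=weave-net` pod label from the manifest above):

```shell
# With OnDelete, existing pods are not replaced automatically;
# delete them so the DaemonSet recreates them from the updated template.
kubectl -n kube-system delete pods -l name=weave-net
```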

Digging in deeper, this seems to be somewhat similar to https://github.com/kubernetes/kubernetes/issues/46073

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle stale

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle rotten
/remove-lifecycle stale

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close
