Kops: Using kube2iam can prevent master node cycling

Created on 17 Nov 2016 · 11 comments · Source: kubernetes/kops

The default deployment spec for kube2iam is a DaemonSet with a flag that modifies iptables on the host:

apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: kube2iam
  labels:
    app: kube2iam
spec:
  template:
    metadata:
      labels:
        name: kube2iam
    spec:
      hostNetwork: true
      containers:
        - image: jtblin/kube2iam:latest
          name: kube2iam
          args:
            - "--base-role-arn=arn:aws:iam::123456789012:role/"
            - "--iptables=true"
            - "--host-ip=$(HOST_IP)"
          env:
            - name: HOST_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
          ports:
            - containerPort: 8181
              hostPort: 8181
              name: http
          securityContext:
            privileged: true

Once running on an instance, kube2iam adds an iptables rule so that containers on that host cannot reach 169.254.169.254 directly and must instead go through kube2iam to obtain their roles. In general this works well, but a problem occurs when you need to cycle your master nodes in kops: the masters manage Route53 entries through dns-controller, which does not have a kube2iam role associated with it.
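
For context, workloads that do want credentials opt in through a pod annotation naming the role kube2iam should assume for them; a minimal sketch (the pod and role names are illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: example-app
  annotations:
    # Role kube2iam assumes on this pod's behalf when it calls the metadata API;
    # the name is illustrative, and with --base-role-arn set (as above) only the
    # role name is needed rather than the full ARN
    iam.amazonaws.com/role: example-app-role
spec:
  containers:
    - name: example-app
      image: example/app:latest

dns-controller carries no such annotation, which is why the interception breaks it during a master rotation.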

The end result is that Route53 keeps the previous DNS entries for the API address, though the etcd entries are updated since those don't go through dns-controller. Even stopping kube2iam temporarily doesn't help, because the iptables rule has already been written; the removal would need to happen before the new master nodes come back up.

Workarounds:

  • Delete kube2iam before the new master nodes come up
  • Constrain kube2iam to run on nodes only, so dns-controller on the masters keeps direct access to the metadata API (needs some work that kops could help with there); see the nodeSelector sketch below
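
One way to express that constraint, assuming the kubernetes.io/role labels that kops puts on its instances (a sketch against those labels, not a tested manifest), is a nodeSelector in spec.template.spec of the DaemonSet above:

      # Keep kube2iam off the masters; assumes the kops-style node labels where
      # masters carry kubernetes.io/role=master and workers kubernetes.io/role=node
      nodeSelector:
        kubernetes.io/role: node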

All 11 comments

I am confused as to what this issue is actually about... Please add more context.

@chrislovecnm I've added more information that hopefully clarifies the issue. @justinsb also has some other ideas about longer-term solutions for this as a whole -- bringing the kube2iam functionality into kops.

@chrislovecnm I asked jaygorrell to open this. We don't play well with kube2iam on the master because of dns-controller, but kube2iam works well and we should probably have an example manifest that e.g. excludes dns-controller

Any update on this?

I've specified tolerations for the DaemonSet in spec.template.spec to prevent kube2iam from being scheduled on master nodes.

      tolerations:
        - key: "kubernetes.io/role"
          operator: "Equal"
          value: "master"
          effect: "NoSchedule"

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle stale

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle rotten
/remove-lifecycle stale

/remove-lifecycle rotten

Looking into adding kube2iam to my cluster, but this seems like a less-than-ideal scenario. I don't think there is a way to add any annotations to the dns-controller deployment to give it access to R53 through kube2iam?
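
For illustration only, such an annotation would sit on the dns-controller pod template and name a role that has Route53 access and that the nodes' base role is allowed to assume; the role name below is hypothetical, and whether this can realistically be applied to the kops-managed dns-controller deployment is exactly the open question here:

  # Hypothetical fragment of the dns-controller Deployment's pod template; the
  # role name is made up, and the role would need Route53 permissions plus a
  # trust policy allowing the kube2iam base role on the instances to assume it
  template:
    metadata:
      annotations:
        iam.amazonaws.com/role: dns-controller-route53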

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
/remove-lifecycle stale

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close
