kops 1.8.0-alpha.1
kubernetes: 1.7.8
provider: AWS
To reproduce it, create a cluster and add an instance group with a maxPrice value.
Some instances do not register with the apiserver, while others work fine, so it seems to be some kind of race condition.
If you kill the kubelet process, everything works fine after it respawns.
The relevant part of the kubelet logs I could find is these lines repeated in a loop:
Oct 11 08:39:47 ip-172-20-104-75 kubelet[1281]: I1011 08:39:47.987315 1281 kubelet.go:1894] SyncLoop (DELETE, "api"): "kube-proxy-ip-172-20-104-75.eu-west-1.compute.internal_kube-system(b1664e35-ae5f-11e7-8c91-062a9a3fee2c)"
Oct 11 08:39:47 ip-172-20-104-75 kubelet[1281]: W1011 08:39:47.990681 1281 kubelet.go:1596] Deleting mirror pod "kube-proxy-ip-172-20-104-75.eu-west-1.compute.internal_kube-system(b1664e35-ae5f-11e7-8c91-062a9a3fee2c)" because it is outdated
Oct 11 08:39:47 ip-172-20-104-75 kubelet[1281]: I1011 08:39:47.990693 1281 mirror_client.go:85] Deleting a mirror pod "kube-proxy-ip-172-20-104-75.eu-west-1.compute.internal_kube-system"
Oct 11 08:39:47 ip-172-20-104-75 kubelet[1281]: I1011 08:39:47.994312 1281 kubelet.go:1888] SyncLoop (REMOVE, "api"): "kube-proxy-ip-172-20-104-75.eu-west-1.compute.internal_kube-system(b1664e35-ae5f-11e7-8c91-062a9a3fee2c)"
Oct 11 08:39:48 ip-172-20-104-75 kubelet[1281]: I1011 08:39:48.008283 1281 kubelet.go:1878] SyncLoop (ADD, "api"): "kube-proxy-ip-172-20-104-75.eu-west-1.compute.internal_kube-system(bd58ce30-ae5f-11e7-8c91-062a9a3fee2c)"
[...]
Cluster YAML:
kind: Cluster
metadata:
  creationTimestamp: 2017-10-11T08:54:21Z
  name: test
spec:
  api:
    loadBalancer:
      type: Public
  authorization:
    rbac: {}
  channel: stable
  cloudProvider: aws
  configBase: s3://kubernetes-artifacts/test
  dnsZone: test
  etcdClusters:
  - enableEtcdTLS: true
    etcdMembers:
    - instanceGroup: master-eu-west-1a
      name: a
    - instanceGroup: master-eu-west-1b
      name: b
    - instanceGroup: master-eu-west-1c
      name: c
    name: main
    version: 3.1.10
  - enableEtcdTLS: true
    etcdMembers:
    - instanceGroup: master-eu-west-1a
      name: a
    - instanceGroup: master-eu-west-1b
      name: b
    - instanceGroup: master-eu-west-1c
      name: c
    name: events
    version: 3.1.10
  iam:
    legacy: false
  kubeAPIServer:
    auditLogMaxAge: 10
    auditLogMaxBackups: 1
    auditLogMaxSize: 100
    auditLogPath: /var/log/kube-apiserver-audit.log
  kubelet:
    featureGates:
      ExperimentalCriticalPodAnnotation: "true"
  kubernetesApiAccess:
  - 0.0.0.0/0
  kubernetesVersion: 1.7.8
  masterInternalName: api.internal.test
  masterPublicName: api.test
  networkCIDR: 172.20.0.0/16
  networking:
    weave:
      mtu: 8912
  nonMasqueradeCIDR: 100.64.0.0/10
  sshAccess:
  - 0.0.0.0/0
  subnets:
  - cidr: 172.20.32.0/19
    name: eu-west-1a
    type: Private
    zone: eu-west-1a
  - cidr: 172.20.64.0/19
    name: eu-west-1b
    type: Private
    zone: eu-west-1b
  - cidr: 172.20.96.0/19
    name: eu-west-1c
    type: Private
    zone: eu-west-1c
  - cidr: 172.20.0.0/22
    name: utility-eu-west-1a
    type: Utility
    zone: eu-west-1a
  - cidr: 172.20.4.0/22
    name: utility-eu-west-1b
    type: Utility
    zone: eu-west-1b
  - cidr: 172.20.8.0/22
    name: utility-eu-west-1c
    type: Utility
    zone: eu-west-1c
  topology:
    bastion:
      bastionPublicName: bastion.test
    dns:
      type: Public
    masters: private
    nodes: private
Instance group YAML (spot requests):
apiVersion: kops/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: 2017-10-11T09:21:48Z
  labels:
    kops.k8s.io/cluster: test
  name: ephimeral
spec:
  image: kope.io/k8s-1.7-debian-jessie-amd64-hvm-ebs-2017-07-28
  machineType: m4.2xlarge
  maxSize: 7
  minSize: 3
  maxPrice: "0.15"
  role: Node
  subnets:
  - eu-west-1b
  - eu-west-1c
  - eu-west-1a
After some deeper analysis, it seems to be a race condition where kubelet starts before the AWS tags are available. This happens with spot instances, but could probably also happen with on-demand instances.
1321 tags.go:94] Tag "KubernetesCluster" nor "kubernetes.io/cluster/..." not found; Kubernetes may behave unexpectedly.
1321 tags.go:78] AWS cloud - no clusterID filtering applied for shared resources; do not run multiple clusters in this AZ.
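For reference, whether the tag has shown up yet can be checked directly from the node. A minimal sketch (not from the original report; it assumes the instance profile allows ec2:DescribeTags and that the awscli is installed):

#!/bin/bash
# Sketch: look up this instance's KubernetesCluster tag via the EC2 API.
INSTANCE_ID=$(curl -s -m 10 http://169.254.169.254/latest/meta-data/instance-id)
REGION=$(curl -s -m 10 http://169.254.169.254/latest/dynamic/instance-identity/document | grep region | awk -F\" '{print $4}')
aws ec2 describe-tags --region "$REGION" \
  --filters "Name=resource-id,Values=$INSTANCE_ID" "Name=key,Values=KubernetesCluster" \
  --output text
# If this prints nothing, the tag is not visible yet and kubelet starts without a cluster ID.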
Same for me after moving to the 2017-12-02 image (both 1.7 and 1.8): about 40% of launched spot instances are not added to the cluster.
I have this workaround in the cluster spec that seems to work:
hooks:
- manifest: |
    Type=oneshot
    ExecStart=/usr/bin/docker run --net host quay.io/sergioballesteros/check-aws-tags
    ExecStartPost=/bin/systemctl restart kubelet.service"
    name: ensure-aws-tags.service
  requires:
  - docker.service
  roles:
  - Node
@ese thanks for the workaround! It works.
@ese nice find, any ideas on what we should do about the race condition? I'm not familiar with check-aws-tags.
@chrislovecnm It runs a simple shell script that waits for the EC2 tags, and then kubelet is restarted:
#!/bin/bash -x
# Succeeds once the KubernetesCluster tag is visible on this instance
function check_tags {
  aws ec2 describe-instances --region $(curl -m 10 http://169.254.169.254/latest/dynamic/instance-identity/document|grep region|awk -F\" '{print $4}') --instance-ids $(curl -m 10 http://169.254.169.254/latest/meta-data/instance-id) | grep KubernetesCluster
}
until check_tags
do
  sleep 1s
  echo no tags
done
sleep 1s
echo FINISH!
We probably need that in protokube
This should probably be fixed by k8s core, not kops, per kubernetes/kubernetes#57382.
@hubt agreed, but how are we suddenly having this problem? And I have not been able to reproduce it either.
I've had this problem for a long time; my 1.4 cluster has done this forever, but I never investigated or pinpointed it until I saw this. It feels most problematic when there's a lot of node churn, like when a big sweep of spot terminations takes out and then replaces a lot of nodes at once.
Well, let me disagree: I think this may be best handled if we put it in the installer and not in k8s. Technically the node is not ready for k8s to be installed.
I think that's fine. If the assumption is that only kops knows about spot vs. on-demand instances and kops should take care of all the nuanced differences, then putting it into protokube makes sense.
Unfortunately the workaround that @ese posted does not work for me. My spot instances still fail to register about half the time during a rolling update. I'm not sure if the root cause of my issue is with tagging (could be, just don't know for sure), but restarting kubelet manually during a rolling update fixes it for me. Not ideal...
Edit: There's what appears to be a typo in the workaround above (there's a " after the service name that I think shouldn't be there). Removing the " seems to have improved (but not totally resolved) the issue. On my five node test cluster only one failed to come back up during a rolling update. Maybe I just got lucky that time.
Edit 2: I've done a few more rolling updates since that comment and it seems like the workaround IS working for me after removing the typo. I have only seen one failure in rolling updates since applying the fixed workaround, and in that case it was something very different (kubelet wasn't even installed on the node after 10 minutes, no idea what went wrong there and I haven't seen it before).
It may be a bit crude since it doesn't check the AWS API for node tags, but right after the line set -o pipefail we added sleep 2m, because in general the tags will be present on the node after about two minutes. This is a decent (albeit naive) workaround for us, at least until https://github.com/kubernetes/kubernetes/issues/57382 is taken care of.
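For anyone who wants the same delay without a custom Docker image, the idea can also be expressed as a systemd drop-in. This is only a sketch; the drop-in name and the 120-second value are illustrative assumptions:

# Sketch: hold kubelet's start for ~2 minutes so the AWS tags have time to appear.
mkdir -p /etc/systemd/system/kubelet.service.d
cat > /etc/systemd/system/kubelet.service.d/10-wait-for-aws-tags.conf <<'EOF'
[Service]
ExecStartPre=/bin/sleep 120
EOF
systemctl daemon-reload
systemctl restart kubelet.service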
The script @ese provided didn't work for me, but then I found 2 small errors in the hook. After fixing those it seems to work nicely. The hook should have been:
hooks:
- manifest: |
    Type=oneshot
    ExecStart=/usr/bin/docker run --net host quay.io/sergioballesteros/check-aws-tags
    ExecStartPost=/bin/systemctl restart kubelet.service
  name: ensure-aws-tags.service
  requires:
  - docker.service
  roles:
  - Node
The ExecStartPost had a trailing " which wasn't supposed to be there, and the hook name shouldn't be part of the manifest, but part of the actual hook. Otherwise the service doesn't register properly and doesn't work.
I think this should be fixed in Kubernetes 1.10 by kubernetes/kubernetes#60125, if anyone can test it.
It seems to have been cherry-picked into the latest 1.9.7 as well.
@ese @chrislovecnm could you provide the Dockerfile for this https://github.com/kubernetes/kops/issues/3605#issuecomment-351674234
@alok87
It is basically an image with curl and the awscli that runs this bash script:
#!/bin/bash -x
function check_tags {
  aws ec2 describe-instances --region $(curl -m 10 http://169.254.169.254/latest/dynamic/instance-identity/document|grep region|awk -F\" '{print $4}') --instance-ids $(curl -m 10 http://169.254.169.254/latest/meta-data/instance-id) | grep KubernetesCluster
}
until check_tags
do
  sleep 1s
  echo no tags
done
sleep 1s
echo FINISH!
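The actual Dockerfile for quay.io/sergioballesteros/check-aws-tags isn't posted in the thread, but a roughly equivalent image can be built along these lines (a sketch only; the base image and file names here are assumptions):

# Sketch: small image with bash, curl and the awscli that runs the wait script
# above, saved locally as check-aws-tags.sh.
cat > Dockerfile <<'EOF'
FROM python:2-alpine
RUN apk add --no-cache bash curl && pip install awscli
COPY check-aws-tags.sh /check-aws-tags.sh
ENTRYPOINT ["/bin/bash", "/check-aws-tags.sh"]
EOF
docker build -t check-aws-tags .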
Added --restart no to the hook to fix issue #64507:
hooks:
- manifest: |
    Type=oneshot
    ExecStart=/usr/bin/docker run --net host --restart no quay.io/sergioballesteros/check-aws-tags
    ExecStartPost=/bin/systemctl restart kubelet.service
  name: ensure-aws-tags.service
  requires:
  - docker.service
  roles:
  - Node
For this issue, are we waiting on the upstream Kubernetes fix?
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
Hi, is this issue fixed in Kubernetes 1.10?
@ericchiang yep, this is already fixed upstream in 1.10: https://github.com/kubernetes/kubernetes/pull/60125
It was also cherry-picked to 1.8 and 1.9:
https://github.com/kubernetes/kubernetes/pull/61138
https://github.com/kubernetes/kubernetes/pull/61136