Autoscaler: after upgrading my EKS cluster from 14 to 15 I got this problem

Created on 13 Apr 2020 · 5 Comments · Source: kubernetes/autoscaler

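After upgrading my EKS cluster from 1.14 to 1.15, I updated the cluster-autoscaler image with the command below. The new pod is now stuck in ImagePullBackOff; the command is followed by the pod's describe output.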
kubectl -n kube-system set image deployment.apps/cluster-autoscaler cluster-autoscaler=k8s.gcr.io/cluster-autoscaler:v1.15.6

Name:               cluster-autoscaler-65b54d4975-g7w6w
Namespace:          kube-system
Priority:           0
PriorityClassName:
Node:               ip-192-168-87-254.us-east-2.compute.internal/192.168.87.254
Start Time:         Mon, 13 Apr 2020 15:02:59 +0000
Labels:             app=cluster-autoscaler
                    pod-template-hash=65b54d4975
Annotations:        kubernetes.io/psp: eks.privileged
                    prometheus.io/port: 8085
                    prometheus.io/scrape: true
Status:             Pending
IP:                 192.168.78.162
Controlled By:      ReplicaSet/cluster-autoscaler-65b54d4975
Containers:
  cluster-autoscaler:
    Container ID:
    Image:          k8s.gcr.io/cluster-autoscaler:v1.15.6
    Image ID:
    Port:
    Host Port:
    Command:
      ./cluster-autoscaler
      --v=4
      --stderrthreshold=info
      --cloud-provider=aws
      --skip-nodes-with-local-storage=false
      --expander=least-waste
      --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/
      --balance-similar-node-groups
      --skip-nodes-with-system-pods=false
    State:          Waiting
      Reason:       ImagePullBackOff
    Ready:          False
    Restart Count:  0
    Limits:
      cpu:     100m
      memory:  300Mi
    Requests:
      cpu:     100m
      memory:  300Mi
    Environment:
    Mounts:
      /etc/ssl/certs/ca-certificates.crt from ssl-certs (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from cluster-autoscaler-token-cr2d2 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  ssl-certs:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/ssl/certs/ca-bundle.crt
    HostPathType:
  cluster-autoscaler-token-cr2d2:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  cluster-autoscaler-token-cr2d2
    Optional:    false
QoS Class:       Guaranteed
Node-Selectors:
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason     Age                  From                                                   Message
  ----     ------     ----                 ----                                                   -------
  Normal   Scheduled  2m26s                default-scheduler                                      Successfully assigned kube-system/cluster-autoscaler-65b54d4975-g7w6w to ip-192-168-87-254.us-east-2.compute.internal
  Normal   Pulling    52s (x4 over 2m25s)  kubelet, ip-192-168-87-254.us-east-2.compute.internal  Pulling image "k8s.gcr.io/cluster-autoscaler:v1.15.6"
  Warning  Failed     52s (x4 over 2m25s)  kubelet, ip-192-168-87-254.us-east-2.compute.internal  Failed to pull image "k8s.gcr.io/cluster-autoscaler:v1.15.6": rpc error: code = Unknown desc = Error response from daemon: manifest for k8s.gcr.io/cluster-autoscaler:v1.15.6 not found
  Warning  Failed     52s (x4 over 2m25s)  kubelet, ip-192-168-87-254.us-east-2.compute.internal  Error: ErrImagePull
  Normal   BackOff    41s (x6 over 2m24s)  kubelet, ip-192-168-87-254.us-east-2.compute.internal  Back-off pulling image "k8s.gcr.io/cluster-autoscaler:v1.15.6"
  Warning  Failed     26s (x7 over 2m24s)  kubelet, ip-192-168-87-254.us-east-2.compute.internal  Error: ImagePullBackOff

lifecycle/rotten

paramacharya

Most helpful comment

Version 1.15.6 is currently not available via k8s.gcr.io because of the Kubernetes-wide image repository migration (https://github.com/kubernetes/k8s.io/blob/master/k8s.gcr.io/Vanity-Domain-Flip.md). New versions of all Kubernetes components are already hosted in the new registry, but k8s.gcr.io has not been flipped yet. Once it is flipped to the new repository, k8s.gcr.io/cluster-autoscaler:v1.15.6 will start working. For now, you need to set the image to us.gcr.io/k8s-artifacts-prod/autoscaling/cluster-autoscaler:v1.15.6, as per the 1.15.6 release notes (the same goes for all the latest patch releases of Cluster Autoscaler).

MaciekPytel on 14 Apr 2020

👍7
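
The suggestion in the comment above might be applied roughly as follows. This is only a sketch: it reuses the command from the original report and swaps in the image path quoted in the comment. The `APPLY` variable is a hypothetical switch added for this example (not part of kubectl), so that by default the script only prints the command instead of touching the cluster.

```shell
#!/bin/sh
set -eu

# Image path quoted from the comment above (as per the 1.15.6 release notes).
NEW_IMAGE="us.gcr.io/k8s-artifacts-prod/autoscaling/cluster-autoscaler:v1.15.6"

# Same command as in the original report, with only the image reference changed.
CMD="kubectl -n kube-system set image deployment.apps/cluster-autoscaler cluster-autoscaler=${NEW_IMAGE}"

# APPLY is a hypothetical switch for this sketch: unless APPLY=1, just print the command.
if [ "${APPLY:-0}" = "1" ]; then
  $CMD
  # Wait until the deployment has rolled out the pod with the new image.
  kubectl -n kube-system rollout status deployment.apps/cluster-autoscaler
else
  echo "$CMD"
fi
```

Running the script with `APPLY=1` against the affected cluster would update the deployment; a follow-up `kubectl -n kube-system describe pod` on the new pod should then show whether the image pull succeeds.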

All 5 comments

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

fejta-bot on 13 Jul 2020

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

fejta-bot on 12 Aug 2020

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

fejta-bot on 11 Sep 2020

@fejta-bot: Closing this issue.
