I'm using:
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.2", GitCommit:"477efc3cbe6a7effca06bd1452fa356e2201e1ee", GitTreeState:"clean", BuildDate:"2017-04-19T20:33:11Z", GoVersion:"go1.7.5", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.2", GitCommit:"477efc3cbe6a7effca06bd1452fa356e2201e1ee", GitTreeState:"clean", BuildDate:"2017-04-19T20:22:08Z", GoVersion:"go1.7.5", Compiler:"gc", Platform:"linux/amd64"}
What I did:
What I expected:
What happened:
Suggestion:
Thank you for the report. We are aware of this and we are thinking about the best approach to solve it.
cc: @MaciekPytel @fgrzadkowski
@adamrp It will be possible in the new version. However, you will have to create a PodDisruptionBudget for all of your system pods, such as dns, dashboard, and heapster (everything other than daemonset/kube-proxy/etc.).
Are there any best practices for PodDisruptionBudgets in kube-system? Thanks.
The node below should be scaled down.
Name: gke-aaa-xxx-yyy-zzz-po-62b92cb1-0256
Roles: <none>
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/fluentd-ds-ready=true
beta.kubernetes.io/instance-type=n1-highcpu-4
beta.kubernetes.io/os=linux
cloud.google.com/gke-nodepool=xxx-staging-pool
failure-domain.beta.kubernetes.io/region=asia-northeast1
failure-domain.beta.kubernetes.io/zone=asia-northeast1-c
kubernetes.io/hostname=gke-aaa-xxx-yyy-zzz-po-62b92cb1-0256
name=stg-node
Annotations: node.alpha.kubernetes.io/ttl=0
volumes.kubernetes.io/controller-managed-attach-detach=true
CreationTimestamp: Tue, 31 Jul 2018 12:20:14 +0900
Taints: <none>
Unschedulable: false
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
KernelDeadlock False Tue, 31 Jul 2018 15:42:09 +0900 Tue, 31 Jul 2018 12:20:12 +0900 KernelHasNoDeadlock kernel has no deadlock
NetworkUnavailable False Tue, 31 Jul 2018 12:20:15 +0900 Tue, 31 Jul 2018 12:20:15 +0900 RouteCreated NodeController create implicit route
OutOfDisk False Tue, 31 Jul 2018 15:42:09 +0900 Tue, 31 Jul 2018 12:20:14 +0900 KubeletHasSufficientDisk kubelet has sufficient disk space available
MemoryPressure False Tue, 31 Jul 2018 15:42:09 +0900 Tue, 31 Jul 2018 12:20:14 +0900 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Tue, 31 Jul 2018 15:42:09 +0900 Tue, 31 Jul 2018 12:20:14 +0900 KubeletHasNoDiskPressure kubelet has no disk pressure
PIDPressure False Tue, 31 Jul 2018 15:42:09 +0900 Tue, 31 Jul 2018 12:20:14 +0900 KubeletHasSufficientPID kubelet has sufficient PID available
Ready True Tue, 31 Jul 2018 15:42:09 +0900 Tue, 31 Jul 2018 12:20:34 +0900 KubeletReady kubelet is posting ready status. AppArmor enabled
Addresses:
InternalIP: 10.22.0.52
ExternalIP:
Hostname: gke-aaa-xxx-yyy-zzz-po-62b92cb1-0256
Capacity:
cpu: 4
ephemeral-storage: 36937420Ki
hugepages-2Mi: 0
memory: 3631468Ki
pods: 110
Allocatable:
cpu: 3920m
ephemeral-storage: 12566689736
hugepages-2Mi: 0
memory: 2585964Ki
pods: 110
System Info:
Machine ID: a04db0d2e5ef3959f42ff7743ae7e79e
System UUID: A04DB0D2-E5EF-3959-F42F-F7743AE7E79E
Boot ID: a68a519f-e2d3-47e5-9981-932cc5b52cef
Kernel Version: 4.14.22+
OS Image: Container-Optimized OS from Google
Operating System: linux
Architecture: amd64
Container Runtime Version: docker://17.3.2
Kubelet Version: v1.10.4-gke.2
Kube-Proxy Version: v1.10.4-gke.2
PodCIDR: 10.48.3.0/24
ProviderID: gce://xxxx-xxxx-xxxx/xxxx-xxxx-c/gke-aaa-xxx-yyy-zzz-po-62b92cb1-0256
Non-terminated Pods: (5 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits
--------- ---- ------------ ---------- --------------- -------------
kube-system gke-aaa-xxx-yyy-zzz-po-62b92cb1-0256 100m (2%) 0 (0%) 0 (0%) 0 (0%)
kube-system metadata-agent-xxbk8 40m (1%) 0 (0%) 50Mi (1%) 0 (0%)
kube-system metrics-server-v0.2.1-7c88c7f9-zjsxf 57m (1%) 152m (3%) 186Mi (7%) 436Mi (17%)
logging fluentd-colopl-hq74b 100m (2%) 0 (0%) 800Mi (31%) 800Mi (31%)
monitoring node-exporter-r842k 0 (0%) 0 (0%) 0 (0%) 0 (0%)
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 297m (7%) 152m (3%)
memory 1036Mi (41%) 1236Mi (48%)
Events: <none>
Are there any best practices for PodDisruptionBudgets in kube-system?
Only add them if you're sure those pods can be safely evicted :) E.g. creating a PDB for kube-dns with minAvailable > 50% should be OK in most cases. However, evicting all kube-dns pods at the same time can cause networking problems. In the case of metrics-server or heapster, there's only one replica, so a PDB that allows it to be evicted will always carry the risk of downtime (loss of metrics) for a couple of minutes. It really depends on your setup whether that's acceptable or not.
In this particular case your node is running metrics-server. Restarting it would prevent all HPAs in your cluster from autoscaling (and make them set error status, emit error events, etc.) for a few minutes. As @aleksandra-malinowska wrote, it's really your decision whether that's acceptable or not.
In general, kube-dns is the only system pod that comes to my mind that has multiple replicas. Every other system pod is a singleton, and restarting it will likely cause some kind of temporary disruption in the cluster. If it's a test cluster, it may be OK to just create a PDB for every kube-system pod. If it's production, you probably need to make a case-by-case decision based on which services are critical for your workloads.
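The kube-dns case discussed above could look roughly like the manifest below. This is a minimal sketch, not a recommended configuration. It assumes the kube-dns pods carry the `k8s-app: kube-dns` label and that the cluster serves the `policy/v1` API (older clusters, such as the 1.10 node shown earlier, would need `policy/v1beta1`). The name `kube-dns-pdb` is hypothetical. It also uses `maxUnavailable: 1` in place of the "minAvailable > 50%" suggestion, because a percentage on a two-replica deployment may round to a value that blocks every eviction.

```yaml
# Sketch only: lets kube-dns pods be evicted one at a time.
# Assumptions: pods are labelled k8s-app=kube-dns; the cluster serves policy/v1.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: kube-dns-pdb        # hypothetical name
  namespace: kube-system
spec:
  maxUnavailable: 1         # swapped in for the percentage-based minAvailable
  selector:
    matchLabels:
      k8s-app: kube-dns     # assumed label; check your own kube-dns pods
```

For a singleton such as metrics-server, the same shape of PDB permits the brief outage described above, so whether to create it remains a case-by-case decision.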
@aleksandra-malinowska @MaciekPytel Thank you for such a detailed explanation. Maybe the best approach in my case is not to set any PDBs for kube-system.
Could I ask one more question? I noticed this in the FAQ entry "What types of pods can prevent CA from removing a node?":
Pods with restrictive PodDisruptionBudget.
Kube-system pods that:
are not run on the node by default, *
don't have PDB or their PDB is too restrictive (since CA 0.6).
Pods that are not backed by a controller object (so not created by deployment, replica set, job, stateful set etc). *
Pods with local storage. *
What does "local storage" actually mean here? Does that include hostPath and emptyDir?
It does. If you have a pod that uses local storage, but you want to allow CA to move it around, you can set the "cluster-autoscaler.kubernetes.io/safe-to-evict": "true" annotation on it.
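As an illustration of where that annotation goes, here is a hedged sketch of a Deployment whose pods use an emptyDir volume. Only the annotation key and value come from the answer above. The Deployment name, labels, image, and mount path are hypothetical placeholders. Note that the annotation belongs on the pod template's metadata, not on the Deployment's own metadata, since it must end up on the pods themselves.

```yaml
# Sketch only: names, image, and paths are hypothetical.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: scratch-worker
spec:
  replicas: 2
  selector:
    matchLabels:
      app: scratch-worker
  template:
    metadata:
      labels:
        app: scratch-worker
      annotations:
        # Tells Cluster Autoscaler this pod may be evicted despite local storage.
        cluster-autoscaler.kubernetes.io/safe-to-evict: "true"
    spec:
      containers:
        - name: worker
          image: busybox:1.36
          command: ["sh", "-c", "sleep 3600"]
          volumeMounts:
            - name: scratch
              mountPath: /scratch
      volumes:
        - name: scratch
          emptyDir: {}      # local storage that would otherwise block scale-down
```

Anything written to the emptyDir is lost when the pod is moved, so this only makes sense for data that can be discarded or rebuilt.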