
Describe the bug
Pods that have died due to lack of CPU or memory and have been evicted are shown with a status of Pending in K9s.
Expected behavior
I would expect these pods to be shown as Evicted or Failed.
@brunohms Actually, this pod status is correct. The status reported by describe is the node status, indicating that your cluster is at or over capacity; hence the pod is in the Pending state. I believe you will get the same pod status from kubectl, as this denotes a scheduling issue and not a pod issue.
Hi @derailed, I'm afraid that what you said is incorrect. When I describe the pod itself, I get the following information:
Name: proprietary-crawler-preemptive-79f7cbc6cf-rzck9
Namespace: proprietary
Priority: 0
PriorityClassName: <none>
Node: gke-general-preemptible-40c94gb-pool-35dd7b4d-mv7w/
Start Time: Fri, 26 Apr 2019 03:55:55 -0300
Labels: app.kubernetes.io/instance=proprietary-crawler
app.kubernetes.io/name=proprietary-crawler
date=1544806882
pod-template-hash=3593767279
Annotations: <none>
Status: Failed
Reason: Evicted
Message: The node was low on resource: memory. Container proprietary-crawler was using 9402820Ki, which exceeds its request of 7G.
IP:
Controlled By: ReplicaSet/proprietary-crawler-preemptive-79f7cbc6cf
Containers:
proprietary-crawler:
Image: <redacted>
Port: 3333/TCP
Host Port: 0/TCP
Limits:
cpu: 8
memory: 10G
Requests:
cpu: 4
memory: 7G
Liveness: http-get http://:3333/health/proprietary-crawler/health delay=60s timeout=1s period=3s #success=1 #failure=5
Readiness: http-get http://:3333/health/proprietary-crawler/health delay=60s timeout=1s period=3s #success=1 #failure=3
Environment:
s_env: prod
JAVA_TOOL_OPTIONS: -XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap -Xms4g -Xmx8g
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-cwcqv (ro)
Volumes:
default-token-cwcqv:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-cwcqv
Optional: false
QoS Class: Burstable
Node-Selectors: environment=production
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events: <none>
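To make the eviction message concrete, here is a minimal sketch in Go that converts the quantities from the describe output above into bytes and compares them. The helper name `classify` and the constants are assumptions introduced for illustration; they are not part of Kubernetes or K9s. The numbers suggest that the container was above its memory request but still below its limit, which is consistent with an eviction caused by node memory pressure rather than a kill for exceeding the container's own limit.

```go
package main

import "fmt"

// Byte multipliers for the two quantity suffixes that appear in the output above.
const (
	Ki = 1024          // binary suffix: kibibytes
	G  = 1_000_000_000 // decimal suffix: gigabytes
)

// classify is a hypothetical helper: it reports where a container's memory
// usage sits relative to its request and limit (all values in bytes).
func classify(usage, request, limit int64) string {
	switch {
	case usage > limit:
		return "over limit"
	case usage > request:
		return "over request, under limit"
	default:
		return "within request"
	}
}

func main() {
	usage := int64(9402820) * Ki // "was using 9402820Ki"
	request := int64(7) * G      // "request of 7G"
	limit := int64(10) * G       // "memory: 10G" under Limits

	fmt.Printf("usage=%d request=%d limit=%d: %s\n",
		usage, request, limit, classify(usage, request, limit))
	// prints: usage=9628487680 request=7000000000 limit=10000000000: over request, under limit
}
```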
When I describe the node it was running on, I get the following information:
Name: gke-general-preemptible-40c94gb-pool-35dd7b4d-mv7w
Roles: <none>
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/fluentd-ds-ready=true
beta.kubernetes.io/instance-type=custom-40-96256
beta.kubernetes.io/masq-agent-ds-ready=true
beta.kubernetes.io/os=linux
cloud.google.com/gke-nodepool=preemptible-40c94gb-pool
cloud.google.com/gke-os-distribution=cos
cloud.google.com/gke-preemptible=true
environment=production
failure-domain.beta.kubernetes.io/region=us-east1
failure-domain.beta.kubernetes.io/zone=us-east1-b
kubernetes.io/hostname=gke-general-preemptible-40c94gb-pool-35dd7b4d-mv7w
Annotations: container.googleapis.com/instance_id: 6727367394293916404
node.alpha.kubernetes.io/ttl: 0
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: Fri, 26 Apr 2019 02:50:59 -0300
Taints: <none>
Unschedulable: false
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
FrequentDockerRestart False Fri, 26 Apr 2019 10:17:21 -0300 Fri, 26 Apr 2019 02:56:00 -0300 FrequentDockerRestart docker is functioning properly
FrequentContainerdRestart False Fri, 26 Apr 2019 10:17:21 -0300 Fri, 26 Apr 2019 02:56:01 -0300 FrequentContainerdRestart containerd is functioning properly
CorruptDockerOverlay2 False Fri, 26 Apr 2019 10:17:21 -0300 Fri, 26 Apr 2019 02:55:59 -0300 CorruptDockerOverlay2 docker overlay2 is functioning properly
KernelDeadlock False Fri, 26 Apr 2019 10:17:21 -0300 Fri, 26 Apr 2019 02:50:58 -0300 KernelHasNoDeadlock kernel has no deadlock
ReadonlyFilesystem False Fri, 26 Apr 2019 10:17:21 -0300 Fri, 26 Apr 2019 02:50:58 -0300 FilesystemIsNotReadOnly Filesystem is not read-only
FrequentUnregisterNetDevice False Fri, 26 Apr 2019 10:17:21 -0300 Fri, 26 Apr 2019 02:55:59 -0300 UnregisterNetDevice node is functioning properly
FrequentKubeletRestart False Fri, 26 Apr 2019 10:17:21 -0300 Fri, 26 Apr 2019 02:55:59 -0300 FrequentKubeletRestart kubelet is functioning properly
NetworkUnavailable False Fri, 26 Apr 2019 02:50:59 -0300 Fri, 26 Apr 2019 02:50:59 -0300 RouteCreated NodeController create implicit route
OutOfDisk False Fri, 26 Apr 2019 10:18:10 -0300 Fri, 26 Apr 2019 02:50:59 -0300 KubeletHasSufficientDisk kubelet has sufficient disk space available
MemoryPressure False Fri, 26 Apr 2019 10:18:10 -0300 Fri, 26 Apr 2019 09:17:05 -0300 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Fri, 26 Apr 2019 10:18:10 -0300 Fri, 26 Apr 2019 02:50:59 -0300 KubeletHasNoDiskPressure kubelet has no disk pressure
PIDPressure False Fri, 26 Apr 2019 10:18:10 -0300 Fri, 26 Apr 2019 02:50:59 -0300 KubeletHasSufficientPID kubelet has sufficient PID available
Ready True Fri, 26 Apr 2019 10:18:10 -0300 Fri, 26 Apr 2019 02:51:19 -0300 KubeletReady kubelet is posting ready status. AppArmor enabled
Addresses:
InternalIP: <redacted>
ExternalIP: <redacted>
Hostname: gke-general-preemptible-40c94gb-pool-35dd7b4d-mv7w
Capacity:
cpu: 40
ephemeral-storage: 202086868Ki
hugepages-2Mi: 0
memory: 96935564Ki
pods: 110
Allocatable:
cpu: 39830m
ephemeral-storage: 104638878617
hugepages-2Mi: 0
memory: 89240204Ki
pods: 110
System Info:
Machine ID: 110bcec7066cebd9701e7a2a40799178
System UUID: 110BCEC7-066C-EBD9-701E-7A2A40799178
Boot ID: 95524bff-78d4-40cc-9e64-33b6f526d263
Kernel Version: 4.14.91+
OS Image: Container-Optimized OS from Google
Operating System: linux
Architecture: amd64
Container Runtime Version: docker://17.3.2
Kubelet Version: v1.11.8-gke.6
Kube-Proxy Version: v1.11.8-gke.6
PodCIDR: <redacted>
ProviderID: <redacted>
Non-terminated Pods: (38 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits AGE
--------- ---- ------------ ---------- --------------- ------------- ---
coleta-tech coleta-tech-74fc95db7d-n6zc6 1 (2%) 2 (5%) 500Mi (0%) 1200Mi (1%) 7h26m
crawler crawler-6f6899664b-dnjbd 200m (0%) 450m (1%) 4Gi (4%) 5Gi (5%) 7h26m
crawler crawler-6f6899664b-q28gn 200m (0%) 450m (1%) 4Gi (4%) 5Gi (5%) 7h26m
crawler crawler-5489f7f79c-7cswb 200m (0%) 450m (1%) 2500Mi (2%) 3Gi (3%) 7h26m
crawler crawler-59b5c7f54-7zhj2 200m (0%) 450m (1%) 4Gi (4%) 5Gi (5%) 7h26m
crawler crawler-6446f48b75-scp9g 100m (0%) 300m (0%) 1500Mi (1%) 3Gi (3%) 7h26m
crawler crawler-79695cdc7b-sl8q6 200m (0%) 450m (1%) 4Gi (4%) 5Gi (5%) 7h26m
crawler crawler-7bdd55ccc7-w2gjk 200m (0%) 450m (1%) 2500Mi (2%) 3200Mi (3%) 7h26m
crawler crawler-795746ff4b-6wxjs 200m (0%) 600m (1%) 2Gi (2%) 3500Mi (4%) 7h26m
crawler crawler-795746ff4b-qkrjj 200m (0%) 600m (1%) 2Gi (2%) 3500Mi (4%) 7h26m
crawler crawler-6855b47c8b-wt6q2 200m (0%) 400m (1%) 2500Mi (2%) 3Gi (3%) 7h26m
crawler crawler-5d6b6df9fc-bmsq4 200m (0%) 500m (1%) 2500Mi (2%) 3500Mi (4%) 7h26m
crawler crawler-74f4f64858-vz6m4 200m (0%) 800m (2%) 3Gi (3%) 3500Mi (4%) 7h26m
crawler crawler-858b68c845-hq4h4 100m (0%) 500m (1%) 2500Mi (2%) 4Gi (4%) 7h26m
crawler crawler-58878c45bc-9956h 200m (0%) 500m (1%) 2500Mi (2%) 3200Mi (3%) 7h26m
crawler crawler-58878c45bc-lnp46 200m (0%) 500m (1%) 2500Mi (2%) 3200Mi (3%) 7h26m
crawler crawler-58878c45bc-m5p5q 200m (0%) 500m (1%) 2500Mi (2%) 3200Mi (3%) 6h22m
crawler crawler-58878c45bc-pq9wl 200m (0%) 500m (1%) 2500Mi (2%) 3200Mi (3%) 7h26m
crawler crawler-58878c45bc-sbdt4 200m (0%) 500m (1%) 2500Mi (2%) 3200Mi (3%) 7h26m
crawler crawler-58878c45bc-sth5b 200m (0%) 500m (1%) 2500Mi (2%) 3200Mi (3%) 7h26m
crawler crawler-58878c45bc-tlc87 200m (0%) 500m (1%) 2500Mi (2%) 3200Mi (3%) 6h22m
crawler crawler-7cb474c9b5-2jnd8 1 (2%) 1500m (3%) 1500Mi (1%) 2500Mi (2%) 7h26m
crawler crawler-7cb474c9b5-7vj7z 1 (2%) 1500m (3%) 1500Mi (1%) 2500Mi (2%) 6h22m
crawler crawler-86db45b896-lx9h7 500m (1%) 1 (2%) 3Gi (3%) 5Gi (5%) 7h26m
crawler crawler-964d6cd8b-dlgqm 100m (0%) 300m (0%) 2Gi (2%) 3Gi (3%) 7h26m
crawler crawler-6db584dd48-jr5rv 100m (0%) 300m (0%) 2Gi (2%) 3Gi (3%) 6h22m
fetcher fetcher-6bc765666b-twqv9 500m (1%) 2500m (6%) 3500Mi (4%) 4Gi (4%) 6h22m
fetcher fetcher-6bbd4d7f48-frmtm 200m (0%) 800m (2%) 2Gi (2%) 3Gi (3%) 6h22m
fetcher fetcher-544fb5c477-69675 500m (1%) 1 (2%) 4Gi (4%) 7Gi (8%) 6h22m
fetcher fetcher-5864748cb9-qrh99 100m (0%) 500m (1%) 1500Mi (1%) 2300Mi (2%) 6h22m
knative-monitoring fluentd-ds-52czk 100m (0%) 0 (0%) 200Mi (0%) 500Mi (0%) 7h27m
knative-monitoring node-exporter-sqf5k 110m (0%) 220m (0%) 50Mi (0%) 90Mi (0%) 7h27m
kube-system fluentd-gcp-v3.1.1-n9jfb 100m (0%) 1 (2%) 200Mi (0%) 500Mi (0%) 7h27m
kube-system ip-masq-agent-582xf 10m (0%) 0 (0%) 16Mi (0%) 0 (0%) 7h27m
kube-system kube-proxy-gke-general-preemptible-40c94gb-pool-35dd7b4d-mv7w 100m (0%) 0 (0%) 0 (0%) 0 (0%) 7h26m
kube-system metadata-agent-94dvj 40m (0%) 0 (0%) 50Mi (0%) 0 (0%) 7h27m
kube-system prometheus-to-sd-z7rd2 1m (0%) 3m (0%) 20Mi (0%) 20Mi (0%) 7h27m
queue queue-76897cdf4b-8phxh 2 (5%) 3 (7%) 3G (3%) 5G (5%) 6h22m
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 11261m (28%) 25523m (64%)
memory 84159782400 (92%) 118877450752 (130%)
ephemeral-storage 0 (0%) 0 (0%)
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning OOMKilling 6m58s kernel-monitor, gke-general-preemptible-40c94gb-pool-35dd7b4d-mv7w Memory cgroup out of memory: Kill process 9015 (java) score 1843 or sacrifice child
Killed process 9015 (java) total-vm:32095856kB, anon-rss:2708576kB, file-rss:7580kB, shmem-rss:0kB
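The percentages under "Allocated resources" can be checked by hand. The sketch below, in Go, recomputes them from the node's allocatable memory (89240204Ki) and the request and limit totals listed above. The helper `percentOf` is hypothetical, and the truncating integer division is an assumption about rounding, not a statement of how kubectl computes the figures. It shows memory limits overcommitted well beyond what the node can provide, which leaves room for memory pressure when several pods burst at once.

```go
package main

import "fmt"

// percentOf is a hypothetical helper that returns part as a whole-number
// percentage of total, truncating any fraction.
func percentOf(part, total int64) int64 {
	return part * 100 / total
}

func main() {
	allocatable := int64(89240204) * 1024 // "memory: 89240204Ki" under Allocatable
	requests := int64(84159782400)        // memory Requests total, in bytes
	limits := int64(118877450752)         // memory Limits total, in bytes

	fmt.Printf("requests: %d%%\n", percentOf(requests, allocatable)) // prints: requests: 92%
	fmt.Printf("limits:   %d%%\n", percentOf(limits, allocatable))   // prints: limits:   130%
}
```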
Note: I had to redact some details because they contain information about our clients.
If I get the pods from that namespace this pod is shown as:
NAME READY STATUS RESTARTS AGE
proprietary-crawler-preemptive-79f7cbc6cf-rzck9 0/1 Evicted 0 6h
Note: Running pods are omitted.
I can't find anything that says this pod is pending. In fact, it has been evicted by Kubernetes due to, as you mentioned, a lack of resources on the node.
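To illustrate the difference between the two views, here is a minimal sketch in Go of how a STATUS column could be derived from the pod's `Status` and `Reason` fields shown in the describe output. The `podStatus` type and `displayStatus` function are hypothetical stand-ins written for this example; this is not K9s's code, and it only mirrors the general idea of preferring a more specific reason over the phase when one is set.

```go
package main

import "fmt"

// podStatus is a hypothetical, trimmed-down stand-in for the pod status
// fields that appear in the describe output (Status and Reason).
type podStatus struct {
	Phase  string // e.g. "Pending", "Running", "Failed"
	Reason string // e.g. "Evicted"; empty when no reason is recorded
}

// displayStatus returns the text to show in a STATUS column: the more
// specific reason when one is set, otherwise the phase.
func displayStatus(s podStatus) string {
	if s.Reason != "" {
		return s.Reason
	}
	return s.Phase
}

func main() {
	evicted := podStatus{Phase: "Failed", Reason: "Evicted"}
	unscheduled := podStatus{Phase: "Pending"}

	fmt.Println(displayStatus(evicted))     // prints: Evicted
	fmt.Println(displayStatus(unscheduled)) // prints: Pending
}
```

With this rule, an evicted pod is reported as Evicted, while a pod that is still waiting to be scheduled is reported as Pending.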
@brunohms Thank you so much for the detailed report! I stand corrected. I assumed your pod was not getting scheduled, hence the Pending status. In your latest scenario, however, the pod was indeed evicted from the node, which K9s does not currently track. I will fix this in the next push. Thank you for the correction!
@brunohms Fixed in 0.5.2!