K9s: Pods being shown as Pending when they are in another failed status

Created on 25 Apr 2019 · 4 Comments · Source: derailed/k9s

Describe the bug
Pods that have died due to lack of CPU or memory and were evicted are shown with the status Pending in k9s.

To Reproduce

1. Have a pod killed for lack of CPU/memory.
2. Filter for it in k9s.

Expected behavior
I believe the expected behavior is for these pods to be shown as Evicted/Failed.

Versions (please complete the following information):

OS: ubuntu 18.04
K9s Version: dev
Commit: dev
Date: n/a
K8s Client Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.1", GitCommit:"b7394102d6ef778017f2ca4046abbaa23b88c290", GitTreeState:"clean", BuildDate:"2019-04-08T17:11:31Z", GoVersion:"go1.12.1", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"11+", GitVersion:"v1.11.8-gke.6", GitCommit:"394ee507d00f15a63cef577a14026096c310698e", GitTreeState:"clean", BuildDate:"2019-03-30T19:31:43Z", GoVersion:"go1.10.8b4", Compiler:"gc", Platform:"linux/amd64"}

bug

brunohms

👍1

All 4 comments

@brunohms Actually, this pod status is correct. The status reported by describe is the node status, indicating your cluster is at or over capacity, hence the pod is in the Pending state. I believe you will get the same pod status from kubectl, as this denotes a scheduling issue and not a pod issue.

derailed on 26 Apr 2019

Hi @derailed, I'm afraid what you said is incorrect. When I describe the pod itself, I get the following information:

Name:               proprietary-crawler-preemptive-79f7cbc6cf-rzck9
Namespace:          proprietary
Priority:           0
PriorityClassName:  <none>
Node:               gke-general-preemptible-40c94gb-pool-35dd7b4d-mv7w/
Start Time:         Fri, 26 Apr 2019 03:55:55 -0300
Labels:             app.kubernetes.io/instance=proprietary-crawler
                    app.kubernetes.io/name=proprietary-crawler
                    date=1544806882
                    pod-template-hash=3593767279
Annotations:        <none>
Status:             Failed
Reason:             Evicted
Message:            The node was low on resource: memory. Container proprietary-crawler was using 9402820Ki, which exceeds its request of 7G. 
IP:                 
Controlled By:      ReplicaSet/proprietary-crawler-preemptive-79f7cbc6cf
Containers:
  proprietary-crawler:
    Image:      <redacted>
    Port:       3333/TCP
    Host Port:  0/TCP
    Limits:
      cpu:     8
      memory:  10G
    Requests:
      cpu:      4
      memory:   7G
    Liveness:   http-get http://:3333/health/proprietary-crawler/health delay=60s timeout=1s period=3s #success=1 #failure=5
    Readiness:  http-get http://:3333/health/proprietary-crawler/health delay=60s timeout=1s period=3s #success=1 #failure=3
    Environment:
      s_env:              prod
      JAVA_TOOL_OPTIONS:  -XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap -Xms4g -Xmx8g
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-cwcqv (ro)
Volumes:
  default-token-cwcqv:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-cwcqv
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  environment=production
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:          <none>
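
The eviction message above mixes binary (Ki) and decimal (G) quantity suffixes, which makes the comparison hard to read at a glance. The following is a minimal sketch that converts the three values from the describe output to bytes; the helper names `kiToBytes` and `gToBytes` are hypothetical and exist only for this illustration.

```go
package main

import "fmt"

// kiToBytes converts a quantity with the binary suffix Ki (1024 bytes) to bytes.
// Hypothetical helper, not part of any Kubernetes library.
func kiToBytes(n int64) int64 { return n * 1024 }

// gToBytes converts a quantity with the decimal suffix G (10^9 bytes) to bytes.
// Hypothetical helper, not part of any Kubernetes library.
func gToBytes(n int64) int64 { return n * 1_000_000_000 }

func main() {
	usage := kiToBytes(9402820) // "was using 9402820Ki"
	request := gToBytes(7)      // Requests: memory: 7G
	limit := gToBytes(10)       // Limits:   memory: 10G

	fmt.Println("usage bytes:  ", usage)
	fmt.Println("over request: ", usage > request)
	fmt.Println("under limit:  ", usage < limit)
}
```

Under these conversions the container was using roughly 9.6 GB, which is above its 7G request but still below its 10G limit.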

When I describe the node it was running on, it shows the following information:

Name:               gke-general-preemptible-40c94gb-pool-35dd7b4d-mv7w
Roles:              <none>
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/fluentd-ds-ready=true
                    beta.kubernetes.io/instance-type=custom-40-96256
                    beta.kubernetes.io/masq-agent-ds-ready=true
                    beta.kubernetes.io/os=linux
                    cloud.google.com/gke-nodepool=preemptible-40c94gb-pool
                    cloud.google.com/gke-os-distribution=cos
                    cloud.google.com/gke-preemptible=true
                    environment=production
                    failure-domain.beta.kubernetes.io/region=us-east1
                    failure-domain.beta.kubernetes.io/zone=us-east1-b
                    kubernetes.io/hostname=gke-general-preemptible-40c94gb-pool-35dd7b4d-mv7w
Annotations:        container.googleapis.com/instance_id: 6727367394293916404
                    node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Fri, 26 Apr 2019 02:50:59 -0300
Taints:             <none>
Unschedulable:      false
Conditions:
  Type                          Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----                          ------  -----------------                 ------------------                ------                       -------
  FrequentDockerRestart         False   Fri, 26 Apr 2019 10:17:21 -0300   Fri, 26 Apr 2019 02:56:00 -0300   FrequentDockerRestart        docker is functioning properly
  FrequentContainerdRestart     False   Fri, 26 Apr 2019 10:17:21 -0300   Fri, 26 Apr 2019 02:56:01 -0300   FrequentContainerdRestart    containerd is functioning properly
  CorruptDockerOverlay2         False   Fri, 26 Apr 2019 10:17:21 -0300   Fri, 26 Apr 2019 02:55:59 -0300   CorruptDockerOverlay2        docker overlay2 is functioning properly
  KernelDeadlock                False   Fri, 26 Apr 2019 10:17:21 -0300   Fri, 26 Apr 2019 02:50:58 -0300   KernelHasNoDeadlock          kernel has no deadlock
  ReadonlyFilesystem            False   Fri, 26 Apr 2019 10:17:21 -0300   Fri, 26 Apr 2019 02:50:58 -0300   FilesystemIsNotReadOnly      Filesystem is not read-only
  FrequentUnregisterNetDevice   False   Fri, 26 Apr 2019 10:17:21 -0300   Fri, 26 Apr 2019 02:55:59 -0300   UnregisterNetDevice          node is functioning properly
  FrequentKubeletRestart        False   Fri, 26 Apr 2019 10:17:21 -0300   Fri, 26 Apr 2019 02:55:59 -0300   FrequentKubeletRestart       kubelet is functioning properly
  NetworkUnavailable            False   Fri, 26 Apr 2019 02:50:59 -0300   Fri, 26 Apr 2019 02:50:59 -0300   RouteCreated                 NodeController create implicit route
  OutOfDisk                     False   Fri, 26 Apr 2019 10:18:10 -0300   Fri, 26 Apr 2019 02:50:59 -0300   KubeletHasSufficientDisk     kubelet has sufficient disk space available
  MemoryPressure                False   Fri, 26 Apr 2019 10:18:10 -0300   Fri, 26 Apr 2019 09:17:05 -0300   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure                  False   Fri, 26 Apr 2019 10:18:10 -0300   Fri, 26 Apr 2019 02:50:59 -0300   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure                   False   Fri, 26 Apr 2019 10:18:10 -0300   Fri, 26 Apr 2019 02:50:59 -0300   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready                         True    Fri, 26 Apr 2019 10:18:10 -0300   Fri, 26 Apr 2019 02:51:19 -0300   KubeletReady                 kubelet is posting ready status. AppArmor enabled
Addresses:
  InternalIP:  <redacted>
  ExternalIP:  <redacted>
  Hostname:    gke-general-preemptible-40c94gb-pool-35dd7b4d-mv7w
Capacity:
 cpu:                40
 ephemeral-storage:  202086868Ki
 hugepages-2Mi:      0
 memory:             96935564Ki
 pods:               110
Allocatable:
 cpu:                39830m
 ephemeral-storage:  104638878617
 hugepages-2Mi:      0
 memory:             89240204Ki
 pods:               110
System Info:
 Machine ID:                 110bcec7066cebd9701e7a2a40799178
 System UUID:                110BCEC7-066C-EBD9-701E-7A2A40799178
 Boot ID:                    95524bff-78d4-40cc-9e64-33b6f526d263
 Kernel Version:             4.14.91+
 OS Image:                   Container-Optimized OS from Google
 Operating System:           linux
 Architecture:               amd64
 Container Runtime Version:  docker://17.3.2
 Kubelet Version:            v1.11.8-gke.6
 Kube-Proxy Version:         v1.11.8-gke.6
PodCIDR:                     <redacted>
ProviderID:                  <redacted>
Non-terminated Pods:         (38 in total)
  Namespace                  Name                                                             CPU Requests  CPU Limits  Memory Requests  Memory Limits  AGE
  ---------                  ----                                                             ------------  ----------  ---------------  -------------  ---
  coleta-tech                coleta-tech-74fc95db7d-n6zc6                                     1 (2%)        2 (5%)      500Mi (0%)       1200Mi (1%)    7h26m
  crawler                    crawler-6f6899664b-dnjbd                                         200m (0%)     450m (1%)   4Gi (4%)         5Gi (5%)       7h26m
  crawler                    crawler-6f6899664b-q28gn                                         200m (0%)     450m (1%)   4Gi (4%)         5Gi (5%)       7h26m
  crawler                    crawler-5489f7f79c-7cswb                                         200m (0%)     450m (1%)   2500Mi (2%)      3Gi (3%)       7h26m
  crawler                    crawler-59b5c7f54-7zhj2                                          200m (0%)     450m (1%)   4Gi (4%)         5Gi (5%)       7h26m
  crawler                    crawler-6446f48b75-scp9g                                         100m (0%)     300m (0%)   1500Mi (1%)      3Gi (3%)       7h26m
  crawler                    crawler-79695cdc7b-sl8q6                                         200m (0%)     450m (1%)   4Gi (4%)         5Gi (5%)       7h26m
  crawler                    crawler-7bdd55ccc7-w2gjk                                         200m (0%)     450m (1%)   2500Mi (2%)      3200Mi (3%)    7h26m
  crawler                    crawler-795746ff4b-6wxjs                                         200m (0%)     600m (1%)   2Gi (2%)         3500Mi (4%)    7h26m
  crawler                    crawler-795746ff4b-qkrjj                                         200m (0%)     600m (1%)   2Gi (2%)         3500Mi (4%)    7h26m
  crawler                    crawler-6855b47c8b-wt6q2                                         200m (0%)     400m (1%)   2500Mi (2%)      3Gi (3%)       7h26m
  crawler                    crawler-5d6b6df9fc-bmsq4                                         200m (0%)     500m (1%)   2500Mi (2%)      3500Mi (4%)    7h26m
  crawler                    crawler-74f4f64858-vz6m4                                         200m (0%)     800m (2%)   3Gi (3%)         3500Mi (4%)    7h26m
  crawler                    crawler-858b68c845-hq4h4                                         100m (0%)     500m (1%)   2500Mi (2%)      4Gi (4%)       7h26m
  crawler                    crawler-58878c45bc-9956h                                         200m (0%)     500m (1%)   2500Mi (2%)      3200Mi (3%)    7h26m
  crawler                    crawler-58878c45bc-lnp46                                         200m (0%)     500m (1%)   2500Mi (2%)      3200Mi (3%)    7h26m
  crawler                    crawler-58878c45bc-m5p5q                                         200m (0%)     500m (1%)   2500Mi (2%)      3200Mi (3%)    6h22m
  crawler                    crawler-58878c45bc-pq9wl                                         200m (0%)     500m (1%)   2500Mi (2%)      3200Mi (3%)    7h26m
  crawler                    crawler-58878c45bc-sbdt4                                         200m (0%)     500m (1%)   2500Mi (2%)      3200Mi (3%)    7h26m
  crawler                    crawler-58878c45bc-sth5b                                         200m (0%)     500m (1%)   2500Mi (2%)      3200Mi (3%)    7h26m
  crawler                    crawler-58878c45bc-tlc87                                         200m (0%)     500m (1%)   2500Mi (2%)      3200Mi (3%)    6h22m
  crawler                    crawler-7cb474c9b5-2jnd8                                         1 (2%)        1500m (3%)  1500Mi (1%)      2500Mi (2%)    7h26m
  crawler                    crawler-7cb474c9b5-7vj7z                                         1 (2%)        1500m (3%)  1500Mi (1%)      2500Mi (2%)    6h22m
  crawler                    crawler-86db45b896-lx9h7                                         500m (1%)     1 (2%)      3Gi (3%)         5Gi (5%)       7h26m
  crawler                    crawler-964d6cd8b-dlgqm                                          100m (0%)     300m (0%)   2Gi (2%)         3Gi (3%)       7h26m
  crawler                    crawler-6db584dd48-jr5rv                                         100m (0%)     300m (0%)   2Gi (2%)         3Gi (3%)       6h22m
  fetcher                    fetcher-6bc765666b-twqv9                                         500m (1%)     2500m (6%)  3500Mi (4%)      4Gi (4%)       6h22m
  fetcher                    fetcher-6bbd4d7f48-frmtm                                         200m (0%)     800m (2%)   2Gi (2%)         3Gi (3%)       6h22m
  fetcher                    fetcher-544fb5c477-69675                                         500m (1%)     1 (2%)      4Gi (4%)         7Gi (8%)       6h22m
  fetcher                    fetcher-5864748cb9-qrh99                                         100m (0%)     500m (1%)   1500Mi (1%)      2300Mi (2%)    6h22m
  knative-monitoring         fluentd-ds-52czk                                                 100m (0%)     0 (0%)      200Mi (0%)       500Mi (0%)     7h27m
  knative-monitoring         node-exporter-sqf5k                                              110m (0%)     220m (0%)   50Mi (0%)        90Mi (0%)      7h27m
  kube-system                fluentd-gcp-v3.1.1-n9jfb                                         100m (0%)     1 (2%)      200Mi (0%)       500Mi (0%)     7h27m
  kube-system                ip-masq-agent-582xf                                              10m (0%)      0 (0%)      16Mi (0%)        0 (0%)         7h27m
  kube-system                kube-proxy-gke-general-preemptible-40c94gb-pool-35dd7b4d-mv7w    100m (0%)     0 (0%)      0 (0%)           0 (0%)         7h26m
  kube-system                metadata-agent-94dvj                                             40m (0%)      0 (0%)      50Mi (0%)        0 (0%)         7h27m
  kube-system                prometheus-to-sd-z7rd2                                           1m (0%)       3m (0%)     20Mi (0%)        20Mi (0%)      7h27m
  queue                      queue-76897cdf4b-8phxh                                           2 (5%)        3 (7%)      3G (3%)          5G (5%)        6h22m
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests           Limits
  --------           --------           ------
  cpu                11261m (28%)       25523m (64%)
  memory             84159782400 (92%)  118877450752 (130%)
  ephemeral-storage  0 (0%)             0 (0%)
Events:
  Type     Reason      Age    From                                                                Message
  ----     ------      ----   ----                                                                -------
  Warning  OOMKilling  6m58s  kernel-monitor, gke-general-preemptible-40c94gb-pool-35dd7b4d-mv7w  Memory cgroup out of memory: Kill process 9015 (java) score 1843 or sacrifice child
Killed process 9015 (java) total-vm:32095856kB, anon-rss:2708576kB, file-rss:7580kB, shmem-rss:0kB

Note: I had to redact some details because they contain information about our clients.

If I get the pods from that namespace this pod is shown as:

NAME                                              READY   STATUS    RESTARTS   AGE
proprietary-crawler-preemptive-79f7cbc6cf-rzck9   0/1     Evicted   0          6h

Note: Running pods are omitted.

I can't find anything that says this pod is pending, when in fact it has been evicted by Kubernetes due to, as you mentioned, a lack of resources on the node.
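
The outputs above (describe showing Status: Failed with Reason: Evicted, and the pod list showing Evicted) are consistent with a display rule that prefers the pod's reason over its phase when a reason is set. Here is a minimal sketch of that rule, assuming a trimmed-down `PodStatus` type and a `displayStatus` function that are both hypothetical; this is not k9s's actual code, nor the change that later landed as a fix.

```go
package main

import "fmt"

// PodStatus is a hypothetical stand-in for the two status fields
// shown in the describe output ("Status" and "Reason").
type PodStatus struct {
	Phase  string // e.g. "Pending", "Running", "Failed"
	Reason string // e.g. "Evicted"; empty when no reason is set
}

// displayStatus returns the text to show in a STATUS column:
// the reason when one is set, otherwise the phase.
func displayStatus(s PodStatus) string {
	if s.Reason != "" {
		return s.Reason
	}
	return s.Phase
}

func main() {
	evicted := PodStatus{Phase: "Failed", Reason: "Evicted"}
	unscheduled := PodStatus{Phase: "Pending"}

	fmt.Println(displayStatus(evicted))     // prints "Evicted"
	fmt.Println(displayStatus(unscheduled)) // prints "Pending"
}
```

With this rule an evicted pod is never reported as Pending, because its phase is Failed and its reason is Evicted.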

brunohms on 26 Apr 2019

@brunohms Thank you so much for the detailed report! I stand corrected. You are right: I was thinking your pod was not getting scheduled, hence the Pending status. However, in your latest scenario it looks like the pod was indeed evicted from the node, which K9s currently does not track. I will fix this in the next push. Thank you for the correction!

derailed on 26 Apr 2019

@brunohms Fixed in 0.5.2!

derailed on 27 Apr 2019

👍2
