Amazon-vpc-cni-k8s: 1.6.1 couldn't get resource list for custom.metrics.k8s.io/v1beta1

Created on 4 May 2020 · 10 Comments · Source: aws/amazon-vpc-cni-k8s

Hi, I upgraded my CNI from 1.6.0 to 1.6.1 and got this error message:

Starting IPAM daemon in the background ... ok.
ERROR: logging before flag.Parse: E0504 05:28:30.770427       6 memcache.go:147] couldn't get resource list for custom.metrics.k8s.io/v1beta1: <nil>
Checking for IPAM connectivity ... ok.
Copying CNI plugin binaries and config files ... ok.
Foregrounding IPAM daemon ...
ERROR: logging before flag.Parse: E0504 05:29:30.842407       6 memcache.go:147] couldn't get resource list for custom.metrics.k8s.io/v1beta1: <nil>
ERROR: logging before flag.Parse: E0504 05:30:30.842925       6 memcache.go:147] couldn't get resource list for custom.metrics.k8s.io/v1beta1: <nil>
ERROR: logging before flag.Parse: E0504 05:31:30.844446       6 memcache.go:147] couldn't get resource list for custom.metrics.k8s.io/v1beta1: <nil>
ERROR: logging before flag.Parse: E0504 05:32:30.844295       6 memcache.go:147] couldn't get resource list for custom.metrics.k8s.io/v1beta1: <nil>
ERROR: logging before flag.Parse: E0504 05:33:30.846749       6 memcache.go:147] couldn't get resource list for custom.metrics.k8s.io/v1beta1: <nil>
ERROR: logging before flag.Parse: E0504 05:34:30.843623       6 memcache.go:147] couldn't get resource list for custom.metrics.k8s.io/v1beta1: <nil>
ERROR: logging before flag.Parse: E0504 05:35:30.844096       6 memcache.go:147] couldn't get resource list for custom.metrics.k8s.io/v1beta1: <nil>
ERROR: logging before flag.Parse: E0504 05:36:30.959437       6 memcache.go:147] couldn't get resource list for custom.metrics.k8s.io/v1beta1: <nil>
ERROR: logging before flag.Parse: E0504 05:37:30.844725       6 memcache.go:147] couldn't get resource list for custom.metrics.k8s.io/v1beta1: <nil>
ERROR: logging before flag.Parse: E0504 05:38:30.843170       6 memcache.go:147] couldn't get resource list for custom.metrics.k8s.io/v1beta1: <nil>

Even after upgrading my EKS cluster from 1.15 to 1.16, I still get the same error.

However, if I downgrade the CNI to 1.6.0, everything works perfectly with both EKS 1.15 and 1.16.

I'm not sure what I'm missing. I'm using the YAML file from the release notes:
https://raw.githubusercontent.com/aws/amazon-vpc-cni-k8s/release-1.6.1/config/v1.6/aws-k8s-cni.yaml

Describing the pod doesn't show any error events.

Thank you for your help.

terrych0u

👍1

All 10 comments

Hi @terrych0u!

We have seen issues with older versions of the Horizontal Pod Autoscaler and the custom metrics-server. The memcache.go file is pulled in by client-go and will log this spurious warning. This has been an issue before; see #486 for some background.
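
To gauge how often the message recurs and which API groups it names, a log excerpt can be summarized with a small script. This is only a minimal sketch and is not part of the CNI or client-go: the function name `summarize_discovery_errors` and the regular expression are assumptions based on the log lines quoted in this thread, and the year used for parsing is a placeholder because the log header omits it.

```python
import re
from collections import Counter
from datetime import datetime

# Matches the log header seen in this thread, e.g.
# "E0504 05:28:30.770427       6 memcache.go:147] couldn't get resource list for <group/version>: <nil>"
# (pattern is an assumption derived from the quoted lines, not an official format spec)
LINE_RE = re.compile(
    r"E(?P<month>\d{2})(?P<day>\d{2}) (?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+\d+ "
    r"memcache\.go:\d+\] couldn't get resource list for (?P<group>\S+): "
)


def summarize_discovery_errors(log_text):
    """Return (message count per API group, rounded seconds between consecutive messages)."""
    counts = Counter()
    stamps = []
    for line in log_text.splitlines():
        m = LINE_RE.search(line)
        if not m:
            continue  # skip lines that are not memcache.go discovery errors
        counts[m.group("group")] += 1
        # The header has no year; a fixed placeholder is enough for same-day deltas.
        stamps.append(datetime.strptime(
            f"2020-{m.group('month')}-{m.group('day')} {m.group('time')}",
            "%Y-%m-%d %H:%M:%S.%f",
        ))
    gaps = [round((b - a).total_seconds()) for a, b in zip(stamps, stamps[1:])]
    return counts, gaps


# Sample taken from the log lines in the original report.
sample = """\
ERROR: logging before flag.Parse: E0504 05:28:30.770427       6 memcache.go:147] couldn't get resource list for custom.metrics.k8s.io/v1beta1: <nil>
Checking for IPAM connectivity ... ok.
ERROR: logging before flag.Parse: E0504 05:29:30.842407       6 memcache.go:147] couldn't get resource list for custom.metrics.k8s.io/v1beta1: <nil>
ERROR: logging before flag.Parse: E0504 05:30:30.842925       6 memcache.go:147] couldn't get resource list for custom.metrics.k8s.io/v1beta1: <nil>
"""

counts, gaps = summarize_discovery_errors(sample)
print(dict(counts))  # one API group, seen three times
print(gaps)          # seconds between consecutive messages
```

On the sample above, the script reports a single group, custom.metrics.k8s.io/v1beta1, with gaps of 60 seconds between messages, which matches the once-a-minute cadence visible in the original report.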

mogren on 8 May 2020

Why is this issue closed? 1.6.2 still has this problem but 1.6.0 does not, so it looks like a regression. My DaemonSet is not crashing, but it logs the error:

aws-node ERROR: logging before flag.Parse: E0616 09:16:31.638253 8 memcache.go:147] couldn't get resource list for external.metrics.k8s.io/v1beta1: <nil>

$ kubectl describe -n kube-system ds aws-node
Name:           aws-node
Selector:       k8s-app=aws-node
Node-Selector:  <none>
Labels:         k8s-app=aws-node
Annotations:    deprecated.daemonset.template.generation: 5
                kubectl.kubernetes.io/last-applied-configuration:
                  {"apiVersion":"apps/v1","kind":"DaemonSet","metadata":{"annotations":{},"labels":{"k8s-app":"aws-node"},"name":"aws-node","namespace":"kub...
Desired Number of Nodes Scheduled: 6
Current Number of Nodes Scheduled: 6
Number of Nodes Scheduled with Up-to-date Pods: 6
Number of Nodes Scheduled with Available Pods: 6
Number of Nodes Misscheduled: 0
Pods Status:  6 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
  Labels:           k8s-app=aws-node
  Service Account:  aws-node
  Containers:
   aws-node:
    Image:      602401143452.dkr.ecr.us-west-2.amazonaws.com/amazon-k8s-cni:v1.6.2
    Port:       61678/TCP
    Host Port:  61678/TCP
    Requests:
      cpu:      10m
    Liveness:   exec [/app/grpc-health-probe -addr=:50051] delay=35s timeout=1s period=10s #success=1 #failure=3
    Readiness:  exec [/app/grpc-health-probe -addr=:50051] delay=35s timeout=1s period=10s #success=1 #failure=3
    Environment:
      AWS_VPC_K8S_CNI_LOGLEVEL:    DEBUG
      AWS_VPC_K8S_CNI_VETHPREFIX:  eni
      AWS_VPC_K8S_CNI_MTU:         9001
      MY_NODE_NAME:                 (v1:spec.nodeName)
    Mounts:
      /host/etc/cni/net.d from cni-net-dir (rw)
      /host/opt/cni/bin from cni-bin-dir (rw)
      /host/var/log from log-dir (rw)
      /var/run/docker.sock from dockersock (rw)
      /var/run/dockershim.sock from dockershim (rw)
  Volumes:
   cni-bin-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /opt/cni/bin
    HostPathType:  
   cni-net-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/cni/net.d
    HostPathType:  
   log-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /var/log
    HostPathType:  
   dockersock:
    Type:          HostPath (bare host directory volume)
    Path:          /var/run/docker.sock
    HostPathType:  
   dockershim:
    Type:               HostPath (bare host directory volume)
    Path:               /var/run/dockershim.sock
    HostPathType:       
  Priority Class Name:  system-node-critical
Events:                 <none>

LAKostis on 16 Jun 2020

👍2

I'm seeing this issue too.

bergbrains on 14 Jul 2020

@bergbrains With v1.6.3 or some older version? What version of Kubernetes, and what configuration is this?

mogren on 15 Jul 2020

I'm seeing it with v1.6.2 & v1.6.3

sarah97 on 20 Jul 2020

@bergbrains With v1.6.3 or some older version? What version of Kubernetes, and what configuration is this?

I'm deployed in AWS, using EKS 1.17.

bergbrains on 20 Jul 2020

@mogren I'm also seeing this with amazon-k8s-cni:v1.6.3, Kubernetes 1.17 on eks.1.

tdmalone on 21 Jul 2020

👍3

@mogren I'm also seeing this with amazon-k8s-cni:v1.6.3, Kubernetes 1.17 on eks.1.

I have the same setup. However, it seems the pods on my EKS managed node group do not have this error; only the pods on my self-managed nodes do. I thought it was worth mentioning since I have just started the upgrade process.

hscheib on 24 Aug 2020

If possible, can we use https://github.com/aws/amazon-vpc-cni-k8s/issues/486 for reporting this issue? I think it will be easier to coordinate there. @tdmalone and @hscheib, is ipamd starting up eventually, or is it not starting at all? (Please add your comments to #486 if possible.)