Amazon-vpc-cni-k8s: ipamd fails to assign IP addresses to containers

Created on 9 Sep 2020 · 5Comments · Source: aws/amazon-vpc-cni-k8s

What happened:
Not sure if any of you have experienced the issue with ipamd leaking IP addresses and eventually failing to get new ones allocated on K8s nodes. But we have noticed this as a recurring issue in one of our small clusters.

The issue was first identified through a failed pod of a DaemonSet blocking the DaemonSet to fully upgrade. Pod on the problematic node gets stuck in the stage of ContainerCreating (output slightly altered for security purposes):

# kubectl get pods -n infra -l release=xyz -o wide --sort-by=status.phase
NAME              READY   STATUS              RESTARTS   AGE     IP             NODE                                         NOMINATED NODE   READINESS GATES
xyz-agent-7f4c7   0/1     ContainerCreating   0          15h     <none>         ip-10-4-185-xxx.us-west-2.compute.internal   <none>           <none>
xyz-agent-4s4x9   1/1     Running             0          4d8h    10.4.220.xxx   ip-10-4-219-xxx.us-west-2.compute.internal   <none>           <none>
xyz-agent-5tk7n   1/1     Running             0          2d12h   10.4.193.xxx   ip-10-4-215-xxx.us-west-2.compute.internal   <none>           <none>
xyz-agent-5dxfr   1/1     Running             0          12h     10.4.138.xxx   ip-10-4-146-xxx.us-west-2.compute.internal   <none>           <none>
xyz-agent-8cljp   1/1     Running             0          2d12h   10.4.171.xxx   ip-10-4-167-xxx.us-west-2.compute.internal   <none>           <none>
xyz-agent-dvbhd   1/1     Running             0          3d6h    10.4.193.xxx   ip-10-4-208-xxx.us-west-2.compute.internal   <none>           <none>
xyz-agent-nn627   1/1     Running             0          3d6h    10.4.144.xxx   ip-10-4-151-xxx.us-west-2.compute.internal   <none>           <none>
xyz-agent-s9qz8   1/1     Running             0          2d12h   10.4.178.xxx   ip-10-4-178-xxx.us-west-2.compute.internal   <none>           <none>
xyz-agent-sxbxk   1/1     Running             0          3d7h    10.4.211.xxx   ip-10-4-213-xxx.us-west-2.compute.internal   <none>           <none>
xyz-agent-v4ck9   1/1     Running             0          4d13h   10.4.169.xxx   ip-10-4-186-xxx.us-west-2.compute.internal   <none>           <none>
xyz-agent-vc6km   1/1     Running             0          3d6h    10.4.143.xxx   ip-10-4-135-xxx.us-west-2.compute.internal   <none>           <none>
xyz-agent-vnbcv   1/1     Running             0          3d6h    10.4.190.xxx   ip-10-4-185-xxx.us-west-2.compute.internal   <none>           <none>
xyz-agent-z2rxf   1/1     Running             0          3d6h    10.4.152.xxx   ip-10-4-130-xxx.us-west-2.compute.internal   <none>           <none>
xyz-agent-zlnvb   1/1     Running             0          3d6h    10.4.218.xxx   ip-10-4-220-xxx.us-west-2.compute.internal   <none>           <none>

If we look at the events around the pod, the look something like:

# kubectl get events -n infra --field-selector=involvedObject.name=xyz-agent-7f4c7 -o json | kubectl describe -f -
Name:             xyz-agent-7f4c7.1631a103c1176ed8
Namespace:        infra
Labels:           <none>
Annotations:      <none>
API Version:      v1
Count:            31243
Event Time:       <nil>
First Timestamp:  2020-09-04T16:19:40Z
Involved Object:
  API Version:       v1
  Kind:              Pod
  Name:              xyz-agent-7f4c7
  Namespace:         infra
  Resource Version:  51945182
  UID:               39f8c74a-1942-4676-842b-0123456789ab
Kind:                Event
Last Timestamp:      2020-09-05T07:49:39Z
Message:             Pod sandbox changed, it will be killed and re-created.
Metadata:
  Creation Timestamp:  2020-09-05T03:09:39Z
  Resource Version:    2357175
  Self Link:           /api/v1/namespaces/infra/events/xyz-agent-7f4c7.1631a103c1176ed8
  UID:                 61982647-e14d-4364-8cca-0123456789ab
Reason:                SandboxChanged
Reporting Component:
Reporting Instance:
Source:
  Component:  kubelet
  Host:       ip-10-4-185-xxx.us-west-2.compute.internal
Type:         Normal
Events:       <none>


Name:             xyz-agent-7f4c7.1631a107e92b8447
Namespace:        infra
Labels:           <none>
Annotations:      <none>
API Version:      v1
Count:            32031
Event Time:       <nil>
First Timestamp:  2020-09-04T16:19:58Z
Involved Object:
  API Version:       v1
  Kind:              Pod
  Name:              xyz-agent-7f4c7
  Namespace:         infra
  Resource Version:  51945182
  UID:               39f8c74a-1942-4676-842b-0123456789ab
Kind:                Event
Last Timestamp:      2020-09-05T08:14:41Z
Message:             (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "899ddd98194db655b4fd62c4bb75ee6212587f18e7493889771ca2f691d5e7b2" network for pod "xyz-agent-7f4c7": networkPlugin cni failed to set up pod "xyz-agent-7f4c7_infra" network: add cmd: failed to assign an IP address to container
Metadata:
  Creation Timestamp:  2020-09-04T16:19:58Z
  Resource Version:    2357767
  Self Link:           /api/v1/namespaces/infra/events/xyz-agent-7f4c7.1631a107e92b8447
  UID:                 0dc1f7b8-eabb-47ca-b0b7-0123456789ab
Reason:                FailedCreatePodSandBox
Reporting Component:
Reporting Instance:
Source:
  Component:  kubelet
  Host:       ip-10-4-185-xxx.us-west-2.compute.internal
Type:         Warning
Events:       <none>

The ENI IP assignment issue results in failure in pod sandbox creation as can be found in the messages above.

On the same node there were actually 2 more pods stuck in the ContainerCreating state:

# kubectl get pods -A --field-selector=spec.nodeName=ip-10-4-185-xxx.us-west-2.compute.internal --no-headers --sort-by=status.phase | awk '{ print $4 }' | uniq -c
   3 ContainerCreating
  18 Running

The issue persists against new pods from the same DaemonSet after we terminated the old ones.

After hopping onto the node, we could find the following error messages in the ipam*.log file:

root@ip-10-4-185-xxx:/var/log/aws-routed-eni# zcat *.gz | jq -c 'select(.level == "error") | {level, caller, msg}' | uniq -c
  90117 {"level":"error","caller":"datastore/data_store.go:288","msg":"DataStore has no available IP addresses"}

Also, the node has been fully assigned with 8 ENIs, each allocated with 30 IP addresses. This could be verified by the following procedure:

root@ip-10-4-185-xxx:~# eni-to-yaml | yq -y 'map(.["local-ipv4s"] | length) | {totalENIs: length, totalIPs: add}'
totalENIs: 8
totalIPs: 240

(... where eni-to-yaml is a bash function from the following snippet):

# Create this function first:
function eni-to-yaml() {
  local eni_mac_url=http://169.254.169.254/latest/meta-data/network/interfaces/macs/
  curl -s "${eni_mac_url}" | eni_mac_url="${eni_mac_url}" xargs -n 1 -I'{{}}' sh -c \
    'printf -- "- interface-id: %s\n  local-ipv4s:\n" "$(curl -s "${eni_mac_url}"{{}}"interface-id")"; curl -s "${eni_mac_url}"{{}}"local-ipv4s" | sort | awk '\''{ print "  - " $0 }'\'
}

While on a good node, the number of IPs are just 21 (20 + 1 for primary IP):

root@ip-10-4-167-xxx:~# eni-to-yaml | yq -Y 'map(.["local-ipv4s"] | length) | {totalENIs: length, totalIPs: add}'
totalENIs: 1
totalIPs: 21

Environment:

Kubernetes version (use kubectl version): v1.17.9
CNI Version: AWS CNI v1.6.1
OS (e.g: cat /etc/os-release): Debian Stretch
Kernel (e.g. uname -a): 4.9.0-11-amd64 #1 SMP Debian 4.9.189-3+deb9u2 (2019-11-11) x86_64 GNU/Linux

Would appreciate if someone could look into this and provide insights on how this happened, as well as recommendations on how to alleviate/address the issue. Thanks a lot!

needs additional information needs investigation question

Source

jasonzlai

All 5 comments

@jasonzlai Hi, thanks for the detailed issue! If you have the logs for the node with the issue, we would love to take a look at it. Either just email it to me, or send through a support ticket.

In short, "DataStore has no available IP addresses" means that even though the ENIs get attached to the instance, and the secondary IPs gets added to them, they never get added to the "data store", which is the internal cache that ipamd is using to give IPs to pods. Why this happens should be seen in the ipamd.log. The prime suspect is that the setting up the newly created ENI is failing for some reason, this function.

A question to start with, was there ever a "Force detach" on this instance. I was wondering if the instance metadata was stale or if doing curl on the metadata endpoint fails?

We have refactored a lot of this in the latest v1.7.2-rc1 release candidate, and we are running through all our test cases right now.

mogren on 10 Sep 2020

👍1

Hi @jasonzlai , Can you please collect the CNI logs which @mogren is requesting. You can email to me [email protected] if it is fine or please attach it to the support ticket and share us the ticket.

CNI log collection tool [https://docs.aws.amazon.com/eks/latest/userguide/troubleshooting.html]

Thanks.

jayanthvn on 10 Sep 2020

👀1

@jasonzlai Are you still seeing this issue?

mogren on 23 Sep 2020

Hi @jasonzlai

Please reopen the issue if it still exists and please share the CNI logs which help us debug further.

Thank you!

jayanthvn on 14 Oct 2020

Hey @mogren and @jayanthvn! Sorry I forgot to follow up here. Actually we found out the issue was caused by Cilium installed in the same cluster. After Cilium got uninstalled, the issue no longer stands. Thanks a lot again for looking into this!

jasonzlai on 18 Oct 2020

👍1

Was this page helpful?

0 / 5 - 0 ratings