Amazon-vpc-cni-k8s: Elevated rate of `Client.PrivateIpAddressLimitExceeded` errors after 1.5.x to 1.6.1 upgrade

Created on 21 May 2020  Â·  14Comments  Â·  Source: aws/amazon-vpc-cni-k8s

After we upgraded to 1.6.1 version, we see increased rate of Client.PrivateIpAddressLimitExceeded errors for AssignPrivateIpAddresses call.

Error message:

"errorCode": "Client.PrivateIpAddressLimitExceeded",
    "errorMessage": "Number of private addresses will exceed limit.",
    "requestParameters": {
        "networkInterfaceId": "eni-XXXX",
        "privateIpAddressesSet": {},
        "secondaryPrivateIpAddressCount": 29
    },

We use m5.12xlarge which allows to have 30 IP Addresses per interface.

Screen Shot 2020-05-21 at 12 07 58 PM

Most helpful comment

We have also just experienced a similar error with CNI 1.7.1, there is also kube2iam in running in our clusters.

Not seeing the same problem as reported by @uthark - but we are seeing issues with iptables being updated.

It might be worth adding a retry on iptables updates, in the chance that some other resource is modifying them at the same time. Current implementation causes pods to crash & restart.

{"level":"info","ts":"2020-08-30T06:17:37.670Z","caller":"entrypoint.sh","msg":"Copying CNI plugin binaries ... "}
{"level":"info","ts":"2020-08-30T06:17:37.745Z","caller":"entrypoint.sh","msg":"Install CNI binary.."}
{"level":"info","ts":"2020-08-30T06:17:37.851Z","caller":"entrypoint.sh","msg":"Starting IPAM daemon in the background ... "}
{"level":"info","ts":"2020-08-30T06:17:37.852Z","caller":"entrypoint.sh","msg":"Checking for IPAM connectivity ... "}
{"level":"info","ts":"2020-08-30T06:17:38.162Z","caller":"aws-k8s-agent/main.go:28","msg":"Starting L-IPAMD v1.7.1  ..."}
{"level":"info","ts":"2020-08-30T06:17:38.220Z","caller":"aws-k8s-agent/main.go:41","msg":"Testing communication with server"}
{"level":"info","ts":"2020-08-30T06:17:38.220Z","caller":"aws-k8s-agent/main.go:41","msg":"Successful communication with the Cluster! Cluster Version is: v1.16. git version: v1.16.7. git tree state: clean. commit: be3d344ed06bff7a4fc60656200a93c74f31f9a4. platform: linux/amd64"}
{"level":"debug","ts":"2020-08-30T06:17:38.223Z","caller":"ipamd/ipamd.go:318","msg":"Discovered region: eu-central-1"}
{"level":"debug","ts":"2020-08-30T06:17:38.224Z","caller":"awsutils/awsutils.go:281","msg":"Found availability zone: eu-central-1b "}
{"level":"debug","ts":"2020-08-30T06:17:38.224Z","caller":"awsutils/awsutils.go:281","msg":"Discovered the instance primary ip address: 172.22.86.5"}
{"level":"debug","ts":"2020-08-30T06:17:38.225Z","caller":"awsutils/awsutils.go:281","msg":"Found instance-id: i-07c6605d8fd107fd0 "}
{"level":"debug","ts":"2020-08-30T06:17:38.225Z","caller":"awsutils/awsutils.go:281","msg":"Found instance-type: r5.4xlarge "}
{"level":"debug","ts":"2020-08-30T06:17:38.226Z","caller":"awsutils/awsutils.go:281","msg":"Found primary interface's MAC address: 06:49:f0:f3:ea:d6"}
{"level":"debug","ts":"2020-08-30T06:17:38.226Z","caller":"awsutils/awsutils.go:341","msg":"Discovered 1 interfaces."}
{"level":"debug","ts":"2020-08-30T06:17:38.227Z","caller":"awsutils/awsutils.go:341","msg":"Found device-number: 0 "}
{"level":"debug","ts":"2020-08-30T06:17:38.227Z","caller":"awsutils/awsutils.go:341","msg":"Found eni: eni-0f152de421b2bee93 "}
{"level":"debug","ts":"2020-08-30T06:17:38.227Z","caller":"awsutils/awsutils.go:341","msg":"eni-0f152de421b2bee93 is the primary ENI of this instance"}
{"level":"debug","ts":"2020-08-30T06:17:38.227Z","caller":"awsutils/awsutils.go:281","msg":"Found subnet-id: subnet-02dd602db9eb3d986 "}
{"level":"info","ts":"2020-08-30T06:17:38.228Z","caller":"awsutils/awsutils.go:356","msg":"Found sg-048f335b09a077c07, added to ipamd cache"}
{"level":"info","ts":"2020-08-30T06:17:38.228Z","caller":"awsutils/awsutils.go:362","msg":"Found 172.22.80.0/20, added to ipamd cache"}
{"level":"debug","ts":"2020-08-30T06:17:38.228Z","caller":"ipamd/ipamd.go:327","msg":"Using WARM_IP_TARGET 5"}
{"level":"debug","ts":"2020-08-30T06:17:38.228Z","caller":"ipamd/ipamd.go:337","msg":"Start node init"}
{"level":"info","ts":"2020-08-30T06:17:38.228Z","caller":"ipamd/ipamd.go:364","msg":"Setting up host network... "}
{"level":"debug","ts":"2020-08-30T06:17:38.228Z","caller":"networkutils/network.go:222","msg":"Trying to find primary interface that has mac : 06:49:f0:f3:ea:d6"}
{"level":"info","ts":"2020-08-30T06:17:38.228Z","caller":"wait/wait.go:133","msg":"Will attempt to clean up AWS CNI leaked ENIs after waiting 2m28s."}
{"level":"debug","ts":"2020-08-30T06:17:38.228Z","caller":"networkutils/network.go:222","msg":"Discovered interface: lo, mac: "}
{"level":"debug","ts":"2020-08-30T06:17:38.228Z","caller":"networkutils/network.go:222","msg":"Discovered interface: ens5, mac: 06:49:f0:f3:ea:d6"}
{"level":"info","ts":"2020-08-30T06:17:38.228Z","caller":"networkutils/network.go:222","msg":"Discovered primary interface: ens5"}
{"level":"debug","ts":"2020-08-30T06:17:38.228Z","caller":"ipamd/ipamd.go:364","msg":"Setting RPF for primary interface: net/ipv4/conf/ens5/rp_filter"}
{"level":"debug","ts":"2020-08-30T06:17:38.230Z","caller":"networkutils/network.go:249","msg":"Found the Link that uses mac address 06:49:f0:f3:ea:d6 and its index is 2 (attempt 1/5)"}
{"level":"debug","ts":"2020-08-30T06:17:38.238Z","caller":"networkutils/network.go:319","msg":"Setup Host Network: loading existing iptables nat SNAT exclusion rules"}
{"level":"debug","ts":"2020-08-30T06:17:38.257Z","caller":"ipamd/ipamd.go:364","msg":"Setup Host Network: iptables -N AWS-SNAT-CHAIN-0 -t nat"}
{"level":"error","ts":"2020-08-30T06:17:38.263Z","caller":"ipamd/ipamd.go:364","msg":"ipt.NewChain error for chain [AWS-SNAT-CHAIN-0]: running [/usr/sbin/iptables -t nat -N AWS-SNAT-CHAIN-0 --wait]: exit status 4: iptables: Resource temporarily unavailable.\n"}
{"level":"error","ts":"2020-08-30T06:17:38.263Z","caller":"aws-k8s-agent/main.go:28","msg":"Initialization failure: ipamd init: failed to set up host network: host network setup: failed to add chain: running [/usr/sbin/iptables -t nat -N AWS-SNAT-CHAIN-0 --wait]: exit status 4: iptables: Resource temporarily unavailable.\n"}

Update:

Just read up on the iptables manual, seems like --wait isn't behaving as one would expect.

-w, --wait [seconds]
Wait for the xtables lock. To prevent multiple instances of the program from running concurrently, an attempt will be made to obtain an exclusive lock at launch. By default, the program will exit if the lock cannot be obtained. This option will make the program wait (indefinitely or for optional seconds) until the exclusive lock can be obtained.

Feels like the request isn't being waited indefinitely.

All 14 comments

@uthark Interesting, thanks for reporting! Are you using WARM_IP_TARGET or is this when attaching a new ENI with 29 new secondary IPs? A log dump would be most helpful to figure out what the issue is.

we don't use WARM_IP_TARGET.
We use WARM_ENI_TARGET with value 2.
Looking for logs and check if those are new instances. will post update later.

This is on new instances, please see screenshot with CloudTrail events.
Screen Shot 2020-05-21 at 12 51 28 PM

ipamd.log:

cat ipamd.log
{"level":"error","ts":"2020-05-21T18:36:13.918Z","caller":"ipamd/ipamd.go:354","msg":"ipt.NewChain error for chain [AWS-SNAT-CHAIN-0]: running [/usr/sbin/iptables -t nat -N AWS-SNAT-CHAIN-0 --wait]: exit status 4: iptables: Resource temporarily unavailable.\n"}
{"level":"error","ts":"2020-05-21T18:36:13.918Z","caller":"aws-k8s-agent/main.go:30","msg":"Initialization failure: ipamd init: failed to set up host network: host network setup: failed to add chain: running [/usr/sbin/iptables -t nat -N AWS-SNAT-CHAIN-0 --wait]: exit status 4: iptables: Resource temporarily unavailable.\n"}
{"level":"warn","ts":"2020-05-21T18:36:52.541Z","caller":"ipamd/ipamd.go:387","msg":"Not all pods have an IP, trying again in 3 seconds."}
{"level":"warn","ts":"2020-05-21T18:36:55.541Z","caller":"ipamd/ipamd.go:387","msg":"Not all pods have an IP, trying again in 3 seconds."}
{"level":"warn","ts":"2020-05-21T18:36:58.542Z","caller":"ipamd/ipamd.go:387","msg":"Not all pods have an IP, trying again in 3 seconds."}
{"level":"warn","ts":"2020-05-21T18:37:01.542Z","caller":"ipamd/ipamd.go:387","msg":"Not all pods have an IP, trying again in 3 seconds."}
{"level":"warn","ts":"2020-05-21T18:37:04.542Z","caller":"ipamd/ipamd.go:387","msg":"Not all pods have an IP, trying again in 3 seconds."}
{"level":"error","ts":"2020-05-21T18:37:13.345Z","caller":"datastore/data_store.go:288","msg":"DataStore has no available IP addresses"}
{"level":"error","ts":"2020-05-21T18:37:13.503Z","caller":"datastore/data_store.go:288","msg":"DataStore has no available IP addresses"}
{"level":"warn","ts":"2020-05-21T18:37:13.678Z","caller":"ipamd/rpc_handler.go:88","msg":"UnassignPodIPv4Address: Failed to find pod REDACTED
{"level":"warn","ts":"2020-05-21T18:37:13.679Z","caller":"ipamd/rpc_handler.go:95","msg":"UnassignPodIPv4Address: Failed to find pod REDACTED
{"level":"warn","ts":"2020-05-21T18:37:14.633Z","caller":"ipamd/rpc_handler.go:88","msg":"UnassignPodIPv4Address: Failed to find pod REDACTED
{"level":"warn","ts":"2020-05-21T18:37:14.633Z","caller":"ipamd/rpc_handler.go:95","msg":"UnassignPodIPv4Address: Failed to find pod REDACTED
{"level":"warn","ts":"2020-05-21T18:37:15.633Z","caller":"ipamd/rpc_handler.go:88","msg":"UnassignPodIPv4Address: Failed to find pod REDACTED
{"level":"warn","ts":"2020-05-21T18:37:15.633Z","caller":"ipamd/rpc_handler.go:95","msg":"UnassignPodIPv4Address: Failed to find pod REDACTED
{"level":"error","ts":"2020-05-21T18:37:16.312Z","caller":"datastore/data_store.go:288","msg":"DataStore has no available IP addresses"}
{"level":"warn","ts":"2020-05-21T18:37:16.444Z","caller":"ipamd/rpc_handler.go:88","msg":"UnassignPodIPv4Address: Failed to find pod REDACTED
{"level":"warn","ts":"2020-05-21T18:37:16.444Z","caller":"ipamd/rpc_handler.go:95","msg":"UnassignPodIPv4Address: Failed to find pod REDACTED
{"level":"warn","ts":"2020-05-21T18:37:16.628Z","caller":"ipamd/rpc_handler.go:88","msg":"UnassignPodIPv4Address: Failed to find pod REDACTED
{"level":"warn","ts":"2020-05-21T18:37:16.628Z","caller":"ipamd/rpc_handler.go:95","msg":"UnassignPodIPv4Address: Failed to find pod REDACTED
{"level":"error","ts":"2020-05-21T18:37:17.404Z","caller":"datastore/data_store.go:288","msg":"DataStore has no available IP addresses"}
{"level":"warn","ts":"2020-05-21T18:37:17.540Z","caller":"ipamd/rpc_handler.go:88","msg":"UnassignPodIPv4Address: Failed to find pod REDACTED
{"level":"warn","ts":"2020-05-21T18:37:17.540Z","caller":"ipamd/rpc_handler.go:95","msg":"UnassignPodIPv4Address: Failed to find pod REDACTED
{"level":"warn","ts":"2020-05-21T18:37:17.651Z","caller":"ipamd/rpc_handler.go:88","msg":"UnassignPodIPv4Address: Failed to find pod REDACTED
{"level":"warn","ts":"2020-05-21T18:37:17.651Z","caller":"ipamd/rpc_handler.go:95","msg":"UnassignPodIPv4Address: Failed to find pod REDACTED
{"level":"error","ts":"2020-05-21T18:37:18.420Z","caller":"datastore/data_store.go:288","msg":"DataStore has no available IP addresses"}
{"level":"warn","ts":"2020-05-21T18:37:18.612Z","caller":"ipamd/rpc_handler.go:88","msg":"UnassignPodIPv4Address: Failed to find pod REDACTED
{"level":"warn","ts":"2020-05-21T18:37:18.612Z","caller":"ipamd/rpc_handler.go:95","msg":"UnassignPodIPv4Address: Failed to find pod REDACTED
{"level":"warn","ts":"2020-05-21T18:37:18.663Z","caller":"ipamd/rpc_handler.go:88","msg":"UnassignPodIPv4Address: Failed to find pod REDACTED
{"level":"warn","ts":"2020-05-21T18:37:18.663Z","caller":"ipamd/rpc_handler.go:95","msg":"UnassignPodIPv4Address: Failed to find pod REDACTED
{"level":"error","ts":"2020-05-21T18:37:19.523Z","caller":"datastore/data_store.go:288","msg":"DataStore has no available IP addresses"}
{"level":"warn","ts":"2020-05-21T18:37:19.656Z","caller":"ipamd/rpc_handler.go:88","msg":"UnassignPodIPv4Address: Failed to find pod REDACTED
{"level":"warn","ts":"2020-05-21T18:37:19.657Z","caller":"ipamd/rpc_handler.go:95","msg":"UnassignPodIPv4Address: Failed to find pod REDACTED
{"level":"warn","ts":"2020-05-21T18:37:20.688Z","caller":"ipamd/rpc_handler.go:88","msg":"UnassignPodIPv4Address: Failed to find pod REDACTED
{"level":"warn","ts":"2020-05-21T18:37:20.688Z","caller":"ipamd/rpc_handler.go:95","msg":"UnassignPodIPv4Address: Failed to find pod REDACTED
{"level":"error","ts":"2020-05-21T18:37:21.471Z","caller":"datastore/data_store.go:288","msg":"DataStore has no available IP addresses"}
{"level":"warn","ts":"2020-05-21T18:37:21.602Z","caller":"ipamd/rpc_handler.go:88","msg":"UnassignPodIPv4Address: Failed to find pod REDACTED
{"level":"warn","ts":"2020-05-21T18:37:21.602Z","caller":"ipamd/rpc_handler.go:95","msg":"UnassignPodIPv4Address: Failed to find pod REDACTED
{"level":"warn","ts":"2020-05-21T18:37:22.679Z","caller":"ipamd/rpc_handler.go:88","msg":"UnassignPodIPv4Address: Failed to find pod REDACTED
{"level":"warn","ts":"2020-05-21T18:37:22.679Z","caller":"ipamd/rpc_handler.go:95","msg":"UnassignPodIPv4Address: Failed to find pod REDACTED
{"level":"warn","ts":"2020-05-21T18:37:23.710Z","caller":"ipamd/rpc_handler.go:88","msg":"UnassignPodIPv4Address: Failed to find pod REDACTED
{"level":"warn","ts":"2020-05-21T18:37:23.710Z","caller":"ipamd/rpc_handler.go:95","msg":"UnassignPodIPv4Address: Failed to find pod REDACTED
{"level":"error","ts":"2020-05-21T18:37:35.433Z","caller":"ipamd/ipamd.go:762","msg":"Failed to allocate a private IP addresses on ENI eni-02da60df4d99a40db: PrivateIpAddressLimitExceeded: Number of private addresses will exceed limit.\n\tstatus code: 400, request id: 7874f1e2-3565-45e2-bbd2-2871ea0f67ae"}
{"level":"error","ts":"2020-05-21T18:37:48.567Z","caller":"ipamd/ipamd.go:762","msg":"Failed to allocate a private IP addresses on ENI eni-0c247214789d21112: PrivateIpAddressLimitExceeded: Number of private addresses will exceed limit.\n\tstatus code: 400, request id: e2e9c4e8-9d15-445f-ae77-9df1e71fea5b"}

Hmm... The first line is concerning, why is iptables not responding?

{"level":"error","ts":"2020-05-21T18:36:13.918Z","caller":"ipamd/ipamd.go:354","msg":"ipt.NewChain error for chain [AWS-SNAT-CHAIN-0]: running [/usr/sbin/iptables -t nat -N AWS-SNAT-CHAIN-0 --wait]: exit status 4: iptables: Resource temporarily unavailable.\n"}

@uthark I could see 4 ENIs attached to that instance, and they all have 29 secondary IPs per ENI. The CNI reads the instance metadata to fetch the attached IPs, so there must be some mismatch there.

Hi! 2 extra ENIs were added later by CNI plugin.
About iptables issue - I'll take a look, but can't quickly identify the cause.

# iptables --version
iptables v1.6.0
# uname -r
4.15.0-1065-aws

Also, we use kube2iam on the host with --iptables=true, might be related to iptables error.

CNI errors started to appear after rollout 1.6.x

We have also just experienced a similar error with CNI 1.7.1, there is also kube2iam in running in our clusters.

Not seeing the same problem as reported by @uthark - but we are seeing issues with iptables being updated.

It might be worth adding a retry on iptables updates, in the chance that some other resource is modifying them at the same time. Current implementation causes pods to crash & restart.

{"level":"info","ts":"2020-08-30T06:17:37.670Z","caller":"entrypoint.sh","msg":"Copying CNI plugin binaries ... "}
{"level":"info","ts":"2020-08-30T06:17:37.745Z","caller":"entrypoint.sh","msg":"Install CNI binary.."}
{"level":"info","ts":"2020-08-30T06:17:37.851Z","caller":"entrypoint.sh","msg":"Starting IPAM daemon in the background ... "}
{"level":"info","ts":"2020-08-30T06:17:37.852Z","caller":"entrypoint.sh","msg":"Checking for IPAM connectivity ... "}
{"level":"info","ts":"2020-08-30T06:17:38.162Z","caller":"aws-k8s-agent/main.go:28","msg":"Starting L-IPAMD v1.7.1  ..."}
{"level":"info","ts":"2020-08-30T06:17:38.220Z","caller":"aws-k8s-agent/main.go:41","msg":"Testing communication with server"}
{"level":"info","ts":"2020-08-30T06:17:38.220Z","caller":"aws-k8s-agent/main.go:41","msg":"Successful communication with the Cluster! Cluster Version is: v1.16. git version: v1.16.7. git tree state: clean. commit: be3d344ed06bff7a4fc60656200a93c74f31f9a4. platform: linux/amd64"}
{"level":"debug","ts":"2020-08-30T06:17:38.223Z","caller":"ipamd/ipamd.go:318","msg":"Discovered region: eu-central-1"}
{"level":"debug","ts":"2020-08-30T06:17:38.224Z","caller":"awsutils/awsutils.go:281","msg":"Found availability zone: eu-central-1b "}
{"level":"debug","ts":"2020-08-30T06:17:38.224Z","caller":"awsutils/awsutils.go:281","msg":"Discovered the instance primary ip address: 172.22.86.5"}
{"level":"debug","ts":"2020-08-30T06:17:38.225Z","caller":"awsutils/awsutils.go:281","msg":"Found instance-id: i-07c6605d8fd107fd0 "}
{"level":"debug","ts":"2020-08-30T06:17:38.225Z","caller":"awsutils/awsutils.go:281","msg":"Found instance-type: r5.4xlarge "}
{"level":"debug","ts":"2020-08-30T06:17:38.226Z","caller":"awsutils/awsutils.go:281","msg":"Found primary interface's MAC address: 06:49:f0:f3:ea:d6"}
{"level":"debug","ts":"2020-08-30T06:17:38.226Z","caller":"awsutils/awsutils.go:341","msg":"Discovered 1 interfaces."}
{"level":"debug","ts":"2020-08-30T06:17:38.227Z","caller":"awsutils/awsutils.go:341","msg":"Found device-number: 0 "}
{"level":"debug","ts":"2020-08-30T06:17:38.227Z","caller":"awsutils/awsutils.go:341","msg":"Found eni: eni-0f152de421b2bee93 "}
{"level":"debug","ts":"2020-08-30T06:17:38.227Z","caller":"awsutils/awsutils.go:341","msg":"eni-0f152de421b2bee93 is the primary ENI of this instance"}
{"level":"debug","ts":"2020-08-30T06:17:38.227Z","caller":"awsutils/awsutils.go:281","msg":"Found subnet-id: subnet-02dd602db9eb3d986 "}
{"level":"info","ts":"2020-08-30T06:17:38.228Z","caller":"awsutils/awsutils.go:356","msg":"Found sg-048f335b09a077c07, added to ipamd cache"}
{"level":"info","ts":"2020-08-30T06:17:38.228Z","caller":"awsutils/awsutils.go:362","msg":"Found 172.22.80.0/20, added to ipamd cache"}
{"level":"debug","ts":"2020-08-30T06:17:38.228Z","caller":"ipamd/ipamd.go:327","msg":"Using WARM_IP_TARGET 5"}
{"level":"debug","ts":"2020-08-30T06:17:38.228Z","caller":"ipamd/ipamd.go:337","msg":"Start node init"}
{"level":"info","ts":"2020-08-30T06:17:38.228Z","caller":"ipamd/ipamd.go:364","msg":"Setting up host network... "}
{"level":"debug","ts":"2020-08-30T06:17:38.228Z","caller":"networkutils/network.go:222","msg":"Trying to find primary interface that has mac : 06:49:f0:f3:ea:d6"}
{"level":"info","ts":"2020-08-30T06:17:38.228Z","caller":"wait/wait.go:133","msg":"Will attempt to clean up AWS CNI leaked ENIs after waiting 2m28s."}
{"level":"debug","ts":"2020-08-30T06:17:38.228Z","caller":"networkutils/network.go:222","msg":"Discovered interface: lo, mac: "}
{"level":"debug","ts":"2020-08-30T06:17:38.228Z","caller":"networkutils/network.go:222","msg":"Discovered interface: ens5, mac: 06:49:f0:f3:ea:d6"}
{"level":"info","ts":"2020-08-30T06:17:38.228Z","caller":"networkutils/network.go:222","msg":"Discovered primary interface: ens5"}
{"level":"debug","ts":"2020-08-30T06:17:38.228Z","caller":"ipamd/ipamd.go:364","msg":"Setting RPF for primary interface: net/ipv4/conf/ens5/rp_filter"}
{"level":"debug","ts":"2020-08-30T06:17:38.230Z","caller":"networkutils/network.go:249","msg":"Found the Link that uses mac address 06:49:f0:f3:ea:d6 and its index is 2 (attempt 1/5)"}
{"level":"debug","ts":"2020-08-30T06:17:38.238Z","caller":"networkutils/network.go:319","msg":"Setup Host Network: loading existing iptables nat SNAT exclusion rules"}
{"level":"debug","ts":"2020-08-30T06:17:38.257Z","caller":"ipamd/ipamd.go:364","msg":"Setup Host Network: iptables -N AWS-SNAT-CHAIN-0 -t nat"}
{"level":"error","ts":"2020-08-30T06:17:38.263Z","caller":"ipamd/ipamd.go:364","msg":"ipt.NewChain error for chain [AWS-SNAT-CHAIN-0]: running [/usr/sbin/iptables -t nat -N AWS-SNAT-CHAIN-0 --wait]: exit status 4: iptables: Resource temporarily unavailable.\n"}
{"level":"error","ts":"2020-08-30T06:17:38.263Z","caller":"aws-k8s-agent/main.go:28","msg":"Initialization failure: ipamd init: failed to set up host network: host network setup: failed to add chain: running [/usr/sbin/iptables -t nat -N AWS-SNAT-CHAIN-0 --wait]: exit status 4: iptables: Resource temporarily unavailable.\n"}

Update:

Just read up on the iptables manual, seems like --wait isn't behaving as one would expect.

-w, --wait [seconds]
Wait for the xtables lock. To prevent multiple instances of the program from running concurrently, an attempt will be made to obtain an exclusive lock at launch. By default, the program will exit if the lock cannot be obtained. This option will make the program wait (indefinitely or for optional seconds) until the exclusive lock can be obtained.

Feels like the request isn't being waited indefinitely.

I tried to run 1.7.0 — grpc server doesn't start, so readiness probe fails.

@uthark do you have any logs showing why it didn't start up? Did you apply the new config, or just update the image tag?

We have merged a number of fixes lately that will be available in v1.7.2 that should resolve the PrivateIpAddressLimitExceeded issue. #1174 specifically.

New config, with init container. It looks like it starts, but grpc healthcheck fails. Initial delay is configured to 90 seconds.

Ok, thanks @uthark. I wonder if it could get stuck on iptables or creating routes. I'd be happy to take a look at the ipamd log when this happens.

Also we have a release candidate out with the fixes mentioned above v1.7.2-rc1

@uthark While comparing v1.6.3 and v1.7.2-rc1 when scaling from 0 to 5000 pods to 0 again, we saw 206 Client.PrivateIpAddressLimitExceeded on v1.6.3, but 0 on v1.7.2-rc1. Closing this issue since #1174 has resolved it.

Sounds good to me. I'll glve 1.7.2-rc a try.

Was this page helpful?
0 / 5 - 0 ratings